CSCE 420 Lecture 11

From Notes
Jump to navigation Jump to search

« previous | Tuesday, February 19, 2013 | next »


Game Playing

  • 2-player 0-sum Games
  • Competitive environments
  • Alternating turns

Minimax Search

lookahead and anticipate opponent's moves

  • assuming worst case
  • rational opponent will maximize his utility:

Steps:

  1. Define minimax score
  2. calculate with doubly-recursive algorithm (Min-Value(), Max-Value()


αβ-Pruning

|-- 0
|   |-- 0 (look here first)
|   |   |-- 0
|   |   `-- -1
|   `-- >= 1
|       |-- 1
|       `-X ?
`-- <= -2
    |-- -2
    |   |-- -2
    |   `-- -3
    `-- >= 5
        |-- 5
        `-X -9

Algorithm:

αβ-pruning(init_state)
  return max-value(init_state, -Infinity, +Infinity)
 
(*
  α represents lower bound on value of max node (best choice)
  β represents upper bound on value of min node (worst choice)
*)
function max-value(state, α, β)
  v := -Infinity
  for a in actions
    w := min-value(result(state,a), α, β)
    v := max(v, w)
    if v >= β then return v   (* prune; skip rest of children *)
    α := max(α, v)
  end for
  return v   (* minimax value *)
end function
 
function min-value(state, α β)
  v := Infinity
  for a in actions
    w := max-value(result(state,a), α, β)
    v := min(v, w)
    if v <= α then return v   (* prune; skip rest of children *)
    β := min(β, v)
  end for
  return v   (* minimax value *)
end function

What about time restrictions on search space?

  • Depth-limited search with limit (requires a "heuristic" / board evaluation function / score that approximates probability of win/payoff)

Board Evaluation: Tic-Tac-Toe

  • Control of center square is a strategic position
  • Number of ways that player can win

Linear combination of weighted features of the board:

Risks:

  • accuracy
  • quiescent search - Wait, something big is about to happen. Hold on, just a few more levels...
  • horizon effect - forestalling; can't see something about to happen


Randomness

Stochastic games

interleave min and max nodes with a chance node

max
|-- chance
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   `-- ...
|-- chance
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   `-- ...
`-- ...


Expectiminimax Search

The fourth case is a general theme for decision-making in uncertain environments

What if there are too many outcomes to enumerate?

  • Monte Carlo Sampling/Simulation (have a drunk guy throw darts at it)

Observability

Partially-observable are generally the hardest to solve