CSCE 420 Lecture 11

From Notes
Jump to navigation Jump to search

« previous | Tuesday, February 19, 2013 | next »


Game Playing

  • 2-player 0-sum Games
  • Competitive environments
  • Alternating turns

Minimax Search

lookahead and anticipate opponent's moves

  • assuming worst case
  • rational opponent will maximize his utility: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle +u_2(s) = -u_1(s)}

Steps:

  1. Define minimax score
  2. calculate with doubly-recursive algorithm (Min-Value(), Max-Value()


αβ-Pruning

|-- 0
|   |-- 0 (look here first)
|   |   |-- 0
|   |   `-- -1
|   `-- >= 1
|       |-- 1
|       `-X ?
`-- <= -2
    |-- -2
    |   |-- -2
    |   `-- -3
    `-- >= 5
        |-- 5
        `-X -9

Algorithm:

αβ-pruning(init_state)
  return max-value(init_state, -Infinity, +Infinity)
 
(*
  α represents lower bound on value of max node (best choice)
  β represents upper bound on value of min node (worst choice)
*)
function max-value(state, α, β)
  v := -Infinity
  for a in actions
    w := min-value(result(state,a), α, β)
    v := max(v, w)
    if v >= β then return v   (* prune; skip rest of children *)
    α := max(α, v)
  end for
  return v   (* minimax value *)
end function
 
function min-value(state, α β)
  v := Infinity
  for a in actions
    w := max-value(result(state,a), α, β)
    v := min(v, w)
    if v <= α then return v   (* prune; skip rest of children *)
    β := min(β, v)
  end for
  return v   (* minimax value *)
end function

What about time restrictions on search space?

  • Depth-limited search with limit Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lambda} (requires a "heuristic" / board evaluation function / score that approximates probability of win/payoff)

Board Evaluation: Tic-Tac-Toe

  • Control of center square is a strategic position
  • Number of ways that player can win

Linear combination of weighted features of the board: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(s) = \sum w_i \, f_i(s)}

Risks:

  • accuracy
  • quiescent search - Wait, something big is about to happen. Hold on, just a few more levels...
  • horizon effect - forestalling; can't see something about to happen


Randomness

Stochastic games

interleave min and max nodes with a chance node

max
|-- chance
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   `-- ...
|-- chance
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   `-- ...
`-- ...


Expectiminimax Search

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle expectiminimax(s) = \begin{cases} u_1(s) & s\ \text{is terminal} \\ \max_a(expectiminimax(result(s,a)) & s\ \text{is max node} \\ \min_a(expectiminimax(result(s,a)) & s\ \text{is min node} \\ \sum_{o \in \text{outcomes}} expectiminimax(o) \cdot Pr[X=o] & s\ \text{is chance node} \end{cases}}

The fourth case is a general theme for decision-making in uncertain environments

What if there are too many outcomes to enumerate?

  • Monte Carlo Sampling/Simulation (have a drunk guy throw darts at it)

Observability

Partially-observable are generally the hardest to solve