CSCE 420 Lecture 11

« previous | Tuesday, February 19, 2013 | next »

Game Playing

2-player 0-sum Games
Competitive environments
Alternating turns

Minimax Search

lookahead and anticipate opponent's moves

assuming worst case
rational opponent will maximize his utility: $+u_{2}(s)=-u_{1}(s)$

Steps:

Define minimax score
calculate with doubly-recursive algorithm (Min-Value(), Max-Value()

αβ-Pruning

|-- 0
|   |-- 0 (look here first)
|   |   |-- 0
|   |   `-- -1
|   `-- >= 1
|       |-- 1
|       `-X ?
`-- <= -2
    |-- -2
    |   |-- -2
    |   `-- -3
    `-- >= 5
        |-- 5
        `-X -9

Algorithm:

αβ-pruning(init_state)
  return max-value(init_state, -Infinity, +Infinity)
 
(*
  α represents lower bound on value of max node (best choice)
  β represents upper bound on value of min node (worst choice)
*)
function max-value(state, α, β)
  v := -Infinity
  for a in actions
    w := min-value(result(state,a), α, β)
    v := max(v, w)
    if v >= β then return v   (* prune; skip rest of children *)
    α := max(α, v)
  end for
  return v   (* minimax value *)
end function
 
function min-value(state, α β)
  v := Infinity
  for a in actions
    w := max-value(result(state,a), α, β)
    v := min(v, w)
    if v <= α then return v   (* prune; skip rest of children *)
    β := min(β, v)
  end for
  return v   (* minimax value *)
end function

What about time restrictions on search space?

Depth-limited search with limit $\lambda$ (requires a "heuristic" / board evaluation function / score that approximates probability of win/payoff)

Board Evaluation: Tic-Tac-Toe

Control of center square is a strategic position
Number of ways that player can win

Linear combination of weighted features of the board: $f(s)=\sum w_{i}\,f_{i}(s)$

Risks:

accuracy
quiescent search - Wait, something big is about to happen. Hold on, just a few more levels...
horizon effect - forestalling; can't see something about to happen

Randomness

Stochastic games

interleave min and max nodes with a chance node

max
|-- chance
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   `-- ...
|-- chance
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   |-- min
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   |-- chance
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   |-- max
|   |   |   |   |-- chance
|   |   |   |   |-- chance
|   |   |   |   `-- ...
|   |   |   `-- ...
|   |   `-- ...
|   `-- ...
`-- ...

Expectiminimax Search

expectiminimax(s)={\begin{cases}u_{1}(s)&s\ {\text{is terminal}}\\\max _{a}(expectiminimax(result(s,a))&s\ {\text{is max node}}\\\min _{a}(expectiminimax(result(s,a))&s\ {\text{is min node}}\\\sum _{o\in {\text{outcomes}}}expectiminimax(o)\cdot Pr[X=o]&s\ {\text{is chance node}}\end{cases}}

The fourth case is a general theme for decision-making in uncertain environments

What if there are too many outcomes to enumerate?

Monte Carlo Sampling/Simulation (have a drunk guy throw darts at it)

Observability

Partially-observable are generally the hardest to solve