CSCE 420 Lecture 11
Jump to navigation
Jump to search
« previous | Tuesday, February 19, 2013 | next »
Game Playing
- 2-player 0-sum Games
- Competitive environments
- Alternating turns
Minimax Search
lookahead and anticipate opponent's moves
- assuming worst case
- rational opponent will maximize his utility:
Steps:
- Define minimax score
- calculate with doubly-recursive algorithm (Min-Value(), Max-Value()
αβ-Pruning
|-- 0
| |-- 0 (look here first)
| | |-- 0
| | `-- -1
| `-- >= 1
| |-- 1
| `-X ?
`-- <= -2
|-- -2
| |-- -2
| `-- -3
`-- >= 5
|-- 5
`-X -9
Algorithm:
αβ-pruning(init_state)
return max-value(init_state, -Infinity, +Infinity)
(*
α represents lower bound on value of max node (best choice)
β represents upper bound on value of min node (worst choice)
*)
function max-value(state, α, β)
v := -Infinity
for a in actions
w := min-value(result(state,a), α, β)
v := max(v, w)
if v >= β then return v (* prune; skip rest of children *)
α := max(α, v)
end for
return v (* minimax value *)
end function
function min-value(state, α β)
v := Infinity
for a in actions
w := max-value(result(state,a), α, β)
v := min(v, w)
if v <= α then return v (* prune; skip rest of children *)
β := min(β, v)
end for
return v (* minimax value *)
end function
What about time restrictions on search space?
- Depth-limited search with limit (requires a "heuristic" / board evaluation function / score that approximates probability of win/payoff)
Board Evaluation: Tic-Tac-Toe
- Control of center square is a strategic position
- Number of ways that player can win
Linear combination of weighted features of the board:
Risks:
- accuracy
- quiescent search - Wait, something big is about to happen. Hold on, just a few more levels...
- horizon effect - forestalling; can't see something about to happen
Randomness
Stochastic games
interleave min and max nodes with a chance node
max |-- chance | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | `-- ... |-- chance | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | `-- ... `-- ...
Expectiminimax Search
The fourth case is a general theme for decision-making in uncertain environments
What if there are too many outcomes to enumerate?
- Monte Carlo Sampling/Simulation (have a drunk guy throw darts at it)
Observability
Partially-observable are generally the hardest to solve