CSCE 420 Lecture 11
Jump to navigation
Jump to search
« previous | Tuesday, February 19, 2013 | next »
Game Playing
- 2-player 0-sum Games
- Competitive environments
- Alternating turns
Minimax Search
lookahead and anticipate opponent's moves
- assuming worst case
- rational opponent will maximize his utility:
Steps:
- Define minimax score
- calculate with doubly-recursive algorithm (Min-Value(), Max-Value()
αβ-Pruning
|-- 0 | |-- 0 (look here first) | | |-- 0 | | `-- -1 | `-- >= 1 | |-- 1 | `-X ? `-- <= -2 |-- -2 | |-- -2 | `-- -3 `-- >= 5 |-- 5 `-X -9
Algorithm:
αβ-pruning(init_state) return max-value(init_state, -Infinity, +Infinity) (* α represents lower bound on value of max node (best choice) β represents upper bound on value of min node (worst choice) *) function max-value(state, α, β) v := -Infinity for a in actions w := min-value(result(state,a), α, β) v := max(v, w) if v >= β then return v (* prune; skip rest of children *) α := max(α, v) end for return v (* minimax value *) end function function min-value(state, α β) v := Infinity for a in actions w := max-value(result(state,a), α, β) v := min(v, w) if v <= α then return v (* prune; skip rest of children *) β := min(β, v) end for return v (* minimax value *) end function
What about time restrictions on search space?
- Depth-limited search with limit (requires a "heuristic" / board evaluation function / score that approximates probability of win/payoff)
Board Evaluation: Tic-Tac-Toe
- Control of center square is a strategic position
- Number of ways that player can win
Linear combination of weighted features of the board:
Risks:
- accuracy
- quiescent search - Wait, something big is about to happen. Hold on, just a few more levels...
- horizon effect - forestalling; can't see something about to happen
Randomness
Stochastic games
interleave min and max nodes with a chance node
max |-- chance | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | `-- ... |-- chance | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | |-- min | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | |-- chance | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | |-- max | | | | |-- chance | | | | |-- chance | | | | `-- ... | | | `-- ... | | `-- ... | `-- ... `-- ...
Expectiminimax Search
The fourth case is a general theme for decision-making in uncertain environments
What if there are too many outcomes to enumerate?
- Monte Carlo Sampling/Simulation (have a drunk guy throw darts at it)
Observability
Partially-observable are generally the hardest to solve