CSCE 420 Lecture 24

Tuesday, April 16, 2013

Agent Architectures

Decision-making Implementation ( $\pi (s)=a$ )

Similar to other fields'

Table-lookup: determine actions in a table based on sensor inputs (doesn't scale well)
Rule-based Agent: implement $\pi$ with a set of "condition → action" rules (requires conjunctive antecedents)
- Rules should be ordered by priority
- Stop after first match
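
The priority ordering and first-match behaviour above can be sketched as follows. This is a minimal illustration, not a controller from the lecture: the percept keys (`obstacle`, `leg_down`) and the action names are hypothetical, chosen only to show how the rule order resolves conflicts.

```python
from typing import Callable, Dict, List, Optional, Tuple

Percept = Dict[str, bool]
Rule = Tuple[Callable[[Percept], bool], str]  # (condition, action)

# Hypothetical rules for one leg, listed from highest to lowest priority.
# Each condition is a conjunction of tests on the sensor inputs.
RULES: List[Rule] = [
    (lambda p: p["obstacle"] and p["leg_down"], "lift_leg"),
    (lambda p: p["obstacle"], "back_up"),
    (lambda p: not p["leg_down"], "lower_leg"),
    (lambda p: True, "swing_forward"),  # default rule, always matches
]


def select_action(percept: Percept, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the action of the first rule whose condition holds."""
    for condition, action in rules:
        if condition(percept):
            return action  # stop after the first match
    return None  # no rule fired


if __name__ == "__main__":
    print(select_action({"obstacle": True, "leg_down": True}))
    print(select_action({"obstacle": False, "leg_down": True}))
```

Because evaluation stops at the first match, a more specific rule must appear before a more general one; here the first rule shadows the second whenever the leg is down.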

Figure 1. Genghis Robot

Genghis Robot (Rodney Brooks, MIT) has simple, reactive controllers for its legs:

Maintain an internal representation of the world state

Describe in KB what goals need to be accomplished

$KB\wedge {\text{state}}\wedge {\text{goals}}\models {\text{do}}(a_{i})$
Try to infer this for each $a_{i}\in {\text{Actions}}$
Use Situation Calculus to encode the preconditions and effects of actions in $KB$

OR use planners to derive sequence of actions (plan) by state-space search (e.g. A*)

In either case, output of actuators follows a plan
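
The planner option can be sketched with A* over a small state space. This is an illustrative sketch under stated assumptions: the 3×3 grid, the wall cells, the action names, and the Manhattan-distance heuristic are all hypothetical and were not part of the lecture. The function returns the plan as a sequence of action names.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

State = Hashable
Successors = Callable[[State], Iterable[Tuple[str, State, float]]]


def a_star_plan(start: State, goal: State, successors: Successors,
                heuristic: Callable[[State], float]) -> Optional[List[str]]:
    """Return a cheapest action sequence from start to goal, or None."""
    counter = 0  # tie-breaker so heap entries never compare states or plans
    frontier = [(heuristic(start), counter, 0.0, start, [])]
    best_g = {start: 0.0}  # cheapest known cost to reach each state
    while frontier:
        _, _, g, state, plan = heapq.heappop(frontier)
        if state == goal:
            return plan
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper path was found later
        for action, nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                counter += 1
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(nxt), counter, new_g, nxt, plan + [action]),
                )
    return None


# Hypothetical 3x3 grid world with two wall cells.
WALLS = {(1, 0), (1, 1)}
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
GOAL = (2, 0)


def grid_successors(state):
    x, y = state
    for action, (dx, dy) in MOVES.items():
        nxt = (x + dx, y + dy)
        if 0 <= nxt[0] < 3 and 0 <= nxt[1] < 3 and nxt not in WALLS:
            yield action, nxt, 1.0  # every move costs 1


def manhattan(state):
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


if __name__ == "__main__":
    print(a_star_plan((0, 0), GOAL, grid_successors, manhattan))
```

In this grid the only route around the wall passes through the top row, so the plan is forced; the agent would then execute the returned actions one at a time.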

Utility Function $u(s)\to \mathbb {R}$
Transition function $T(s,a)\to S$
- defines outcomes of actions
- could be probabilistic (distribution over successor states)
Reward/Cost function $R(s,a)\to \mathbb {R}$
- payoff of action
Goal: maximize reward over time: $\sum _{t=0}^{\infty }\gamma ^{t}\,R_{a_{t}}(s_{t},s_{t+1})$ , where $\gamma \in \left[0,1\right]$ is a discount factor (a geometric weight on future rewards). For $\gamma <1$ and bounded rewards the series converges.
- a smaller $\gamma$ favors immediate payoff
- a larger $\gamma$ gives more weight to long-term payoff
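
The effect of $\gamma$ on the discounted sum can be checked on a finite reward sequence. The sketch below truncates the infinite sum to the rewards given; the two reward sequences are made-up values chosen only to show how the preference flips as $\gamma$ grows.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical reward sequences: a small payoff now vs. a larger one later.
EARLY = [1, 0, 0, 0]
LATE = [0, 0, 0, 2]

if __name__ == "__main__":
    for gamma in (0.5, 0.9):
        print(gamma, discounted_return(EARLY, gamma), discounted_return(LATE, gamma))
```

With $\gamma = 0.5$ the early payoff is worth more (1 versus $2\cdot 0.5^{3}=0.25$); with $\gamma = 0.9$ the later payoff wins ($2\cdot 0.9^{3}=1.458$). For a constant reward of 1, the sum approaches $1/(1-\gamma)$, which is why $\gamma<1$ keeps the series finite.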

Plans are encoded in policies (mapping states to actions)

Computing optimal policy $\pi ^{*}$ that maximizes long-term discounted reward

Evaluation via Bellman equations: $V^{*}(s)=\max _{a\in A(s)}\sum _{s'}{Pr}_{s\,s'}^{a}\,\left[R_{a}(s,s')+\gamma \,V^{*}(s')\right]$ , where ${Pr}_{s\,s'}^{a}$ is the probability of reaching $s'$ from $s$ under action $a$
The value function gives the utility of each state (which depends on its neighboring states)
Therefore the values are all coupled
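
Because the values are coupled, one common way to solve the Bellman equations is value iteration: repeatedly apply the right-hand side as an update until the values stop changing. The lecture notes do not name a solution method, so this is a sketch of one standard option. The three-state MDP, its rewards, and the dictionary layout `P[s][a] = [(probability, next_state, reward), ...]` are hypothetical.

```python
from typing import Dict, List, Tuple

Outcome = Tuple[float, str, float]          # (probability, next state, reward)
MDP = Dict[str, Dict[str, List[Outcome]]]   # P[s][a] -> list of outcomes


def q_value(outcomes: List[Outcome], V: Dict[str, float], gamma: float) -> float:
    """Expected reward plus discounted value of the successor states."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)


def value_iteration(P: MDP, gamma: float = 0.9, tol: float = 1e-8) -> Dict[str, float]:
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s, actions in P.items():
            if not actions:  # terminal state: value stays 0
                continue
            best = max(q_value(out, V, gamma) for out in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # Bellman backup
        if delta < tol:
            return V


def greedy_policy(P: MDP, V: Dict[str, float], gamma: float = 0.9) -> Dict[str, str]:
    """Pick, in each non-terminal state, the action with the best backup."""
    return {
        s: max(actions, key=lambda a: q_value(actions[a], V, gamma))
        for s, actions in P.items() if actions
    }


# Hypothetical MDP: from A, "go" usually reaches B (which pays 10 on finishing),
# while "quit" pays 5 immediately.
P_EXAMPLE: MDP = {
    "A": {"go": [(0.8, "B", 0.0), (0.2, "A", 0.0)], "quit": [(1.0, "end", 5.0)]},
    "B": {"finish": [(1.0, "end", 10.0)]},
    "end": {},
}

if __name__ == "__main__":
    V = value_iteration(P_EXAMPLE, gamma=0.9)
    print(V, greedy_policy(P_EXAMPLE, V, gamma=0.9))
```

In this example $V^{*}(A)$ depends on both $V^{*}(B)$ and on $V^{*}(A)$ itself (through the 0.2 chance of staying put), which is the coupling noted above. Lowering $\gamma$ to 0.4 makes the immediate "quit" reward preferable in state A.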