CSCE 420 Lecture 26

From Notes
Jump to navigation Jump to search

« previous | Tuesday, April 23, 2013 | next »


Agents: Decision-Making

  • Reflexive Agent: simply stimulus-response (trigger rules)
  • Planning: sequence of actions to achieve goals
    1. Described actions using situation calculus (effects, possibility axioms[1], frame axioms[2])
    2. Planning Domain Definition Language (PDDL; formerly "STRIPS operators")
      • State-Space Search Algorithm (forward)
      • Goal-Regression Algorithm


PDDL

Simpler than situation calculus

Situation Calculus Algorithm for "pickup"

∀x,y,z  on(x,y,s) ∧ gripper_empty(s) ∧ clear(x,s) →
    holding(x, result(pickup(x,y),s)) ∧
    clear(y, result(pickup(x,y),s)) ∧
    ¬gripper_empty(result(pickup(x,y), s))

Frame Axioms:
∀a,b,x,y,z  poss(pickup(x,y),s) ∧ x ≠ z ∧ clear(z,s) →
    clear(z, result(pickup(x,y),s))
∀a,b,x,y,z  poss(pickup(x,y),s) ∧ a ≠ x ∧ on(a,b,s) →
    on(a,b, result(pickup(x,y),s))


PDDL statements of actions defined above

pickup(x,y):
    pre-conds: clear(x), on(x,y), gripper_empty
    effects: holding(x), clear(y), ¬gripper_empty

puton(x,y):
    pre-conds: holding(x), clear(y)
    effects: on(x,y), clear(x), gripper_empty, ¬clear(y)

Any positive effect is called the "add-list", and negative effects are called the "delete-list"

Database Update

Initial state:

[B][A]

[C]

Initial conditions:

on(a,b)
clear(a)
clear(c)
on(b,table)
on(c,table)
gripper_empty

Goal:

on(a,b)
on(b,c)
on(c,table)
clear(a)
gripper_empty

After performing pickup(a,b), the following events happen:

  1. Make sure all pre-conditions are met
  2. Add all add-list items
  3. Remove all delete-list items
- on(a,b)
  clear(a)
  clear(c)
  on(b,table)
  on(c,table)
- gripper_empty
+ holding(a)
+ clear(b)

In general,

where

  • is state represented as a list of literals ("database")
  • is the action to perform

State-Space Search

Also called (forward "State progression")

From each initial state, we can match all actions for which the preconditions hold. From example above,

  1. pickup(a,b)
  2. pickup(c,table)

Goal Regression

Work backwards from goal until ground out in initial state:

Modify list of goals to represent weakest pre-conditions necessary for action to succeed:

From example above, on(a,b) is already true, but we need to find what would make on(b,c) true: puton(b,c) However, puton(b,c) conflicts with on(a,b) since on(a,b) implies ¬clear(b)

puton(a,b) is possible assuming on(b,c) was true before performing the action: frame axiom. Therefore, prior state must have following literals (diff from goal):

- on(a,b)
  on(b,c)
  on(c,table)
- clear(a)
- gripper_empty
+ holding(a)
+ clear(b)

In turn, a would have to have been picked up from something (this could be anything, but eventually we could find that this should be table), so regress via pickup(a,table)

  on(b,c)
  on(c,table)
- holding(a)
  clear(b)
+ clear(a)
+ on(a,table)
+ gripper_empty

Regress via puton(b,c)

- on(b,c)
  on(c,table)
- clear(b)
  clear(a)
  on(a,table)
- gripper_empty
+ holding(b)
+ clear(c)

Regress via pickup(b,table)

  on(c,table)
  clear(a)
  on(a,table)
- holding(b)
  clear(c)
+ clear(b)
+ on(b,table)
+ gripper_empty

Regress via puton(a,table)

  on(c,table)
- clear(a)
- on(a,table)
  clear(c)
  clear(b)
  on(b,table)
- gripper_empty
+ holding(a)

Regress via pickup(a,b)

  on(c,table)
  clear(c)
- clear(b)
  on(b,table)
- holding(a)
+ clear(a)
+ on(a,b)
+ gripper_empty

Arrived at initial state:

  on(c,table)
  clear(c)
  on(b,table)
  clear(a)
  on(a,b)
  gripper_empty

Means-Ends Analysis Principle (MEA)

First put forth by Simon and Newell

  1. Compare goal with current state
  2. Identify largest difference
  3. Try to solve it first

Algorithm

function Goal-Regression(g)
    if Sinit ⊧ g then return {}        // empty plan
    foreach action a ∈ Actions
        if effects of a are consistent with g   // i.e. effects(a) ∪ g is satisfiable
            // construct weakest preimage; regress g through a
            g' = pre-conds(a) ∪ (g - effects(a))
            result ← GoalRegression(g')
            if result ≠ fail then return result
        end if
    end foreach
end function


Footnotes

  1. possibility axioms encode everything that changes
  2. frame axioms encode everything that doesn't change