CSCE 420 Lecture 7


« previous | Tuesday, February 5, 2013 | next »


Hill-Climbing

Score
the quality of a state (objective function); cf. path cost in best-first search
# oper and score are passed as procs, since Ruby allows only one &block
# parameter per method
def hill_climbing(init, oper, score)
  current = init
  loop do
    children = oper.call(current)
    # stop at a local maximum: no child improves on the current state
    return current if children.all? { |c| score.call(c) <= score.call(current) }
    current = children.max_by { |c| score.call(c) }
  end
end
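A hypothetical usage sketch (the neighbor operator and score function below are made up for illustration; the climber is repeated in runnable form, taking procs as ordinary arguments since Ruby allows only one &block parameter per method):

```ruby
# Climb toward the peak of score(x) = -(x - 3)**2 from x = 0,
# taking integer steps of +/-1.
def hill_climbing(init, oper, score)
  current = init
  loop do
    children = oper.call(current)
    return current if children.all? { |c| score.call(c) <= score.call(current) }
    current = children.max_by { |c| score.call(c) }
  end
end

oper  = ->(x) { [x - 1, x + 1] }   # neighbors: one small step either way
score = ->(x) { -(x - 3)**2 }      # single global maximum at x = 3
hill_climbing(0, oper, score)      # => 3
```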

Limitations of hill-climbing:

  • Getting stuck at a local maximum
  • Limited to taking small steps
  • Plateau effect (a bunch of states have the same score; lost gradient)
  • Ridge effect (operators don't give the best choice; there's a better state nearby, but neighbors are worse)

Solutions

  • Use "macro operators": combine several basic operators into one step to reach more states per move
  • Random restart: reinitialize at a random state
  • Beam-Search: keep track of the k best states seen so far (limited memory)
  • Random-walk on plateaus (replace the <= in the termination test with < so that equal-scoring children are still explored)
  • Simulated annealing
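As a sketch of random restart (the two-peak score function, restart count, and random-state generator here are made up for illustration; a minimal climber is inlined so the snippet runs on its own):

```ruby
# Random restart: rerun hill-climbing from random initial states and
# keep the best local maximum found across all runs.
def hill_climb(init, oper, score)
  current = init
  loop do
    children = oper.call(current)
    return current if children.all? { |c| score.call(c) <= score.call(current) }
    current = children.max_by { |c| score.call(c) }
  end
end

def random_restart(n, rand_state, oper, score)
  (1..n).map { hill_climb(rand_state.call, oper, score) }
        .max_by { |s| score.call(s) }
end

oper = ->(x) { [x - 1, x + 1] }
# two peaks: a local maximum at x = -5 (score 1), the global one at x = 5 (score 3)
score = ->(x) { [-(x + 5)**2 + 1, -(x - 5)**2 + 3].max }
rand_state = -> { rand(-10..10) }
best = random_restart(20, rand_state, oper, score)
```

A single climb started left of the valley gets stuck at x = -5; with 20 random starts it is very likely at least one climbs to the global maximum at x = 5.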


Compare to Best-First Search

Best-first sorts a priority queue on the evaluation function f(n).

Priority queue allows us to backtrack, but the hill-climbing algorithm is designed to be memory-less.

There is a memory cost associated with remembering previous states.

Spectrum of Algorithms

  1. Hill-Climbing (no memory)
  2. Beam-Search (limited memory; keeps the k best states, ranked by score)
  3. Best-First (exponential memory)
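A minimal beam-search sketch in the same style, keeping the k best states at each step (the interface mirrors the hill-climbing routine; the details are illustrative, not the course's exact formulation):

```ruby
# Beam search: keep only the k best states at each step, instead of one
# (hill-climbing) or all of them (best-first). Stops when no successor
# beats the best state in the current beam.
def beam_search(init, k, oper, score)
  beam = [init]
  loop do
    children = beam.flat_map { |s| oper.call(s) }
    best = beam.max_by { |s| score.call(s) }
    return best if children.all? { |c| score.call(c) <= score.call(best) }
    beam = children.max_by(k) { |c| score.call(c) }
  end
end

oper  = ->(x) { [x - 1, x + 1] }
score = ->(x) { -(x - 4)**2 }
beam_search(0, 2, oper, score)  # => 4
```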

Simulated Annealing

Algorithm 1

def simulated_annealing_1(init, oper, score)
  current = init
  loop do
    nxt = oper.call(current).sample   # pick a random neighbor
    delta = score.call(nxt) - score.call(current)
    if delta > 0
      current = nxt                   # always accept uphill moves
    elsif rand < Math.exp(delta)      # accept downhill moves with exponentially
      current = nxt                   # decreasing probability in the step size
    end
  end
end

Algorithm 2

Pass in a "temperature schedule" T(i) that approaches 0 as i approaches N.

def simulated_annealing_2(init, n_steps, oper, score, temp_sched)
  current = init
  (1..n_steps).each do |i|
    nxt = oper.call(current).sample
    delta = score.call(nxt) - score.call(current)
    if delta > 0
      current = nxt                                   # always accept uphill moves
    elsif rand < Math.exp(delta / temp_sched.call(i)) # downhill moves get less
      current = nxt                                   # likely as temperature falls
    end
  end
  current
end
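One common choice of schedule is geometric cooling (the constants here are illustrative, not from the lecture); it drives the acceptance probability for downhill moves toward 0 over time:

```ruby
# Geometric cooling: T(i) = T0 * alpha**i approaches 0 as i grows,
# so downhill moves become ever less likely to be accepted.
T0, ALPHA = 10.0, 0.95
temp_sched = ->(i) { T0 * ALPHA**i }

# Acceptance probability for a downhill step with delta = -1 at step i:
#   exp(delta / T(i)) shrinks toward 0 as the temperature falls.
p_early = Math.exp(-1.0 / temp_sched.call(1))    # close to 1 while hot
p_late  = Math.exp(-1.0 / temp_sched.call(100))  # near 0 once cooled
```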

This comes from a branch of physics called statistical mechanics.[1]


Footnotes

  1. Statistical mechanics is the study of the emergent behavior of collectives/ensembles.