CSCE 434 Lecture 19

From Notes
Jump to navigation Jump to search

« previous | Monday, October 7, 2013 | next »

Lecture Slides

Tangents

LR Parsing

Three common algorithms to construct LR tables:

  • SLR(1) = LR(0) + FOLLOW
    • smallest class of grammars
    • smallest tables (number of states)
    • simple, fast construction
  • LR(1)
    • full set of LR(1) grammars
    • Largest number of states
    • Slow, large construction
  • LALR(1)
    • intermediate sized set of grammars
    • Same number of states as SLR(1)
    • canonical construction is slow and large
    • but better solutions exist

For example, LR(1) parser for ALGOL or PASCAL has 1000s of states, while SLR(1) or LALR(1) only has 100s

tangent about ALGOL and PASCAL; written by Niklaus Wirth.

Smaller tables are more desirable because it can fit better in the cache.

Viable Prefix

Formally, a prefix of a right-sentential form that:

  1. does not ontinue past the right end of the rightmost handle of that sentential form [1]
  2. can appear on the stack of a shift-reduce parser. That is, as long as the prefix represented by the stack is viable, the parser has not seen a detectable error.

A viable prefix is an invariant of shift-reduce parsing: the top of the stack always contains a viable prefix.


SLR(1)

Viable prefix of a right-sentential form:

  • contains both terminals and nonterminals
  • can be recognized with an NFA or a DFA.

To build SLR,

  • start with NFA
  • construct DFA
  • augment DFA with follow sets to disambiguate reductions

States in NFA are LR(0) items, whereas states in DFA are sets of LR(0) items.

LR(0) Items

LR(0) items are strings denoted as Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [\alpha]}

  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha} is a production from Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G} with a cursor • at some position
  • the cursor indicates how much of an item we have seen at a given state in the parse.

Examples:

  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= \cdot X Y Z]} indicates that the parser is looking for a string that can be derived from
  • indicates that the parser has seen a string derived from and is looking for one derivable from
  • The production rule generates 4 LR(0) items:
    1. Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= X \cdot YZ]}
    2. Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= XY \cdot Z]}
    3. Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= XYZ \cdot ]}
Canonical LR(0) Items

SLR(1) Table construction uses specific sets of LR(0) items (collectively called the canonical collection of sets of LR(0) items for a grammar Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G} , or just canonical collection for short)

The Canonical collection represents the set of valid states for the LR parser.

Two classes of items:

kernel items
Items where • is not at the left end of the RHS of a production rule
Add Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [S' ::= \cdot S]} as "identity"
non-kernel items
items where • is at the left end of the RHS

Closure

To generate a state, we compute its closure (all possible outcomes of the particular state):

  • If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= \alpha \cdot B \beta] \in I_j} , then in state Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle j} , the parser might next see a string derivable from Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B \beta} .
  • To form its closure, add all items of the form Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [B ::= \cdot \gamma] \in G} .

Given an item Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= \alpha \cdot B \beta]} , its closure contains the item itself and any other items that can generate legal substrings to follow Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha}

Therefore, if the parser has a viable prefix Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha} on its stack, the input should reduce to Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B\beta} (or Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \gamma} for some other item Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [B ::= \cdot \gamma]} in the closure).

Creating the closure of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I} :

closure(I) {
  bool new_item;
  do {
    new_item = false;
    for (Item [A ::= a.Bb] in I) {
      for (Production "B ::= g" in Gprime) {
        if ([B ::= .g] not in I) {
          I.add([B ::= .g]);
          new_item = true;
        }
      }
    }
  } while (new_item != false)
  return I;
}


SLR(1) uses lookahead to guide decision whether to shift or reduce

Goto

Let Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I} be a set of LR(0) items, and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} be a grammar symbol.

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(I,X)} is the set of all items Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \left[ A ::= \alpha X \cdot \beta \right]} such that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \left[ A ::= \alpha \cdot X \beta \right]} is in Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I} .

If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I} is the set of valid items for some viable prefix Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \gamma} , then Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(I,X)} is the set of valid items for the viable prefix Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \gamma X} .

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(I,X)} represents the state after recognizing Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} in state Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I} .

goto(I,X) {
  J = items [A ::= aX.b] such that [A ::= a.Xb] in I;
  J' = closure(J);
  return J';
}


Table Construction

Start construction of collection of sets of LR(0) items with Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [S' ::= \cdot S]} , where

  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S'} is the start symbol of the augmented grammar Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G'}
  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S} is the start symbol of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G}

To compute collection of sets of LR(0) items,

def items(G'):
  S0 = closure(set([S' ::= . S]))
  Items = set(S0)
  ToDo = set(S0)
  while not ToDo.empty():
    remove Si from ToDo
    for X in grammar symbols:
      Snew = goto(Si, X)
      if Snew is a new state:
        Items.append(set(Snew))
        ToDo.append(set(Snew))
  return Items


LR(0) Machines

  • states: canonical sets of LR(0) items
  • edges: goto transitions
  • recognizes all viable prefixes of handles
  • no lookahead

To recognize viable prefixes of the language (instead of handles), we must be able to reduce the handles to nonterminals.

Reducing a handle (RHS of production) to a nonterminal can be viewed as

  • returning to a state at beginning of handle (must use a stack!)
  • making transition on nonterminal


SLR(1) Tables

SLR(1) augments the LR(0) machine by adding FOLLOW information using one token of lookahead.

These are encoded as ACTION and GOTO tables.

ACTION Table:

  • for each [state, lookahead] pair,
  • have we reached the end of handle?
  • if not, "shift".
  • If at end of handle, "reduce" (by production rule)
  • "accept" and "error" are also valid actions
  • use lookahead to guide precision

GOTO Table:

  • For each [state, nonterminal] pair,
  • pick state togo to after reduction
  • look at nonterminal at top of stack.

Algorithm:

  1. Construct items(G')
  2. State Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle i} of the parser is constructed from Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I_i} .
    • If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= \alpha \cdot \mathtt{a} \beta] \in I_i} (a must be a terminal) and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle goto(I_i, \mathtt{a}) = I_j} , then set ACTION[i,a] to "shift Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle j} ".
    • If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [A ::= \alpha \cdot] \in I_i} , then set ACTION[i,a] to "reduce Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle A ::= \alpha} " for all a in Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{follow}(A)} .
    • If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [S' ::= S\cdot] \in I_i} , then set ACTION[i, eof] to "accept".
  3. If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(I_i, A) = I_j} , then set GOTO[i,A] to Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle j} .
  4. All other entries in ACTION and GOTO are set to "error"
  5. The initial state of the parser is the state constructed from the set containing the item Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [S' ::= \cdot S]} .

SLR(1) Parser Example

Grammar:

1  E ::= T + E
2      | T
3  T ::= id

Augmented Grammar:

0  S' ::= E
1  E ::= T + E
2      | T
3  T ::= id
Symbol first follow
S' { id } { eof }
E { id } { eof }
T { id } { +, eof }

States:

  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_0 = \left\{ [S' ::= \cdot E],\ [E ::= \cdot T + E],\ [E ::= \cdot T],\ [T ::= \cdot \mathtt{id}] \right\}}
  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_1 = \left\{ [S' ::= E \cdot] \right\}}
  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_2 = \left\{ [E ::= T \cdot + E],\ [E ::= T \cdot] \right\}}
  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_3 = \left\{ [T ::= \mathrm{id} \cdot] \right\}}
  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_4 = \left\{ [E ::= T + \cdot E],\ [E ::= \cdot T + E],\ [E ::= \cdot T],\ [T ::= \cdot \mathtt{id}] \right\}}
  • Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_5 = \left\{ [E ::= T + E \cdot] \right\}}

GOTO Construction iterations:

Start
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_0 \gets \mathrm{closure}({[S ::= \cdot E]})}
Iteration 1
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_0, E) = S_1}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_0,T) = S_2}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_0, \mathtt{id}) = S_3}
Iteration 2
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_2,+) = S_4}
Iteration 3
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_4, \mathtt{id}) = S_3}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_4, E) = S_5}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{goto}(S_4, T) = S_2}

ACTION and GOTO Tables:

  ACTION GOTO
  id + eof expr term
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_0} shift 3 1 2
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_1} accept
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_2} shift 4 reduce 2
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_3} reduce 3 reduce 3
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_4} shift 3 5 2
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_5} reduce 1


Potential Problems

If either of these happen, the grammar is not SLR(1)

Shift/Reduce Conflicts

ambiguous construct in the grammar (parser doesn't know whether to shift or reduce).

  • grammar can be modified to eliminate conflict
  • resolve in favor of shifting

Classic example: dangling else

Reduce/Reduce Conflicts

Another grammar ambiguity

  • often no simple resolution
  • parse a nearby language

Classic example: PL/I call and subscript ([])

Usually resolved with context

Footnotes

  1. If the grammar is unambiguous, and LR(k) grammars are generally unambiguous, there is a unique rightmost handle.