CSCE 434 Lecture 18
« previous | Friday, October 4, 2013 | next »
Review of Top-Down LL(1) Parsing
Grammar is LL(1) if and only if, for all non-terminals , each distinct pair of productions and satisfy the condition
In other words, for a production rule
- all first-sets of all production rules are pairwise disjoint
- If , then , for all , .
Bottom-up (LR) Parsing
- start at leaves and use production rules to reduce to goal
- Key problem: Choosing RHS of production to use
- LR parsers use stack and lookahead to choose production.
- Lookahead is just extra decoration for state; each state has more information.
- LR is generally more powerful than LL
Starting with start symbol , any string such that is called a sentential form
- If contains only terminals, then is a sentence in
- If contains at least one non-terminal, then it is just a sentential form.
Bottom-up parser repeatedly matches RHS of a productino against a substring in the current right-sentential form.
In other words, if our grammar has , and we have at the top of our stack, then we can replace the with .
Example
Consider the grammar
1 <goal> ::= a <A> <B> e 2 <A> ::= <A> b c 3 | b 4 <B> ::= d
and the input sentence abbcde
Production | Sentential Form | Handle |
---|---|---|
— | abbcde |
3, 2 |
3 | a<A>bcde |
2, 4 |
2 | a<A>de |
4, 3 |
4 | a<A><B>e |
1, 4 |
1 | <goal> |
— |
Handles
Definition:
- a tuple containing the matching production rule and the position in the sentence where it matches.
- a substring of the current right-sentential form where matches some production and reducing to is one step in the reverse rightmost derivation
Invariant: The substring of a handle contains only terminal symbols.
Provable fact: If is unambiguous, then every right-sentential form has a unique handle.
Proof. is unambiguous, then rightmost derivation is unique. This implies that a unique production applied to take to . Next a unique position at which is applied. The result is a handle
Shift-Reduce Parsing
- Simple to understand
- Have simple, table-driven, shift-reduce skeleton
- grammatical knowledge encoded in tables
Algorithm
- Initialize stack with "$"
- Repeat the following until the top of the stack is the goal symbol and the input token is "end of file"
- Find the handle. If we don't have a handle, shift an input symbol onto the stack
- Prune the handle. If we have a handle on the stack, then reduce
- pop symbols off the stack
- push onto the stack
LR(1)
Everyone's favorite parser
Describe a proper superset of languages recognized by LL (predictive) parsers
Token token = next_token();
while (true) {
Token s = top of stack
if (action[s,token] == "shift s[i]") {
push token;
push s[i];
token = next_token();
} else if (action[s, token] == "reduce A ::= b") {
pop 2 * |b| symbols;
s = top of stack;
push A;
push goto[s,A];
} else if (action[s, token] == "accept") {
return;
} else error();
}