Category:CSCE 434 Exam 1

From Notes
Jump to navigation Jump to search
Review Slides

« previous | Friday, October 25, 2013 | next »

Location HRBB 126
Date Friday, October 25, 2013
Time 11:30–12:20

Lexical Analysis (Scanning)

Syntax Analysis (Parsing)

LL Parsing

  • Top-down parsers
  • build parse tree from root to leaves

LR Parsing

  • Bottom-up parsers
  • build parse tree from leaves to root

LR(1)

  • scan input from left to right (L)
  • build rightmost derivation in reverse (R)
  • use a single token lookahead to disambiguate (1)
  • Simple, Table-driven, shift-reduce skeleton
  • grammatical knowledge encoded in (parse lookup) tables
  • In general, LR parsers are practical, efficient, and easy to build

Skeleton Parser code:

def lr_parse(toks):
    stack = Stack();
    tokiter = iter(toks)
    tok = tokiter.next()

    while True:
        curr_state = stack.top()

        # Possible values are:
        #   ("shift", state_id),
        #   ("reduce", production{lhs,rhs}), or
        #   ("accept", None)
        behavior, payload = action[curr_state, tok]

        if behavior == "shift":
            next_state = payload

            stack.push(tok)
            stack.push(next_state)
            tok = tokiter.next()

        elif behavior == "reduce":
            A = payload['lhs']
            b = payload['rhs']

            stack.pop(2*len(b))
            curr_state = stack.top()
            stack.push(A)
            push goto[curr_state, A]

        elif behavior == "accept"
            return

        else:
            error()

This has a running time of , where is the number of shifts (the length of the input string), and is the number of reduces (depends on the grammar).


Example Tables

The following grammar:

1 <goal>   ::= <expr>
2 <expr>   ::= <term> + <expr>
3            | <term>
4 <term>   ::= <factor> * <term>
5            | <factor>
6 <factor> ::= id

gets translated into the following action and goto tables:

  action goto
id + * $ (eof)
(shift, )
(accept, _)
(shift, ) (reduce, )
(reduce, ) (shift, ) (reduce, )
(reduce, ) (reduce, ) (reduce, )
(shift, )
(shift, )
(reduce, )
(reduce, ) (reduce, )


First and Follow Sets

Used to build LR(1) tables.

First Set

Given a terminal or non-terminal grammar symbol , the first set is the set of tokens (terminal symbols) that are the first symbol in all possible strings derivable from .

If , then is a member of

Building instructions:

  1. If is a terminal, then .
  2. If , then .
  3. If , then put in (i.e. ).
  4. In the rule above, suppose the first nonterminals all have in their first-sets , this tells us that if , then is a valid member of and each

Note: The definition given here is suitable for LR(1) grammars. In general for LR(k), we define as the leading tokens (not just 1) that begin strings derived from .

Follow Set

Given a non-terminal grammar symbol , the follow set is the set of terminals that can appear immediately after :

Suppose both and appear as valid sentential forms (RHS of some other productions). Then and are both members of the follow set of .

Building instructions:

  1. Place eof in (a.k.a ... same concept)
  2. If , then all members of —except —are also in .
  3. If , then anything that follows can follow (i.e. put all members of in )
  4. (A combination of the above two rules) If and , then all members of are in .
Example

In the grammar above, the first and follow sets of each symbol are:

Symbol first follow


LR(k) Items

Table construction algorithms use LR(k) items to represent the set of possible states in a parse.

an LR(k) item is a pair where

  • is a production rule from with a at some position in the RHS
  • is a lookahead string containing symbols (terminals or eof)

(two cases of interest here are when and :

  • LR(0) items play a key role in SLR(1) table construction
  • LR(1) items play a role in LR(1) and LALR(1) construction

A sentential form is a partially expanded parse (mixed terminals and nonterminals) that might appear as the RHS of a production rule, for example. In particular, a right-sentential form is a sentential form in which only the right side has been expanded.

A handle is a substring in a rule. For example, (or simply just ) provides a handle for since could be reduced to form .

A viable prefix is the prefix of a right sentential form that could potentially appear on the stack of a shift-reduce parser. In other words, it does not continue past the end of the handle for that sentential form. Note that in the example above, must consist only of terminal symbols, and viable prefixes consist of initial substrings of .

Table Construction Algorithms

  1. SLR(1)
    • grammar space: smallest class of grammars
    • table size: smallest number of states
    • performance: simple, fast construction
  2. LR(1)
    • grammar space: full set of LR(1) grammars
    • table size: largest number of states
    • performance: slow, large construction
  3. LALR(1)
    • grammar space: intermediate sized set of grammars
    • table size: same as SLR(1) — small
    • performance: canonical construction is slow and large

Better construction techniques exist

Media in category "CSCE 434 Exam 1"

The following 13 files are in this category, out of 13 total.