CSCE 434 Lecture 9

From Notes
Jump to navigation Jump to search

« previous | Friday, September 13, 2013 | next »

Lecture Slides

TODO: Hennesey, Patterson; Memory Hierarchy

Finite State Machines

Compact way of representing transformation rules of context-free grammar

3 basic states:

  1. Initial / starting state
  2. Accept (valid string)
  3. Reject (errors)

Deterministic Finite Automaton (DFA) code for scanner:

char c = next_char();
state = S0;  // start state code
done = false;
token_value = "";
while(not done) {
    int class = char_class[c];
    int state = next_state[class, state];
    switch (state) {
        case S1:  // building an id
            token_value += c;
            c = next_char();
            break;
        case S2:  // accept case
            token_type = IDENTIFIER;
            done = true;
            break;
        case S3:  // error
            token_type = ERROR;
            done = true;
            break;
    }
    return token_type;
}

Table rows can be used to represent state transformation (input states → output vals and state), as in sequential logic from CSCE 312

We use case statements instead of conditionals here because machine can use JUMP statements (single evaluation) to go directly to a state instead of evaluating a predicate multiple times. Even better, GOTO:

S1: setup()
    do_something()
    if (condition)
        goto S2
    else
        goto S3
S2: setup()
    do_something_else()
    if (condition)
        goto S1

S3: setup
    do_yet_another_thing()
    // ...

Problematic features:

  • reserved keywords
  • significant blanks (whitespace)
  • string constants and special chars in strings
  • finite closures (limitations on identifier lengths)


Lexical Errors

  • Report it?
  • Correct it? (it is possible)
  • Ignore and Recover?

Techniques

  • minimum distance correction (did you mean?)
  • hard token recovery (reset state after getting error)

Scanner Generators

  1. Construct a DFA
  2. Minimize states (automaton optimization, e.g. merge identical states)
  3. Emit code/table