CSCE 434 Lecture 10
« previous | Monday, September 16, 2013 | next »
Finite Automata
Deterministic (DFA)
Scanners are implemetned in deterministic [1] finite state machines (dfas)
dfa for even no of 0s and 1s:
Equivalent Regular Expression: (00|11)*((01|10)(00|11)*(01|10)(00|11)*)*
dfas can be FAST! (when implemented on the computer)
Nondeterministic (NFA)
DFAs cannot recognize the empty string: The regular expression (a|b)*abb has multiple transition choices for a, and it could start with the empty String:
Such regular expressions can be recognized by a nondeterministic finite automaton (nfa)
A nfa accepts iff there is some path to an accept state.
DFAs are special cases of NFAs
- NFAs can simulate DFAs (obviously)
- DFAs can simulate NFAs (but can get very big)
Cycle of construction:
- Start with a Regular Expression
- NFA with ε moves (build NFA for each term and connect with epsilon moves)
- NFA (coalesce identical states)
- DFA (construct smulation; the subset construction)
- Construct regular expression (nontrivial!)
NFA to DFA Algorithm
Input: nondeterministic finite automaton N Output: equivalent (accepts same language) deterministic finite automaton D
let be a state in NFA and let a set of states. W e define the following procedures:
// set of NFA states reachable from NFA state s on epsilon-transitions alone
Set<State> epsilon_closure(State s);
// set of NFA states reachable from any NFA state s in T on on epsilon-transitions alone
Set<State> epsilon_closure(Set T);
// set of NFA states to which there is a transition on input symbol a from some NFA state s in T
Set<State> move(Set T, Token a);
State start = epsilon_closure(s[0]);
add Start unmarked to Dstates;
while (there exists an unmarked state T in Dstates) {
T.mark()
for (Token a : tokens) {
Set U = epsilon_closure(move(T,a))
if (U not in Dstates) {
add U to Dstates unmarked;
}
Dtran[T,a] = U;
}
}
in DFA becomes a state that corresponds to a set of states in the NFA (coalescence).
- There can be up to possible states in .
- A state in is an accepting state if one or more of the states it represents in is accepting
Minimum State DFAs
Every regular language recognized by a minimum-state DFA that is unique (up to state names).
Minimization algorithm:
- Construct initial partition of states into accepting and non-accepting states
- Successively refine partition by splitting a group into smaller groups if states in have transitions to different groups.
- Update transition edges, remove dead (unreachable) states.
Next time, parsers!
Footnotes
- ↑ deterministic = not random; for each input, there is a single choice to go to the next state