CSCE 314 Lecture 12
« previous | Friday, September 23, 2011 | next »
YAAY! Dr. Järvi is back!!!
Languages
Defining a Language
- syntax
- form of a program
- how expressions, commands, declarations, etc. are put together to form the final program
- defines the set of valid programs
- lexical structure (how words are formed)
- phrasal structure (how sentences are formed from words)
- semantics
- meaning of a program
- defines how valid programs behave
- informal (documentation and colloquial)
- formal (operational, denotational, axiomatic)
Syntax
Context of Lexing and Parsers
- Defines Legal programs (programs that can be executed by the machine)
- Defined by grammar rules (how to make sentences out of words)
- Sentences are called statements, commands, expressions, terms
- Words are called tokens
- Grammars can't catch everything, so we use type-checking, etc
(E)BNF define context-free languages
After constructing a parse tree, reading the leaves should produce the original input string
Bakus-Naur Form
Common notation to define programming languages
Sample Grammar to define rules for robot commands {up, down, left, right}
<move> ::= <command> | <command> <move> <command> ::= up | down | left | right
Extended Bakus-Naur Form
<x> nonterminal <x> ::= Body <x> defined by Body <x> <y> sequence of <x> and <y> {<x>} zero or more of <x> [<x>] optional <x>
Parse Trees
Concrete: exactly what the grammar gives, including all punctuations and groupings.
Abstract: Simplification of the concrete syntax tree to represent only whath we're interested in (turns deep, narrow tree into a flat one)
Ambiguity
Sequence is part of the language, but can result in multiple syntax trees.
Example: Parse the input 1 - 2 - 3
<expr> ::= <num> | <expr> '-' <expr>
To resolve ambiguity in above case (and similar cases) make associativity rules
<expr> ::= <num> | <expr> '-' <num>
results in (1-2)-3 in parse tree
For different operators, use precedence to have correct interpretation:
<expr> ::= <sum> { == <sum> } <sum> ::= <term> { + <term> } <term> ::= <num> { * <num> }
Chomsky Hierarchy
- Regular Grammars: A → aB
- Context-free Grammars: A → γ
- Context-sensitive Grammars: αAβ → αγβ, γ ≠ ε
- Unrestricted Grammars
Ordered from least expressive (but fastest to parse) to most expressive (slowest to parse). The first two have most importance in programming languages.