CSCE 314 Lecture 12

From Notes
Jump to navigation Jump to search

« previous | Friday, September 23, 2011 | next »


YAAY! Dr. Järvi is back!!!

Languages

Defining a Language

syntax
form of a program
how expressions, commands, declarations, etc. are put together to form the final program
defines the set of valid programs
lexical structure (how words are formed)
phrasal structure (how sentences are formed from words)
semantics
meaning of a program
defines how valid programs behave
informal (documentation and colloquial)
formal (operational, denotational, axiomatic)

Syntax

Context of Lexing and Parsers

  • Defines Legal programs (programs that can be executed by the machine)
  • Defined by grammar rules (how to make sentences out of words)
    • Sentences are called statements, commands, expressions, terms
    • Words are called tokens
    • Grammars can't catch everything, so we use type-checking, etc

(E)BNF define context-free languages

After constructing a parse tree, reading the leaves should produce the original input string

Bakus-Naur Form

Common notation to define programming languages

Sample Grammar to define rules for robot commands {up, down, left, right}

<move>    ::= <command> | <command> <move>
<command> ::= up | down | left | right

Extended Bakus-Naur Form

<x> nonterminal
<x> ::= Body <x> defined by Body
<x> <y> sequence of <x> and <y>
{<x>} zero or more of <x>
[<x>] optional <x>

Parse Trees

Concrete: exactly what the grammar gives, including all punctuations and groupings.

Abstract: Simplification of the concrete syntax tree to represent only whath we're interested in (turns deep, narrow tree into a flat one)

Ambiguity

Sequence is part of the language, but can result in multiple syntax trees.

Example: Parse the input 1 - 2 - 3

<expr> ::= <num> | <expr> '-' <expr>

To resolve ambiguity in above case (and similar cases) make associativity rules

<expr> ::= <num> | <expr> '-' <num>

results in (1-2)-3 in parse tree

For different operators, use precedence to have correct interpretation:

<expr> ::= <sum> { == <sum> }
<sum> ::= <term> { + <term> }
<term> ::= <num> { * <num> }


Chomsky Hierarchy

  1. Regular Grammars: A → aB
  2. Context-free Grammars: A → γ
  3. Context-sensitive Grammars: αAβ → αγβ, γ ≠ ε
  4. Unrestricted Grammars

Ordered from least expressive (but fastest to parse) to most expressive (slowest to parse). The first two have most importance in programming languages.