Implementing a Programming Language

From Notes
Jump to navigation Jump to search

Implementing a programming language involves undoing the abstractions of modern, high-level constructs and converting them into machine-reabable code. This process commonly involves the following steps:

Lexical Analysis

if (a == b) return;

is turned into a stream of tokens.

keyword['if']
symbol[')']
identifier['a']
symbol['==']
identifier['b']
symbol[')']
keyword['return']


Parsing

The stream of tokens is turned into an abstract syntax tree

if-statement
|-- statement
|   `-- equality operator
|       |-- identifier
|       |   `-- a
|       `-- identifier
|           `-- b
`-- expression
    `-- return statement


Type Checking

Make sure that parts of the syntax tree are being used correctly (from the bottom-up)

if-statement : OK
|-- conditional statement
|   `-- equality operator : bool
|       |-- identifier : int
|       |   `-- a
|       `-- identifier : int
|           `-- b
`-- expression
    `-- return statement : void


Optimization

Suppose a = 10 and b = 20-a: a == b will always return true, so the tree can be simplified to

if-statement : OK
|-- conditional statement
|   `-- true : bool
`-- expression
    `-- return statement : void

or more simply

return statement : void


Code Generation

  • abstract types are no more... everything is int, byte, float, etc.
  • variables become memory locations and the variable names become addresses (offsets)
  • statements become stack-based commands