Lecture 04

Theory and Design of PL (CS 538)

February 3, 2020

Lambda calculus basics

Some terminology

You may see several different names:
- Programs
- Expressions
- Terms
For lambda calculus, these all mean the same thing

“Running” a lambda expression

Given a lambda calculus program, how to run it?

Figure out where parentheses go
Substitute fn argument into fn body
Repeat until we reach a value

1. Figure out where parentheses go

Function application is left-associative
Example: e_1~e_2~e_3 means (e_1~e_2)~e_3
- Read: call e_1 with e_2, then with e_3
Not the same: e_1~(e_2~e_3)
- Read: call e_2 with e_3, then call e_1

2. Substitute argument into body

Example: (\lambda x.e)~v where v is a value
Replace all* x’s in e with v, remove \lambda x.
- Read: call function with argument v
Example: (\lambda x.x + 1)~5
- Replace x with 5, remove \lambda x.
- Result: 5 + 1, steps to 6

3. Keep substituting until done

Order: outside-to-inside
Operate on left-most term until it is \lambda x. e
Turn to argument (right-most term)
- If eager evaluation, operate on argument
- If lazy evaluation, substitute argument into e
Never substitute “under” lambdas
- Don’t substitute for y: \lambda x. ((\lambda y. e_1)~x)

Let’s do an example

Start: ((\lambda a. a)~\lambda b. \lambda c. \lambda d. d~b~c)~1~2~\lambda x.\lambda y. x + y

{= ((((\lambda a. a)~\lambda b. \lambda c. \lambda d. (d~b)~c)~1)~2)~(\lambda x.\lambda y. x + y)}
- \to (((\lambda b. \lambda c. \lambda d. (d~b)~c)~1)~2)~(\lambda x.\lambda y. x + y)
- \to ((\lambda c. \lambda d. (d~1)~c)~2)~(\lambda x.\lambda y. x + y)
- \to (\lambda d. (d~1)~2)~(\lambda x.\lambda y. x + y)
- \to ((\lambda x.\lambda y. x + y)~1)~2
- \to (\lambda y. 1 + y)~2
- \to 1 + 2 \to 3

Free versus bound variables

Free variable: introduced by outer \lambda
Bound variable: not introduced by outer \lambda
Example: z~\lambda x. z + x
- x is a bound variable (under \lambda x)
- z is a free variable (not under \lambda z)
When substituting, only replace bound variable
Example: (\lambda x. (\lambda x. x + 1))~5 steps to \lambda x. x + 1
- Inner x bound by the inner \lambda x, not the outer one

Specifying program behaviors

Help compiler writers

For real languages: multiple implementations
- C/C++: gcc, clang, icc, compcert, vc++, …
- Python: CPython, Jython, PyPy, …
- Ruby: YARV, JRuby, TruffleRuby, Rubinius, …
Should agree on what programs are supposed to do!

Design optimizations

Compilers use optimizations to speed up code
- Loops: fission and fusion, unrolling, unswitching
- Common subexpression, dead code elimination
- Inlining and hoisting
- Strength reduction
- Vectorization
- …
Optimizations shouldn’t affect program behavior!

Prove programs satisfy certain properties

Before we can prove anything about programs, we first need to formalize what programs do
Example: equivalence
- Which programs are equivalent?
- Which programs aren’t equivalent?

How to specify behaviors?

Program semantics

Ideal goal: describe programs mathematically
- Aiming for a fully precise definition
But: no mathematical model is perfect
- Programs run on physical machines in real life
Challenge: which aspects should we model?

Many approaches

Denotational semantics
- Translate programs to mathematical functions
Axiomatic semantics
- Analyze pre-/post-conditions of programs
Operational semantics
- Model how programs step

Principle: program behavior should be defined by behavior of its components

Operational semantics

Programs make steps

Model how a program is evaluated
Benefits:
- Closer correspondence with implementation
- General: most programs “step”, in some sense
Drawbacks:
- A lot of details, models all the steps
- Overkill if we just care about input/output

Values and expressions

Programs may or may not be able to step
Can step: redexes (reducible expresisons)
Can’t step:
- Values: valid results
- Stuck terms: invalid results (“runtime errors”)

In lambda calculus

Values: these things do not step, they are done

val = 𝔹 | ℤ | var | λ var . expr

Expressions: these things may step

expr = 𝔹 | ℤ
     | λ var . expr | expr expr
     | add(expr, expr) | sub(expr, expr)
     | and(expr, expr) | or(expr, expr)
     | if expr then expr else expr | ...

Stuck terms: not values, but can’t step (error)
- true 1
- 1 + false

How to define operational semantics?

Want to define new relations

R(e, v): “Program e steps to value v”
S(e, e'): “Program e steps to program e'”
As PL designer: we get to define R and S
- But what does a definition look like?

Inference rules

Basic idea: we write down a set of inference rules
Components of a rule
- Above the line: zero-or-more assumptions
- Below the line: one conclusion
Meaning of a rule
- If top thing(s) hold, then bottom thing holds
- If no top things: bottom thing holds

Example: IsDouble

Big-step semantics

Idea: describe program result

Useful for language specifications
Don’t describe intermediate steps
Write e \Downarrow v if program e evaluates to value v

Language designer defines when e \Downarrow v

Example

How to apply functions?

Eager evaluation
- If e_1 \Downarrow λx. e_1', and
- If e_2 \Downarrow v, and
- If e_1'[x \mapsto v] \Downarrow v',
- Then: e_1~e_2 \Downarrow v'

How to apply functions?

Lazy evaluation
- If e_1 \Downarrow λx. e_1', and
- If e_1'[x \mapsto e_2] \Downarrow v,
- Then: e_1~e_2 \Downarrow v

In Haskell?

Recall tuple and non-terminating functions:

fst (x, y) = x
snd (x, y) = y

loopForever x = loopForever x -- never terminates

What if we try to project from a bad tuple?

badFst = fst (loopForever 42, 0) + 1 -- Never returns
badSnd = snd (loopForever 42, 0) + 1 -- Returns 1!

Eager evaluation

When passing arguments to function, first evaluate argument all the way
Also known as call-by-value (CBV)
If argument doesn’t terminate, then function call doesn’t terminate

badFst = fst (loopForever 42, 0) + 1 -- Never returns under CBV
badSnd = snd (loopForever 42, 0) + 1 -- Never returns under CBV

Lazy evaluation

Only evaluate arguments when they are needed
Also known as call-by-name (CBN)
This is Haskell’s evaluation order

badFst = fst (loopForever 42, 0) + 1 -- Never returns under CBN
badSnd = snd (loopForever 42, 0) + 1 -- Returns 0 under CBN

Fun with laziness

Can write various kinds of infinite data
Values are computed lazily: only when needed

lotsOfOnes :: [Int]
lotsOfOnes = 1 : lotsOfOnes  -- [1, 1, ...

firstOne   = head lotsOfOnes -- Returns 1

onesAndTwos :: [Int]         -- [1, 2, 1, 2, ...
onesAndTwos = x where x = 1 : y
                      y = 2 : x

firstTwo = head $ tail onesAndTwos -- Returns 2

fibonacci :: [Int]           -- [1, 1, 2, 3, ...
fibonacci = 1 : 1 : zipWith (+) fibonacci (tail fibonacci)

Small-step semantics

Idea: describe program steps

More fine-grained, helpful for implementation
If e steps to e' in one step, write: e \to e'
If e steps to e' in zero or more steps: e \to^\ast e'

Example

Recursion

Fixed point operation

Idea: special expression for recursive definitions
Should allow definition to “make recursive call”
Fixed point expression: defined in terms of itself

expr = ... | fix var . expr

How does this evaluate?

In \text{fix } f.~e:
- The variable f represents recursive call
- The body e can make recursive calls via f
Small-step: \text{fix } f.~e \to e[f \mapsto \text{fix } f.~e]
Big-step:
- If e[f \mapsto \text{fix } f.~e] \Downarrow v,
- Then: \text{fix } f.~e \Downarrow v

How to use this thing?

Suppose: want to model factorial function:

factorial 0 = 1
factorial n = n * factorial (n - 1)

We can model as the following expression: \text{factorial} = \text{fix } f.~λn.~\text{if } n = 0 \text{ then } 1 \text{ else } n * (f~(n - 1))

Testing it out

Evaluating \text{factorial}~5:
- {\to [λn.~\text{if } n = 0 \text{ then } 1 \text{ else } n * ((\text{fix } f ...)~(n - 1))]~5}
- {\to \text{if } 5 = 0 \text{ then } 1 \text{ else } 5 * ((\text{fix } f ...)~(5 - 1))}
- \to^\ast 5 * ((\text{fix } f ...)~4)
- …
- \to^\ast 5 * 4 * 3 * 2 * (\text{if } 0 = 0 \text{ then } 1 \text{ else } ...)
- \to 5 * 4 * 3 * 2 * 1 \to^\ast 120