Excessive explanation, part eight

Happy new year all, well enough chit chat, more type inference!

Recall that we are going through the seminal paper on ML type inference line-by-line and explaining all the jargon and symbols. Last year we got through the introduction.


2 The language

Assuming a set Id of identifiers x the language Exp of expressions e
is given by the syntax

e ::= x | e e' | λx.e | let x = e in e'

(where parentheses may be used to avoid ambiguity).

This is the grammar of the extremely stripped-down version of ML that we’re going to type check. The language is called “Exp”. The ::= is just the separator between the grammar term “e” and all the kinds of things it can be, separated by bars.

So this is saying that an expression (“e”) can be:

an identifier — x here is used as a stand-in for any identifier.
a function call, where e is the function and e' is the argument, and both are expressions. Remember that in ML we pass arguments to functions by putting the function followed by a space followed by the argument.
a function definition, where λ begins the function, x is an identifier that stands in for the parameter, the . separates the parameter from the body, and e is the body of the function, an expression. In conventional OCaml we would say fun instead of λ and so on. (And of course C# programmers now see why we call inline functions “lambda expressions”.) Recall that in ML all functions are functions of one argument; a function of two arguments is just a function of one argument that returns a function of one argument.
a let expression; we define a new “variable” called x, assign the value of expression e to it, and then we can use x inside the body of expression e'. Of course the “variables” are not really variables; they only change once. They’re named values. Notice that this implies that expressions are things that have values; we’ll get into that more later.

The authors do not bother to give the grammar for parenthesized expressions; you can work it out easily enough. Similarly they do not say what the parse of something like a b c d is; is that ((a b) c) d or a (b (c d)) or what? None of this is important for the purposes of type inference, so these details are glossed over.

Only the last clause extends the λ-calculus. Indeed for type checking
purposes every let expression could be eliminated (by replacing x by
e everywhere in e'), except for the important consideration that in 
on-line use of ML declarations

let x = e

are allowed, whose scope (e') is the remainder of the on-line
session. As illustrated in the introduction, it must be possible
to assign type-schemes to the identifiers thus declared.

The point here is that let-expressions are just a syntactic convenience; we could eliminate them as we did recursion and “if-then-else”. By “on-line session”, the authors mean the ML REPL. In the REPL, you can simply assign values to variables and those variables stick around for the remainder of the session, as though the remainder of the session was the body of the invisible “in”.

The mapping between a bunch of variables and their values is called an “environment”, and we’ll be talking a lot about environments in future episodes.

So that’s the programming language we’re going to be analyzing. Obviously it is a much smaller language than you’d typically use for line-of-business programming. There are no numbers, no strings, and so on. But if we can do type inference in this language, we can easily extend type inference to more generally useful languages.

Next time: we’ll define an entirely separate little language for describing types.

Fabulous adventures in coding

Eric Lippert's blog

Excessive explanation, part eight

Leave a comment Cancel reply

Share this:

Related

Leave a comment Cancel reply