By contrast a type-scheme σ = ∀α_{1}...α_{m} τ has a "generic instance" σ' = ∀β_{1}...β_{n} τ' if τ' = [τ_{i}/α_{i}]τ for some types τ_{1},...,τ_{m} and the β_{j} are not free in σ.

This definitely needs an example.

Suppose we have the identity function `λx.x`. Its type scheme is `∀α α → α`. Now we might want to perform a substitution here of `int` for `α`. But we’ve only defined substitution on free variables, and we need to make a substitution on bound variables, to eliminate the bound variable entirely.

This mechanism of “generic instantiation” says: a type scheme consists of zero or more quantifiers and a type. Take just the type and perform a substitution on it, and then put as many quantifiers as you like before the substituted type, provided that the new quantifiers are not turning free variables into bound variables.

Let’s try it. Extract the type from the type scheme: `α → α`. We perform our substitution `[int/α]` on that type to get `int → int`. And then we put zero or more quantifiers on top of that. Let’s put zero quantifiers on top of it. So we have the original type scheme `∀α α → α` — that is, for all α, there’s a function from α to α. Now we want a specific generic instance of that, and it is `int → int`.

This is in a sense the sort of generic instantiation that you’re used to from C#. Notice that we’ve eliminated not just the type variable, but the quantifier on top of the type scheme.
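The recipe is mechanical enough to sketch in a few lines of Python. This is a toy model invented for illustration, not the paper’s algorithm: a type is a variable or primitive name, or a `("fun", argument, result)` tuple, and a scheme is a pair of quantified variables and a type; all of the names below are made up for the sketch.

```python
# Toy model of generic instantiation. A type is a variable or
# primitive name, or a ("fun", argument, result) tuple; a scheme
# is (frozenset_of_quantified_variables, type).

def substitute(t, subst):
    """Replace type variables in t according to subst, a dict from
    variable names to types."""
    if isinstance(t, tuple):
        return ("fun", substitute(t[1], subst), substitute(t[2], subst))
    return subst.get(t, t)

def generic_instance(scheme, subst, new_quantifiers=frozenset()):
    """Drop the quantifiers, substitute into the bare type, then put
    zero or more new quantifiers on top."""
    _quantifiers, body = scheme
    return (frozenset(new_quantifiers), substitute(body, subst))

identity = (frozenset({"α"}), ("fun", "α", "α"))   # ∀α α → α
print(generic_instance(identity, {"α": "int"}))
# (frozenset(), ('fun', 'int', 'int'))
```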

Now let’s look at a more complicated example.

Suppose we have some type scheme for some function, let’s say `∀β∀γ β → γ`. And let’s suppose we pass that thing to the identity function. The type scheme of the identity function is `∀α α → α`: the quantifier says that this works for any `α` we care to name, as long as the type provided for `α` is a type, not a type scheme. Remember, so far we’ve only described type substitution by saying that we’re going to replace a free variable with a type, but what we have is a type *scheme* with two bound type variables: `∀β∀γ β → γ`.

How do we deal with this? We’ll do our generic instantiation process again. We have type schemes `∀α α → α` and `∀β∀γ β → γ`. Take just the type of the first without the quantifiers: `α → α`. Now make the substitution with the type of the second without the quantifiers: `[β → γ / α]` to get the type `(β → γ) → (β → γ)`. Then slap a couple of quantifiers on top of that: `∀β∀γ (β → γ) → (β → γ)`.

We’ve produced an entirely new type scheme, but clearly this type scheme has the same “fundamental structure” as the type scheme for the identity function, even though this type scheme does not have the type variable of the original scheme, or even the same number of type variables; it’s got one more.

Why did we have to say that all of the introduced type variables must be “not free”? Well, let’s see what happens if we violate that constraint. Suppose we have type scheme `∀α β → α`. Clearly β is free.

We extract the type: `β → α`. We perform a substitution `[int/α]` to get `β → int`. And then we put a quantifier on top: `∀β β → int`. But that turns β from a free variable into a bound variable, so this is not a generic instance of a type scheme with a free variable β. Thus when creating generic instances we are restricted to introducing quantifiers only on non-free variables.
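That side condition is easy to check mechanically. Here is a small Python sketch using an invented toy encoding (a type is a variable or primitive name or a `("fun", argument, result)` tuple; a scheme is a pair of quantified variables and a type); the helper names are made up for this illustration:

```python
# Checking the side condition: freshly introduced quantifiers must
# not capture variables that were free in the original scheme.
# A type is a variable or primitive name, or ("fun", argument,
# result); a scheme is (frozenset_of_quantified_variables, type).

PRIMITIVES = {"int", "string", "bool"}

def type_vars(t):
    if isinstance(t, tuple):
        return type_vars(t[1]) | type_vars(t[2])
    return set() if t in PRIMITIVES else {t}

def free_vars(scheme):
    quantified, body = scheme
    return type_vars(body) - quantified

def quantifiers_ok(original_scheme, new_quantifiers):
    # The β_j must not be free in σ.
    return not (set(new_quantifiers) & free_vars(original_scheme))

sigma = (frozenset({"α"}), ("fun", "β", "α"))   # ∀α β → α; β is free
print(quantifiers_ok(sigma, {"γ"}))  # True: γ is not free in σ
print(quantifiers_ok(sigma, {"β"}))  # False: that would capture the free β
```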

In this case we shall write σ > σ'.

Conceptually this is pretty straightforward. Given two type schemes, possibly one is a generic instance of the other. If a scheme σ’ is a generic instance of a scheme σ then we say that σ’ is smaller than σ. This should make some sense given our examples so far. If you think of schemes as patterns that can be matched, the pattern `∀α α → α` is way more general than the pattern `int → int`, so we say that the latter is “smaller” — less general — than the former.

It is slightly problematic that the authors chose the “greater than” symbol instead of, say, the “greater than or equal” symbol. Why? Well, consider `∀α α → α` and `∀β β → β`. It should not be hard to see that each of these is a generic instance of the other, and therefore each is “smaller” than the other, which is silly. Plainly they are equally general!
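“Equally general” here just means the two schemes differ only in the names of their bound variables. One way to check that mechanically, sketched in Python with an invented toy encoding (a type is a name or a `("fun", argument, result)` tuple; a scheme is a pair of quantified variables and a type), is to rename the bound variables canonically in order of first appearance:

```python
# Two schemes that are generic instances of each other differ only
# in the names of their bound variables. Renaming bound variables
# canonically, in order of first appearance, makes the check trivial.

def canonical(scheme):
    quantified, body = scheme
    renaming = {}
    def walk(t):
        if isinstance(t, tuple):
            return ("fun", walk(t[1]), walk(t[2]))
        if t in quantified:
            renaming.setdefault(t, f"q{len(renaming)}")
            return renaming[t]
        return t
    return walk(body)

s1 = (frozenset({"α"}), ("fun", "α", "α"))  # ∀α α → α
s2 = (frozenset({"β"}), ("fun", "β", "β"))  # ∀β β → β
print(canonical(s1) == canonical(s2))  # True: equally general
```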

Note that instantiation acts on free variables, while generic instantiation acts on bound variables.

This line should now be clear.

It follows that σ > σ' implies S σ > S σ'.

Let’s look at a quick example just to make sure that’s clear. Suppose we have `σ = ∀α α → β` and `σ' = int → β`. Clearly σ > σ’. If we now make the substitution of string for β in both, then we have `∀α α → string`, which is still more general than `int → string`; clearly the greater-than relationship is preserved by substitution of free variables.

And that’s it for section 3! Next time, we’ll look at the formal semantics of the Exp language.


If S is a substitution of types for type variables, often written [τ_{1}/α_{1}, ... ,τ_{n}/α_{n}] or [τ_{i}/α_{i}] and σ is a type-scheme, then S σ is the type-scheme obtained by replacing each free occurrence of α_{i} in σ by τ_{i}, renaming the generic variables of σ if necessary.

OK, syntactic concerns first. A substitution is written as a bracketed list of pairs, where the first element of the pair is a type (possibly a type variable, but definitely not a type scheme!) and the second is a type variable. We can think of a substitution as a function that takes a type scheme and produces a new type scheme, so (possibly confusingly) we use the same notation as we do for applying a function to an argument in the Exp language.

Let’s look at an example. Suppose we have the type scheme `σ = ∀α α → (β → β)`, and the substitution `S = [int/β]`. β is free in σ, so `S σ` is `∀α α → (int → int)`.

Remember that we said last time that the free type variables were essentially type variables whose values were going to be provided by an outer context. Substitution of types for free variables is essentially a mechanism by which we can supply the value of the type variable.

Why “renaming the generic variables of σ if necessary”? Well, suppose we had σ as before but the substitution `S = [α/β]`, and we intended that α to be a different (that is, free) α than the (bound) α in σ. We cannot in this case say that `S σ` is `∀α α → (α → α)` because now every α is bound, which is not what we want. But we can say, hey, we’ll just trivially rename the bound α to γ, and now the result of the substitution is `∀γ γ → (α → α)` and all is well.
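That renaming step can be sketched in Python, again with an invented toy encoding (a type is a variable or primitive name or a `("fun", argument, result)` tuple; a scheme is a pair of quantified variables and a type). All the helper names here are made up for illustration; this is a sketch of the idea, not the paper’s definition:

```python
import itertools

# Toy sketch of capture-avoiding substitution on a scheme. A type is
# a variable or primitive name, or a ("fun", argument, result) tuple;
# a scheme is (frozenset_of_quantified_variables, type).

PRIMITIVES = {"int", "string", "bool"}

def type_vars(t):
    if isinstance(t, tuple):
        return type_vars(t[1]) | type_vars(t[2])
    return set() if t in PRIMITIVES else {t}

def substitute(t, subst):
    if isinstance(t, tuple):
        return ("fun", substitute(t[1], subst), substitute(t[2], subst))
    return subst.get(t, t)

def apply_to_scheme(scheme, subst):
    quantified, body = scheme
    # Substitution only touches free variables, so ignore any pairs
    # that mention a bound variable.
    subst = {v: t for v, t in subst.items() if v not in quantified}
    incoming = set()
    for t in subst.values():
        incoming |= type_vars(t)
    # Rename any bound variable that would capture an incoming
    # variable, picking fresh names not used anywhere else.
    fresh = (f"t{i}" for i in itertools.count())
    renaming = {v: next(n for n in fresh if n not in incoming | type_vars(body))
                for v in quantified if v in incoming}
    body = substitute(body, renaming)
    quantified = frozenset(renaming.get(v, v) for v in quantified)
    return (quantified, substitute(body, subst))

# σ = ∀α α → (β → β) with S = [α/β], where the substituted α is free:
sigma = (frozenset({"α"}), ("fun", "α", ("fun", "β", "β")))
print(apply_to_scheme(sigma, {"β": "α"}))
# (frozenset({'t0'}), ('fun', 't0', ('fun', 'α', 'α')))
```

The bound α gets renamed to a fresh name before the free α is substituted in, exactly as in the γ-renaming above.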

Then S σ is called an "instance" of σ;

Notice that since `S σ` is itself a type scheme, we are saying that “instance of” is a relation on type schemes: given any two type schemes, we should be able to tell if one is an instance of another, by checking to see if there is any substitution on free variables of one that produces the other, modulo renaming of bound variables.

the notions of substitution and instance extend naturally to larger syntactic constructs containing type-schemes.

In later episodes we’re going to have things that contain some number of type schemes, and we want to be able to say “apply this substitution to all of them” and produce a collection of instance type schemes.

Next time: we’ve seen that we can substitute any type for a free variable to get an instance of a schema. But this is insufficiently powerful; we need to be able to make substitutions on bound variables as well, eliminating the bound variable. Next time we’ll see how to do that.


delegate T MyDelegate<T, U>(T t, List<U> us);

then we can make a substitution of `int` for `T` and `double` for `U` to get the type `MyDelegate<int, double>`, which represents a method with signature

which represents a method with signature

int MyDelegate(int t, List<double> us);

Let’s see how the paper defines this idea. There are some subtleties that are not readily apparent in C#!

I’m just going to quote the entirety of section 3; it is only two paragraphs. But we’ll need to take this thing apart word by word to understand it.

3 Type instantiation If S is a substitution of types for type variables, often written [τ_{1}/α_{1}, ... ,τ_{n}/α_{n}] or [τ_{i}/α_{i}] and σ is a type-scheme, then S σ is the type-scheme obtained by replacing each free occurrence of α_{i} in σ by τ_{i}, renaming the generic variables of σ if necessary. Then S σ is called an "instance" of σ; the notions of substitution and instance extend naturally to larger syntactic constructs containing type-schemes. By contrast a type-scheme σ = ∀α_{1}...α_{m} τ has a "generic instance" σ' = ∀β_{1}...β_{n} τ' if τ' = [τ_{i}/α_{i}]τ for some types τ_{1},...,τ_{m} and the β_{j} are not free in σ. In this case we shall write σ > σ'. Note that instantiation acts on free variables, while generic instantiation acts on bound variables. It follows that σ > σ' implies S σ > S σ'.

Just… wow. We already know what type variables and type schemes are, but now we have type instances somehow distinct from generic type instances, and free variables somehow distinct from bound variables. And an ordering relation on type schemes has just been introduced.

Let’s start by saying what we mean by “free” and “bound” variables in the Exp language, and then say what we mean by free and bound variables in the type language.

Basically, by “bound” we mean that a variable is declared somewhere in the expression, and by “free” we mean it is used but not declared. So for example, in the expression `x y`, both `x` and `y` are free variables. The variable `q`, which does not appear, is not considered free in `x y`, but nor is it bound.

In the expression `let x = q in x y`, `x` is no longer a free variable; the name “x” is associated with a particular syntactic location: the first operand of the “let”. In this larger expression, `q` and `y` are free, as both are used but not defined.

Let’s now connect this with the type grammar. Suppose we have expression `λx.λy.x`. First off, let’s make sure that we understand what this is. It’s a function; all functions take a value and return a value. The outer function takes a value `x` and returns a function that takes a value `y`, ignores it, and returns `x`.

What is the type scheme of this expression? `x` and `y` can be any type, and they need not be the same. So let’s make two type variables and quantify both of them. The type scheme of this expression is `∀α ∀β α → (β → α)`. That is, it’s a function that takes any type alpha, and returns a function that takes any type beta and returns an alpha.

So now maybe it is clear what “bound” means in type schemes. In the type `α → (β → α)` found inside of that type scheme, both α and β are bound; they have quantifiers that introduce them. But what is “free”? Well, what’s the type scheme of the sub-expression `λy.x` found inside `λx.λy.x`? It’s not `∀α ∀β β → α`. Why not? Because `x` cannot take on any type whatsoever; it has to take on the type of the expression that was passed to the outer lambda! The type scheme of the inner lambda is `∀β β → α`, and α is a free type variable from the perspective of the inner lambda. A type variable γ which does not appear anywhere would be considered to be neither free nor bound in these expressions.

Maybe looking at this example more explicitly will help. An equivalent but more verbose expression is

let outer = λx.let inner = λy.x in inner in outer

If you’re finding this terse syntax hard to read, in C# this would be (ignoring that you can’t use var for lambdas!)

var outer = x => { var inner = y => x; return inner; };

So what is the type scheme of outer? `∀α ∀β α → (β → α)`. Both type variables are bound. What is the type scheme of inner? `∀β β → α`. α is free, β is bound.

In short, “free” really means “the meaning of this name will be provided by some outer context, not here”, and “bound” means “this name is given its meaning somewhere in here”, regardless of whether we’re talking about identifiers in the Exp language or type variables in the type language.
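Computing the free type variables of a scheme is simple enough to sketch in Python. The toy encoding here (a type is a variable or primitive name or a `("fun", argument, result)` tuple; a scheme is a pair of quantified variables and a type) is invented for illustration:

```python
# Free type variables of a scheme: the variables appearing in the
# type, minus the ones introduced by the scheme's own quantifiers.

PRIMITIVES = {"int", "string", "bool"}

def type_vars(t):
    if isinstance(t, tuple):  # ("fun", argument, result)
        return type_vars(t[1]) | type_vars(t[2])
    return set() if t in PRIMITIVES else {t}

def free_vars(scheme):
    quantified, body = scheme
    return type_vars(body) - quantified

outer = (frozenset({"α", "β"}), ("fun", "α", ("fun", "β", "α")))  # ∀α ∀β α → (β → α)
inner = (frozenset({"β"}), ("fun", "β", "α"))                     # ∀β β → α

print(free_vars(outer))  # set(): both variables are bound
print(free_vars(inner))  # {'α'}: α's meaning comes from the outer context
```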

Apropos of which: if you’re familiar with the C# specification for lambda expressions, in there we refer to the “free” variables of a lambda as the “outer” variables, which I think is a little more clear.

Next time, we’ll actually look at substitution, I promise.


Note that types are absent from the language Exp. Assuming a set of type variables α and of primitive types ι, the syntax of types τ and of type-schemes σ is given by

τ ::= α | ι | τ → τ
σ ::= τ | ∀α σ

The high order bit here is that the language Exp is just expressions; we have no ability to declare new types in this language. So we assume for the purposes of type inference that there is an existing set of primitives — int, string, bool, whatever — that we can use as the building blocks. What those primitives are does not matter at all because the type inference algorithm never concerns itself with the difference between int and bool. The primitive types are meaningless to the type inference algorithm; all that matters is that we can tell them apart.

The first line in the grammar gives the syntax for types; a type tau can be one of three things.

- A type can be a generic “type variable” — the T in List<T> in C#, for example — that we will notate alpha, beta, and so on.
- A type can be one of a set of “primitive” types: int, string, whatever. Doesn’t matter.
- The only other kind of type is “function which takes a type and returns a type”. That is, a function of one argument, with an argument type and a return type, is a type.

Just as they did with the Exp language, the authors have omitted parentheses from the language of types. Unlike with the Exp language, they neglected to call out that they were doing so.

The second line in the grammar gives the syntax for type schemes.

What’s a type scheme? Recall earlier that we said that a type scheme is a bit like a generic type. More formally, a type scheme is either just a plain type (including possibly a type variable), or “for any type alpha there exists a type such that…”

A type-scheme ∀α_{1}...∀α_{n} τ (which we may write ∀α_{1}...α_{n} τ) has generic type variables α_{1}...α_{n}.

I’ve never liked “type variables” — I think variables should, you know, vary. The C# spec calls these type parameters, which is much more clear; a parameter is given a value by substituting an argument for it. But that’s the next episode!

Anyways, this sentence is just pointing out that for notational convenience we’ll omit the universal quantifiers in the cases where we have a large number of type variables in a row. But we will always have the quantifiers on the left hand side of the schema. A quantifier never appears in a type, only in a type schema.

Something important to realize here is that types and type schemes are just another syntax for a language, like “Exp”, our expression language. A “type” in this conception is simply a string of symbols that matches the grammar of our type language.

An interesting point to call out here: note that this type system has no generic “classes”. There is no “list of T” in this system. We’ve got primitive types, we’ve got function types, and we’ve got generic functions, and that’s it.

Again, this is because we don’t need anything more complicated than generic functions in order to make type inference interesting. If we can solve the problem of inference in a world with generic functions, we can very easily extend the inference to generic types. So let’s keep it simple and not introduce generic types into our core language.

A monotype μ is a type containing no type variables.

This just introduces a new jargon term, to distinguish generic types from non-generic types.

And that’s it for section two; that was a lot shorter than the introduction.

Next time: as promised, type substitution.


Recall that we are going through the seminal paper on ML type inference line-by-line and explaining all the jargon and symbols. Last year we got through the introduction.

2 The language Assuming a set Id of identifiers x the language Exp of expressions e is given by the syntax e ::= x | e e' | λx.e | let x = e in e' (where parentheses may be used to avoid ambiguity).

This is the grammar of the extremely stripped-down version of ML that we’re going to type check. The language is called “Exp”. The ::= is just the separator between the grammar term “e” and all the kinds of things it can be, separated by bars.

So this is saying that an expression (“e”) can be:

- an identifier — `x` here is used as a stand-in for any identifier.
- a function call, where `e` is the function and `e'` is the argument, and both are expressions. Remember that in ML we pass arguments to functions by putting the function followed by a space followed by the argument.
- a function definition, where `λ` begins the function, `x` is an identifier that stands in for the parameter, the `.` separates the parameter from the body, and `e` is the body of the function, an expression. In conventional OCaml we would say `fun` instead of `λ` and so on. (And of course C# programmers now see why we call inline functions “lambda expressions”.) Recall that in ML all functions are functions of one argument; a function of two arguments is just a function of one argument that returns a function of one argument.
- a let expression; we define a new “variable” called `x`, assign the value of expression `e` to it, and then we can use `x` inside the body of expression `e'`. Of course the “variables” are not really variables; they are only assigned once. They’re named values. Notice that this implies that expressions are things that have values; we’ll get into that more later.

The authors do not bother to give the grammar for parenthesized expressions; you can work it out easily enough. Similarly they do not say what the parse of something like `a b c d` is; is that `((a b) c) d` or `a (b (c d))` or what? None of this is important for the purposes of type inference, so these details are glossed over.

Only the last clause extends the λ-calculus. Indeed for type checking purposes every let expression could be eliminated (by replacing x by e everywhere in e'), except for the important consideration that in on-line use of ML declarations let x = e are allowed, whose scope (e') is the remainder of the on-line session. As illustrated in the introduction, it must be possible to assign type-schemes to the identifiers thus declared.

The point here is that let-expressions are just a syntactic convenience; we could eliminate them as we did recursion and “if-then-else”. By “on-line session”, the authors mean the ML REPL. In the REPL, you can simply assign values to variables and those variables stick around for the remainder of the session, as though the remainder of the session was the body of the invisible “in”.

The mapping between a bunch of variables and their values is called an “environment”, and we’ll be talking a lot about environments in future episodes.

So that’s the programming language we’re going to be analyzing. Obviously it is a much smaller language than you’d typically use for line-of-business programming. There are no numbers, no strings, and so on. But if we can do type inference in this language, we can easily extend type inference to more generally useful languages.

Next time: we’ll define an entirely separate little language for describing types.


For simplicity, our definitions and results here are formulated for a skeletal language, since their extension to ML is a routine matter. For example recursion is omitted since it can be introduced by simply adding the polymorphic fixed-point operator fix : ∀α ((α → α) → α)

The high order bit is that ML has a lot of features, but that’s OK. We can come up with a greatly simplified core language, work out a type inference system for it, and then extend that easily to the whole language. But what is this about recursion? What is a polymorphic fixed-point operator?

What we want to do is establish that the type assignment algorithm does not have to understand recursion; we can in fact remove recursion from the core language entirely, and not change the program behaviour or the type analysis. Let’s see how.

First of all, a “fixed point” of a function `f` is a value `x` such that `f x` equals `x`. For example:

let f x = x (* Every possible value is a fixed point *)
let f x = x * x (* 0 and 1 are the only fixed points *)
let f x = x + 1 (* There are no fixed points *)
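We can spot-check those three claims with a quick brute-force search. A Python sketch: a fixed point of `f` is any `x` with `f(x) == x`.

```python
# A fixed point of f is any x with f(x) == x. Searching a small
# range of integers for the three examples above:

def fixed_points(f, candidates):
    return [x for x in candidates if f(x) == x]

candidates = range(-10, 11)
print(fixed_points(lambda x: x, candidates))      # every candidate
print(fixed_points(lambda x: x * x, candidates))  # [0, 1]
print(fixed_points(lambda x: x + 1, candidates))  # []
```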

OK. Now, consider this crazy-looking but clearly not-recursive function, which looks suspiciously like the recursive factorial function:

let crazy f n = if n < 2 then 1 else n * f (n - 1)

What is the type scheme of this thing? Let’s reason informally.

- Parameter `n` must be an integer because it is compared to an integer.
- `n - 1` is an integer which is passed to `f`, so `f` must be a function that takes an integer.
- The result of `f` is multiplied by an integer, so `f` must return an integer.
- `crazy` is a function of two arguments, which means that it is a function of one argument that returns a function of one argument.

In short, the type assignment algorithm must deduce:

crazy : (int → int) → (int → int)

Make sure that makes sense. This thing takes an `int → int` called `f` and returns an `int → int` whose argument is `n`. That’s just how we express a function of two arguments in this system.

The key question: we have a function `crazy` that, when considered as a function of one argument, returns the same kind of thing as its argument. Therefore there could be a fixed point. Is there a fixed point of this function?

This thing takes and returns an `int → int`. So a fixed point of `crazy`, if it exists, must be an `int → int`. Let’s suppose it exists, and call it, oh, I don’t know, let’s just pick any old name: `factorial`.

A fixed point is defined by the equality of two things, but in this case the two things are functions. How do we know that two functions `factorial` and `crazy factorial` are equal? That’s straightforward. For any number n we must have that `crazy factorial n` and `factorial n` are equal. That’s how we know that `crazy factorial` and `factorial` are equal, and therefore `factorial` is a fixed point of `crazy`.

Let’s try it and see. I’m going to write a recursive function `factorial`:

letrec factorial n = if n < 2 then 1 else n * factorial (n - 1)

Now as an exercise, mentally work out some examples and you will soon see that `crazy factorial n` and `factorial n` are always equal for any integer n. Therefore `factorial` is a fixed point of `crazy`.
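Rather than working the examples out mentally, here is a quick Python check, with `crazy` written as an ordinary two-parameter function rather than in curried form:

```python
# Checking that factorial is a fixed point of crazy: crazy factorial n
# and factorial n agree for every n we try.

def crazy(f, n):
    return 1 if n < 2 else n * f(n - 1)

def factorial(n):
    return 1 if n < 2 else n * factorial(n - 1)

print(all(crazy(factorial, n) == factorial(n) for n in range(20)))  # True
print([factorial(n) for n in range(6)])  # [1, 1, 2, 6, 24, 120]
```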

Now, let us suppose that magically we have a function `fix` that can take an `α → α` for any α and return a fixed point of that function, if it exists. (The fixed point will of course be of type α.) Now suppose we have this program which we wish to analyze:

letrec factorial n = if n < 2 then 1 else n * factorial (n - 1)

We can trivially translate that into this program:

let factorial =
  let crazy f n = if n < 2 then 1 else n * f (n - 1) in
  fix crazy

And hey, we have suddenly removed recursion from the language without changing the types of anything! Again, let’s reason informally about the type of this implementation of `factorial`:

- We have already argued that crazy has type scheme `crazy : (int → int) → (int → int)`.
- By assumption we have a polymorphic magic function that takes a function and returns its fixed point: `fix : ∀α ((α → α) → α)`.
- In particular this is true for the alpha that is `(int → int)`, and therefore there exists a function `fix : ((int → int) → (int → int)) → (int → int)`.
- Since `factorial` is just `fix crazy`, we deduce that `factorial : int → int`.
And we’re done; we have determined that factorial is a function from int to int without having to analyze any recursion.

The point here is: we never need to write a type analyzer that understands recursion any more deeply than simply rewriting a recursive function into a crazy non-recursive function with an extra argument on top that is a function, and then applying the fix-point combinator to that crazy non-recursive function.
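The whole rewrite can be played out in Python. One caveat: since Python, like ML, evaluates arguments eagerly, this sketch uses a strict fixed-point combinator (the η-expanded “Z” variant, which terminates under eager evaluation); everything here is illustrative, not the paper’s machinery:

```python
# The recursion-elimination rewrite, in Python. fix is a strict
# ("Z") fixed-point combinator: eta-expanding r(r) delays the
# self-application so that eager evaluation does not loop forever.

def fix(f):
    bizarre = lambda r: f(lambda n: r(r)(n))
    return bizarre(bizarre)

# The non-recursive "crazy" function, curried as in the ML version:
crazy = lambda f: lambda n: 1 if n < 2 else n * f(n - 1)

# factorial = fix crazy, with no recursive definition anywhere:
factorial = fix(crazy)
print(factorial(5))  # 120
```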

I said that `fix` was magical. We already know that some functions have no fixed point, some have multiple fixed points, and some have a single fixed point. I’m going to leave as an exercise for the reader that the following code is a good enough fixed-point combinator to take `crazy` as an argument and produce a factorial function as its result:

let fix f = let bizarre r = f (r r) in bizarre bizarre

This implementation of `fix` doesn’t actually type check in OCaml, by the way, but there are ways to get around that. We’ve digressed enough here though; moving on. The paper said that we don’t need to worry about recursion, and now we know why. Thus far our examples map and factorial have been recursive and included an if-then. Do we need to worry about type analysis of if-then? Continuing with the paper:

and likewise for conditional expressions.

Apparently not. We can remove “if then else” from ML as well without changing the type analysis. How? Suppose again we have:

let crazy f n = if n < 2 then 1 else n * f (n - 1)

Now add a magical function:

ifthenelse : ∀α bool → (dummy → α) → (dummy → α) → α

This magical function takes a bool and two functions, and magically invokes the first function if the bool is true, and the second function if the bool is false, and regardless of which one it invokes, it returns the result. The dummy parameter can be ignored; it’s only there because ML requires that all functions take one argument.

Imagine in C# for instance that there was no if-then-else and no ?: operator. If you had a function:

static T Conditional<T>(bool condition, Func<T> consequence, Func<T> alternative)

that did the right thing then you could implement

x = b ? c : a;

as

x = Conditional(b, ()=>c, ()=>a);

and the type inference engine would produce the same result.

And now we can make two nested functions “consequence” and “alternative” that each take one dummy parameter:

let factorial =
  let crazy f n =
    let consequence dummy = 1 in
    let alternative dummy = n * f (n - 1) in
    ifthenelse (n < 2) consequence alternative
  in fix crazy

and the type analysis of `crazy` will be just the same as before. We don’t actually need `if ... then ... else` in the type analyzer; it’s an unnecessary syntactic sugar.
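The same translation can be demonstrated in Python. Here `ifthenelse` is an ordinary function, and the thunks (functions of a dummy argument) ensure that only the chosen branch is ever evaluated; `fix` is a strict fixed-point combinator, since Python evaluates eagerly. All of this is a sketch, not the paper’s machinery:

```python
# Conditionals as a plain function: both branches are wrapped in
# thunks so that only the chosen branch is ever evaluated.

def ifthenelse(condition, consequence, alternative):
    return consequence(None) if condition else alternative(None)

def fix(f):  # strict ("Z") fixed-point combinator
    bizarre = lambda r: f(lambda n: r(r)(n))
    return bizarre(bizarre)

crazy = lambda f: lambda n: ifthenelse(
    n < 2,
    lambda dummy: 1,
    lambda dummy: n * f(n - 1))

factorial = fix(crazy)
print(factorial(6))  # 720
```

Without the thunks, eager evaluation would compute `f (n - 1)` on every call, even when `n < 2`, and the recursion would never bottom out; that is exactly why the dummy-parameter wrapping matters.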

The language we analyze will have neither recursion nor conditionals, but now you can see how easy it is to add them to the type analyzer; they’re both equivalent to special function patterns from the point of view of the type analyzer, so all we need to do is say that the rule for analyzing recursion or conditionals is to transform the program into a form with neither, and do the type analysis of the transformed program.

And with that we’ve gotten through the introduction! Next time — next year — we’ll dig into part two: the language.

See you then for more fabulous adventures!


letrec map f s = if null s then nil else cons(f(hd s)) (map f (tl s))

Knowing the type schemes of `null`, `nil`, and so on.

Let’s reason informally, just using ordinary logic rather than some formal system. You’d probably reason something like this:

- I know that map takes two arguments, f and s, from its declaration.
- Functions of two arguments in ML are actually functions that take one argument and return another function. Therefore I know that map has to take something of f’s type, and return a function.
- The returned function must take something of s’s type.
- The returned function must return something of a list type, because there’s a “nil” on one half of the “if” and a “cons” with two arguments on the other. (Obviously “if” in ML is `?:` in C#.)
- I know that s must be some kind of list, because it is passed to null.
- and so on; can you fill in the rest?

Of course, the whole point of this paper is to develop a formal algorithm for making these sorts of deductions. Which we’ll get to at some point, I’m sure!

Moving on; we’ve been using the phrases “type” and “type scheme” without defining either!

Types are built from type constants (bool, ...) and type variables (α, β, ...) using type operators (such as infixed → for functions and postfixed list for lists); a type-scheme is a type with (possibly) quantification of type variables at the outermost.

Here we’re just making a bit more formal what we mean by a type.

- We have a bunch of “built in” types like bool and int and string; they are types by fiat.
- We have generic type parameters (called here “type variables” though of course they are not variables in the sense of “storage of a value”).
- And we have operators that act on types; “list” can be thought of as a postfix operator on a type that produces a new type, and the arrow can be thought of as an infix operator that takes two types and produces a function type.
- “Quantification” is a use of the “for all” operator that introduces a type variable.
- “Outermost” means that the type variables are introduced as far to the left as possible.

Thus, the main result of this paper is that the type-scheme deduced for such a declaration (and more generally, for any ML expression) is a principal type-scheme, i.e. that any other type-scheme for the declaration is a generic instance of it. This is a generalisation of Hindley’s result for Combinatory Logic [3].

We’re re-stating the goal of the paper: that the algorithm presented here should find not just a correct type scheme, but the most general possible type scheme. By “generic instance” we mean of course that “string list” is a more specific instance of a scheme like “for any alpha, alpha list”.

Combinatory logic is the study of “combinators” — functions that take functions and return functions. You’ve probably heard that this kind of type inference is called “Hindley-Milner” inference, and now you know why.

ML may be contrasted with Algol 68, in which there is no polymorphism, and with Russell [2], in which parametric types appear explicitly as arguments to polymorphic functions. The generic types of Ada may be compared with type-schemes.

Nothing particularly interesting here; we’re just calling out that ML’s polymorphic type system is just one approach and that you might want to compare it to languages that try other approaches.

For simplicity, our definitions and results here are formulated for a skeletal language, since their extension to ML is a routine matter. For example recursion is omitted since it can be introduced by simply adding the polymorphic fixed-point operator fix : ∀α ((α → α) → α)

What we’re saying here is that we can present an algorithm for a very simple language, show that it is correct, and then easily extend that algorithm to a more complex language.

Now, you might think that surely it must be difficult to write an algorithm that produces a type for recursive methods like `map`; there seems to be a regression problem here, where we cannot infer types of a method that calls itself, because we’d need to know the type of the called method in order to work out the calling method, but they are the same method, so, hmm, how to do that?

Fortunately it turns out that you can always transform a recursive method into a non-recursive method, and then apply the type inference to the non-recursive version to deduce the type of the recursive version. We do this by using a fixed-point operator. Therefore the algorithm that will be presented does not have to worry about recursion.

Next time: what the heck is a fixed-point operator?


The type checker will deduce a type-scheme for map from existing type-schemes for null, nil, cons, hd and tl; the term type-scheme is appropriate since all these objects are polymorphic. In fact from

null : ∀α (α list → bool)
nil : ∀α (α list)
cons : ∀α (α → (α list → α list))
hd : ∀α (α list → α)
tl : ∀α (α list → α list)

will be deduced

map : ∀α ∀β ((α → β) → (α list → β list)).

One of the difficulties in reading papers like this for people who don’t already have an academic background is that so much notation is used before it is defined, or simply assumed. So let’s unpack this.

First off, polymorphic types (“generics” in C# parlance) are notated as `α list`

. That is, the type argument comes first, followed by the type being parameterized. This has always struck me as weird! You can think of a generic type as a kind of function that takes in type arguments and produces types. In ML we notate a function that takes in an argument and produces a value by putting the argument to the right of the function. But a generic type has its argument to the left. I’ve asked a number of ML experts why this is, and they usually say something about how it is “more natural” to say “string list”, but this is question-begging; it is only “more natural” because that’s what you’re used to seeing! VB solves the problem with “List of String”, which seems far more “natural” to me than “string list”.

Anyways, long story short, for polymorphic types in ML, the type argument comes to the left of the polymorphic type.

So what is this?

null : ∀α (α list → bool)

This is saying that `null`

is a generic function; its *type scheme* is to the right of the colon. The upside-down A is the universal quantifier, and is read “for all” or “for any”. The arrow means that this is a function that takes a thing of the left type and produces a thing of the right type.

So: `null`

has the type scheme: for any type alpha, there is a function called null that takes a list of alpha and produces a bool. In C# we’d say something like

static bool IsEmpty<T>(List<T> items)

What about

nil : ∀α (α list)

This thing has no arrow so it isn’t even a function. ML allows polymorphic *values*; this is saying that there exists an empty list of any element type, and it is called nil.

The head and tail functions should be straightforward; they take a list of alphas and produce either an alpha, (the head) or a list of alphas (the tail).
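To make these schemes concrete, here is a hypothetical transcription of four of the primitives into OCaml, using its built-in lists. The types OCaml infers are shown in comments, with `'a` playing the role of the ∀-quantified α:

```ocaml
(* null : 'a list -> bool *)
let null s = (s = [])

(* nil : 'a list -- a polymorphic value, not a function *)
let nil = []

(* hd : 'a list -> 'a *)
let hd s = match s with
  | x :: _ -> x
  | [] -> failwith "hd of empty list"

(* tl : 'a list -> 'a list *)
let tl s = match s with
  | _ :: r -> r
  | [] -> failwith "tl of empty list"
```

Note that no type annotations appear anywhere; the schemes are all inferred.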

But what is up with `cons`

? It should take *two* arguments; an item of type alpha and a list of alphas. And it should produce a list of alphas. Its scheme is:

cons : ∀α (α → (α list → α list))

What is up with this? It looks like it is saying that cons takes an alpha and produces a *function*. That function in turn takes a list of alphas and produces a list of alphas.

And indeed that is what it does. When you say

let append_hello = cons "hello"

then `append_hello`

is a function that takes a list of strings and returns that list with “hello” attached to the front. So

let x = cons "hello" mylist

is the same as

let append_hello = cons "hello"
let x = append_hello mylist

Functions always take one argument; a function of “two arguments” is actually a function of one argument that returns another function.
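Currying is easy to see in OCaml directly. Here is a sketch (spelling out `cons` as a function, rather than using the built-in `::` operator, to make the partial application visible):

```ocaml
(* cons : 'a -> 'a list -> 'a list *)
let cons x s = x :: s

(* Partial application: supplying only the first argument yields a
   function of type string list -> string list. *)
let append_hello = cons "hello"

(* x = ["hello"; "world"] *)
let x = append_hello ["world"]
```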

So what do we want to deduce as the type scheme for `map`

?

map : ∀α ∀β ((α → β) → (α list → β list))

The first argument to map must be a function from any type alpha to any type beta. That gives us back a function that takes a list of alphas and returns the list of betas which is the result of applying the given function to each alpha in the list. Somehow the type assignment algorithm needs to *deduce* this type scheme, given the schemes for all the functions called by the implementation.

Next time: we’ll go through the deductive steps needed to figure out the type of map.


After several years of successful use of the language, both in LCF and other research, and in teaching to undergraduates, it has become important to answer these questions — particularly because the combination of flexibility (due to polymorphism), robustness (due to semantic soundness) and detection of errors at compile time has proved to be one of the strongest aspects of ML.

ML and its descendants are indeed very pleasant languages. Parametric polymorphism, a sound type system, and good error detection mean that if your program type checks, odds are pretty good it is correct. That said, thirty years later the case is a bit overstated. I’m using OCaml every day and I’m pretty frustrated by the quality of the error messages. In a large complex program it can be hard to determine where the error really is when the type system tells you you’ve made a mistake.

But my purpose here is not to editorialize on the considerable merits of ML, but rather to explicate this paper. So onwards!

The discipline can be well illustrated by a small example. Let us define in ML the function map, which maps a given function over a given list — that is map f [x1; ...; xn] = [f(x1),...,f(xn)]

The map function is familiar to C# programmers by another name. In C# parlance it is:

static IEnumerable<R> Select<A, R>(
    IEnumerable<A> items,
    Func<A, R> projection)

That is, it takes a sequence of items and a projection function, and it returns a sequence of the items with the projection applied to them. We seek to deduce the same thing about the map function that we will define below: that it takes a sequence of this and a function from this to that, and returns a sequence of that.

The notation used in the paper could use some explanation.

First of all, function application is denoted in ML by stating the function (`map`

) and then the arguments separated by spaces. So `f`

is the first argument, and it is the projection function. The second argument is a list of items `x1`

through `xn`

. Lists are denoted a couple of different ways in ML; here we are using the syntax “surround the whole thing with square brackets and separate the items with semicolons”.

Plainly the thing on the right hand side is intended to be the list of items after the function has been applied to them, but I must confess that I am mystified as to why the authors of the paper have changed notations here! I would have expected this to say

map f [x1; ...; xn] = [f x1; ...; f xn]

in keeping with the standard ML syntax. Anyone care to hazard a guess? Is this a typo or is there some subtlety here?

The required declaration is

letrec map f s = if null s then nil
                 else cons(f(hd s)) (map f (tl s))

We are defining (“let”) the recursive (“rec”) function `map`

, which takes two arguments, `f`

, a function, and `s`

, a linked list. This is the standard recursive definition; first we consider the base case. `null`

is a function that takes a list and returns true if it is the empty list or false otherwise. If the list is empty then the result of mapping is an empty list, written here as `nil`

. If the list is not empty then it must have a head item, `hd s`

and a tail list `tl s`

. We apply the function to the head item, and recursively solve the problem on the tail, and then append the new head to the new tail using `cons`

.
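The whole declaration can be transcribed into OCaml and run as-is. In this sketch the five primitives are defined against OCaml’s built-in lists so that the example is self-contained; in the paper they are simply assumed to exist:

```ocaml
let null s = (s = [])
let nil = []
let cons x s = x :: s
let hd s = List.hd s
let tl s = List.tl s

(* The paper's letrec, with no type annotations anywhere. *)
let rec map f s =
  if null s then nil
  else cons (f (hd s)) (map f (tl s))
```

With no annotations at all, OCaml infers `map : ('a -> 'b) -> 'a list -> 'b list`, which is exactly the scheme ∀α ∀β ((α → β) → (α list → β list)) that the paper says should be deduced.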

Now remember, the problem that we’re trying to solve here is “what is the signature of `map`

?” Notice that we have no type information whatsoever in the parameter list. But if we already know the type signatures of the functions used here then we should be able to deduce the type signature.

Next time: given the types of all the parts of map, can we deduce the type of map?
