Excessive explanation, part nine

Last time we defined the grammar of our language “Exp”, which consists of identifiers, lambdas, function application and let expressions. This is a programming language. We also need a different language to formally describe types and “type schemes”. The paper says:

Note that types are absent from the language Exp. Assuming a set of
type variables α and of primitive types ι, the syntax of types τ and of
type-schemes σ is given by

τ ::= α | ι | τ → τ
σ ::= τ | ∀α σ

The high order bit here is that the language Exp is just expressions; we have no ability to declare new types in this language. So we assume for the purposes of type inference that there is an existing set of primitives — int, string, bool, whatever — that we can use as the building blocks. What those primitives are does not matter at all because the type inference algorithm never concerns itself with the difference between int and bool. The primitive types are meaningless to the type inference algorithm; all that matters is that we can tell them apart.

The first line in the grammar gives the syntax for types; a type tau can be one of three things.

A type can be a generic “type variable” — the T in List<T> in C#, for example — that we will notate alpha, beta, and so on.
A type can be one of a set of “primitive” types: int, string, whatever. Doesn’t matter.
The only other kind of type is “function which takes a type and returns a type”. That is, a function of one argument, with an argument type and a return type, is a type.

Just as they did with the Exp language, the authors have omitted parentheses from the language of types. Unlike with the Exp language, they neglected to call out that they were doing so.

The second line in the grammar gives the syntax for type schemes.

What’s a type scheme? Recall earlier that we said that a type scheme is a bit like a generic type. More formally, a type scheme is either just a plain type (including possibly a type variable), or “for any type alpha there exists a type such that…”

A type-scheme ∀α₁...∀α_n τ (which we may write ∀α₁...α_n τ) has generic
type variables α₁...α_n.

I’ve never liked “type variables” — I think variables should, you know, vary. The C# spec calls these type parameters, which is much more clear; a parameter is given a value by substituting an argument for it. But that’s the next episode!

Anyways, this sentence is just pointing out that for notational convenience we’ll omit the universal quantifiers in the cases where we have a large number of type variables in a row. But we will always have the quantifiers on the left hand side of the schema. A quantifier never appears in a type, only in a type schema.

Something important to realize here is that types and type schemes are just another syntax for a language, like “Exp”, our expression language. A “type” in this conception is simply a string of symbols that matches the grammar of our type language.

An interesting point to call out here: note that this type system has no generic “classes”. There is no “list of T” in this system. We’ve got primitive types, we’ve got function types, and we’ve got generic functions, and that’s it.

Again, this is because we don’t need anything more complicated than generic functions in order to make type inference interesting. If we can solve the problem of inference in a world with generic functions, we can very easily extend the inference to generic types. So let’s keep it simple and not introduce generic types into our core language.

A monotype μ is a type containing no type variables.

This just introduces a new jargon term, to distinguish generic types from non-generic types.

And that’s it for section two; that was a lot shorter than the introduction.

Next time: as promised, type substitution.

Fabulous adventures in coding

Eric Lippert's blog

Excessive explanation, part nine

1 thought on “Excessive explanation, part nine”

Leave a comment Cancel reply

Share this:

Related

1 thought on “Excessive explanation, part nine”

Leave a comment Cancel reply