Continuing with my excessively detailed explanation of type inference in ML…
After several years of successful use of the language, both in LCF and other research, and in teaching to undergraduates, it has become important to answer these questions — particularly because the combination of flexibility (due to polymorphism), robustness (due to semantic soundness) and detection of errors at compile time has proved to be one of the strongest aspects of ML.
ML and its descendants are indeed very pleasant languages. Parametric polymorphism, a sound type system, and good error detection mean that if your program type checks, odds are pretty good it is correct. That said, thirty years later the case is a bit overstated. I’m using OCaml every day and I’m pretty frustrated by the quality of the error messages. In a large complex program it can be hard to determine where the error really is when the type system tells you you’ve made a mistake.
But my purpose here is not to editorialize on the considerable merits of ML, but rather to explicate this paper. So onwards!
The discipline can be well illustrated by a small example. Let us define in ML the function map, which maps a given function over a given list — that is map f [x1; ...; xn] = [f(x1),...,f(xn)]
The map function is familiar to C# programmers by another name. In C# parlance it is:
static IEnumerable<R> Select<A, R>( IEnumerable<A> items, Func<A, R> projection)
That is, it takes a sequence of items and a projection function, and it returns a sequence of the items with the projection applied to them. We seek to deduce the same thing about the map function that we will define below: that it takes a sequence of this and a function from this to that, and returns a sequence of that.
The notation used in the paper could use some explanation.
First of all, function application is denoted in ML by stating the function (
map) and then the arguments separated by spaces. So
f is the first argument, and it is the projection function. The second argument is a list of items
xn. Lists are denoted a couple of different ways in ML; here we are using the syntax “surround the whole thing with square brackets and separate the items with semicolons”.
Plainly the thing on the right hand side is intended to be the list of items after the function has been applied to them, but I must confess that I am mystified as to why the authors of the paper have changed notations here! I would have expected this to say
map f [x1; ...; xn] = [f x1; ...; f xn]
in keeping with the standard ML syntax. Anyone care to hazard a guess? Is this a typo or is there some subtlety here?
The required declaration is letrec map f s = if null s then nil else cons(f(hd s)) (map f (tl s))
We are defining (“let”) the recursive (“rec”) function
map, which takes two arguments,
f, a function, and
s, a linked list. This is the standard recursive definition; first we consider the base case.
null is a function that takes a list and returns true if it is the empty list or false otherwise. If the list is empty then the result of mapping is an empty list, written here as
nil. If the list is not empty then it must have a head item,
hd s and a tail list
tl s. We apply the function to the head item, and recursively solve the problem on the tail, and then append the new head to the new tail using
Now remember, the problem that we’re trying to solve here is “what is the signature of
map?” Notice that we have no type information whatsoever in the parameter list. But if we already know the type signatures of the functions used here then we should be able to deduce the type signature.
Next time: given the types of all the parts of map, can we deduce the type of map?
“…but I must confess that I am mystified as to why the authors of the paper have changed notations here!”
map f [x1; …; xn] = [f(x1),…,f(xn)]
Could it be that the author makes a distinction between list syntax and the syntax of list elements that are function arguments? That is to say that the author’s syntax for map should be interpreted more like this:
IEnumerable<R> Map<A,R>(Func<A,R> f, params A x)
The semicolons are used to separate the params arguments, whereas the commas are used to seperate the elements of the result.
My initial thought is that it’s a little of both. Semicolons would have been more consistent than commas, but I feel like the f() syntax is a little closer to the notation I would expect in a math course, and felt a little more natural to me. 🙂
My take is that the authors are showing the meaning of the ML code given on the left-hand side with the mathematical syntax shown on the right-hand side. Using the equals symbol for this relationship is a bit unclear though.
Small typo… the C# should presumably return IEnumerable
Ah. Angle-bracket eating problem… Enumerable<R>
My guess about the notation is that the left part is supposed to be the ML construct to be explained, in terms of stuff (a notation) the reader already knows. Is the right-hand side perhaps in a different font?