Excessive explanation, part one

Well that was a long break! When I was studying for my Facebook interviews I realized that I was going to have no time to write in my blog if I got hired, so I wrote several dozen articles, batched them all up, and figured I would have plenty of time to write more by the time May rolled around. And then I did not! I have been crazy busy learning OCaml, learning Hack, and learning my way around Facebook systems. It’s been a huge amount of fun.

I would like to finish this series on designing and implementing a Z-machine in OCaml eventually, but I want to take a break from that for a bit and talk about another OCaml-related topic.

I frequently hear from readers and coworkers who don’t have a formal computer science theory background that they would love to learn more about programming languages, type systems, and so on, by reading papers. But that the academic papers are written in such a dense jargon that they bog down after the first couple of pages and can’t make further progress. That’s unfortunate. I thought what I might do is pick a short, seminal paper and go through it in excruciating detail over many episodes.

In my series on monads I gave particular emphasis to understanding monads via concepts familiar to professional C# programmers. I’m going to do the same in this series.

So without further ado…


Principal type-schemes for functional programs

Luis Damas † and Robin Milner
† The work of this author is supported by the Portuguese Instituto
Nacional de Investigacao Cientifica

First published in POPL ’82: Proceedings of the 9th ACM SIGPLAN-SIGACT
symposium on Principles of programming languages, ACM, pp. 207–212

Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct
commercial advantage, the ACM copyright notice and the title of its
publication and date appear, and notice is given that copying is by
permission of the Association for Computing Machinery. To copy
otherwise, or to republish, requires a fee and/or specific permission.
© 1982 ACM 0-89791-065-6/82/001/0207 $00.75

This is Damas and Milner’s seminal paper on how to do type inference in ML, the language that OCaml is based on.

We’ll more clearly define “type-scheme” later on, but briefly, by “type-scheme” we basically we mean what a C# programmer would mean by “method signature”. The goal here is to deduce the type signature of a method from its body. Of course, C# requires that you explicitly state the type of a method, and requires that you convert a lambda to a delegate of the appropriate type. In ML, these types are deduced, not manifest in the program text.

By “functional programs” we mean programs written in a functional style. Functional style emphasizes two things: immutability and treating functions as data. As we’ll see, the tiny little version of ML that this paper considers could be thought of as the “core” of a great many more complex functional languages.

Next time: The introduction!

Does not compute

One of the most basic ways to think about a computer program is that it is a device which takes in integers as inputs and spits out integers as outputs. The C# compiler, for example, takes in source code strings, and those source code strings are essentially nothing more than enormous binary numbers. The output of the compiler is either diagnostic text, or strings of IL and metadata, which are also just enormous binary numbers.  Because the compiler is not perfect, in some rare cases it terminates abnormally with an internal error message. But those fatal error messages are also just big binary numbers. So let’s take this as our basic model of a computer program: a computer program is a device that either (1) runs forever without producing output, or (2) computes a function that maps one integer to another.
Continue reading

Never say never, part two

This is part two of a two-part series about determining whether the endpoint of a method is never reachable. Part one is here. A follow-up article is here.


Whether we have a “never” return type or not, we need to be able to determine when the end point of a method is unreachable for error reporting in methods that have non-void return type. The compiler is pretty clever about working that out; it can handle situations like

int M()
{
  try
  {
    while(true) N();
  }
  catch(Exception ex)
  {
    throw new WrappingException(ex);
  }
}

The compiler knows that N either throws or it doesn’t, and that if it doesn’t, then the try block never exits, and if it does, then either the construction of the exception throws, or the construction succeeds and the catch throws the new exception. No matter what, the end point of M is never reached.

However, the compiler is not infinitely clever. It is easy to fool it:
Continue reading