Excessive explanation, part seven

From the paper we’re studying:

For simplicity, our definitions and results here are formulated for a 
skeletal language, since their extension to ML is a routine matter. 
For example recursion is omitted since it can be introduced by simply
adding the polymorphic fixed-point operator

fix  : ∀α ((α → α) → α)

The high order bit is that ML has a lot of features, but that’s OK. We can come up with a greatly simplified core language, work out a type inference system for it, and then extend that easily to the whole language. But what is this about recursion? What is a polymorphic fixed-point operator?

What we want to do is establish that the type assignment algorithm does not have to understand recursion; we can in fact remove recursion from the core language entirely, and not change the program behaviour or the type analysis. Let’s see how.

First of all, a “fixed point” of a function f is a value x such that f x equals x. For example:

let f x = x      (* Every possible value is a fixed point *)
let f x = x * x  (* 0 and 1 are the only fixed points *)
let f x = x + 1  (* There are no fixed points *)

OK. Now, consider this crazy-looking but clearly not-recursive function, which looks suspiciously like the recursive factorial function:

let crazy f n = if n < 2 then 1 else n * f (n - 1)

What is the type scheme of this thing? Let’s reason informally.

Parameter n must be an integer because it is compared to an integer.
n-1 is an integer which is passed to f, so f must be a function that takes an integer.
The result of f is multiplied by an integer, so f must return an integer.
crazy is a function of two arguments, which means that it is a function of one argument that returns a function of one argument.

In short, the type assignment algorithm must deduce:

crazy : (int → int) → (int → int)

Make sure that makes sense. This thing takes an int → int called f and returns an int → int whose argument is n. That’s just how we express a function of two arguments in this system.

The key question: We have a function crazy that, when considered as a function of one argument, returns the same kind of thing as its argument. Therefore there could be a fixed point. Is there a fixed point of this function?

This thing takes and returns an int → int. So a fixed point of crazy, if it exists, must be an int → int. Let’s suppose it exists, and call it, oh, I don’t know, let’s just pick any old name: factorial.

A fixed point is defined by the equality of two things, but in this case the two things are functions. How do we know that two functions factorial and crazy factorial are equal? That’s straightforward. For any number n we must have that crazy factorial n and factorial n being equal. That’s how we know that crazy factorial and factorial are equal, and therefore factorial is a fixed point of crazy.

Let’s try it and see. I’m going to write a recursive function factorial:

letrec factorial n = if n < 2 then 1 else n * factorial (n - 1)

Now as an exercise, mentally work out some examples and you will soon see that crazy factorial n and factorial n are always equal for any integer n. Therefore factorial is a fixed point of crazy.

Now, let us suppose that magically we have a function fix that can take an α → α for any α and returns a fixed point of that function, if it exists. (The fixed point will of course be of type α.) Now suppose we have this program which we wish to analyze:

letrec factorial n = if n < 2 then 1 else n * factorial (n - 1)

We can trivially translate that into this program:

let factorial = 
  let crazy f n = 
    if n < 2 then 1 else n * f (n - 1) in
  fix crazy

And hey, we have suddenly removed recursion from the language without changing the types of anything! Again, let’s reason informally about the type of this implementation of factorial:

We have already argued that crazy has type scheme
crazy : (int → int) → (int → int)
By assumption we have a polymorphic magic function that takes a function and returns its fixed point: fix : ∀α ((α → α) → α)
In particular this is true for the alpha that is (int → int), and therefore there exists a function fix : ((int → int) → (int → int)) → (int → int)
Since factorial is just fix crazy we deduce that factorial : int → int

And we’re done; we have determined that factorial is a function from int to int without having to analyze any recursion.

The point here is: we never need to write a type analyzer that understands recursion any more deeply than simply rewriting a recursive function into a crazy non-recursive function with an extra argument on top that is a function, and then applying the fix-point combinator to that crazy non-recursive function.

I said that fix was magical. We already know that there are functions that have no fixed point, some have multiple fixed points, and some have a single fixed point. I’m going to leave as an exercise for the reader that the following code is a good enough fixed point combinator to take crazy as an argument and produce a factorial function as its result. I’ll annotate everything with its type; that should help:

(* f is (α → β) → (α → β) *)
(* fix is ((α → β) → (α → β)) → (α → β) *)
let fix f =
 (* r is ρ, a recursively defined type: ρ → α → β *)
 (* (r r) is α → β *)
 (* f (r r) is α → β *)
 (* a is α *)
 (* (f (r r)) a is β *)
 (* bizarre is also ρ *)
 let bizarre r a = (f (r r)) a in 
 (* bizarre bizarre is (α → β) *)
 bizarre bizarre

This implementation of fix doesn’t actually type check in OCaml, by the way — because of that recursive type — but there are ways to get around that. We’ve digressed enough here though. Moving on. The paper said that we don’t need to worry about recursion, and now we know why. Thus far our examples map and factorial have been recursive and included an if-then. Do we need to worry about type analysis of if-then? Continuing with the paper:

and likewise for conditional expressions.

Apparently not. We can remove “if then else” from ML as well without changing the type analysis. How? Suppose again we have:

let crazy f n = if n < 2 then 1 else n * f (n - 1)

Now add a magical function:

ifthenelse : ∀α bool → (dummy → α) → (dummy → α) → α

This magical function takes a bool and two functions, and magically invokes the first function if the bool is true, and the second function if the bool is false, and regardless of which one it invokes, it returns the result. The dummy parameter can be ignored; it’s only there because ML requires that all functions take one argument.

Imagine in C# for instance that there was no if-then-else and no ?: operator. If you had a function:

static T Conditional<T>(bool condition, Func<T> consequence, Func<T> alternative)

that did the right thing then you could implement

    x = b ? c : a;

    x = Conditional(b, ()=>c, ()=>a);

and the type inference engine would produce the same result.

And now we can make two nested functions “consequence” and “alternative” that each take one dummy parameter:

let factorial =
  let crazy f n = 
    let consequence dummy = 1 in 
    let alternative dummy = n * f (n - 1) in
    ifthenelse (n < 2) consequence alternative in
  fix crazy

and the type analysis of crazy will be just the same as before. We don’t actually need if ... then ... else in the type analyzer; it’s an unnecessary syntactic sugar.

The language we analyze will have neither recursion nor conditionals, but now you can see how easy it is to add them to the type analyzer; they’re both equivalent to special function patterns from the point of view of the type analyzer, so all we need to do is say that the rule for analyzing recursion or conditionals is to transform the program into a form with neither, and do the type analysis of the transformed program.

And with that we’ve gotten through the introduction! Next time — next year — we’ll dig into part two: the language.

See you then for more fabulous adventures!

4 thoughts on “Excessive explanation, part seven”

petermilley on December 22, 2016 at 10:41 am said:

In an odd coincidence that you’ve been blogging about the fixed point combinator; I’ve been trying to bend my head lately around some of it’s weirder applications and cousins, like the Reverse State Monad.

Reply ↓
- ericlippert on December 22, 2016 at 11:37 am said:
  
  Non-strictness is weird, man. I too have yet to wrap my head around that one.
  
  Reply ↓
  - Rhialto on December 22, 2016 at 12:46 pm said:
    
    Oh, that is just an effect of the leftmost outermost reduction stategy that many functional languages employ. Adding strictness in that context is the complicated thing 🙂
    
    Reply ↓
David Conrad on October 8, 2018 at 7:28 am said:

I don’t know how you managed to get through that discussion of replacing conditionals with functions without mentioning Smalltalk. 😉

This series is great, and I really hope you’ll be able to find some time for blogging again soon.

Reply ↓