# Anti-unification, part 2

Last time on FAIC we learned what the first-order anti-unification problem is: given two expressions `s` and `t` — whether in the form of operators, method calls, or syntax trees, it doesn’t matter — find the most specific generalizing expression `g`, and substitutions for its holes that give back `s` and `t`. Today I’ll sketch the algorithm for that in terms of function calls, and next time we’ll implement it on syntax trees.

There are papers describing the first-order anti-unification algorithm that go back to the 1970s, and they are surprisingly difficult to follow considering what a simple, straightforward algorithm it is. Rather than go through those papers in detail, here’s a sketch of the algorithm:

Basically the idea of the algorithm is that we start with the most general possible anti-unification, and then we gradually refine it by repeatedly applying rules that make it more specific and less redundant. When we can no longer improve things, we’re done.

So, our initial state is:

```
g = h0
ss = { s / h0 }
st = { t / h0 }
```

That is, our generalization is just a hole, and the substitutions which turn it back into `s` and `t` are just substituting each of those for the hole.

We now apply three rules that transform this state; we keep on applying rules until no more rules apply:

Rule 1: If `ss` contains `f(s1, ... sn)/h0` and `st` contains `f(t1, ... tn)/h0` then `h0` is not specific enough.

• Remove the substitution from `ss` and add substitutions `s1/h1, ... sn/hn`.
• Remove the substitution from `st` and add `t1/h1, ... tn/hn`.
• Replace all the `h0` in `g` with `f(h1, ... hn)`.

Rule 2: If `ss` contains `foo/h0` and `foo/h1`, and `st` has `bar/h0` and `bar/h1`, then `h0` and `h1` are redundant. (For arbitrary expressions foo and bar.)

• Remove `foo/h0` and `bar/h0` from their substitution sets.
• Replace `h0` in `g` with `h1`.

Rule 3: If `ss` and `st` both contain `foo/h0` then `h0` is unnecessary.

• Remove the substitution from both sets.
• Replace all `h0` in `g` with `foo`.

These rules are pretty straightforward, but if we squint a little, we can simplify these even more! Rule 3 is just an optimization of repeated applications of rule 1, provided that we consider an expression like “nil” to be equivalent to “nil()”. This should make sense; a value is logically the same as a function call that takes no arguments and always returns that value. So in practice, we can eliminate rule three and just apply rules one and two.
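To make the rules concrete, here’s a minimal sketch in Python — not the syntax-tree implementation we’ll build next time, just an illustration. I’m assuming expressions are modeled as nested tuples whose first element is the operator name, with a value like `nil` encoded as the nullary call `("nil",)`, and holes as invented strings `"h0"`, `"h1"`, and so on. Rather than literally rewriting a `(g, ss, st)` state, this version applies rule one by recursion and rule two by reusing a hole whenever the same pair of subtrees comes up again; rule three then falls out for free, exactly as described above.

```python
# Sketch of first-order anti-unification over tuple-encoded expressions.
# An expression is ("operator", arg1, arg2, ...); a hole is a bare string.

def anti_unify(s, t):
    holes = {}              # (s-subtree, t-subtree) -> hole name (rule 2)
    subs_s, subs_t = {}, {}

    def go(s, t):
        # Rule 1: same operator, same arity -> recurse into the children.
        if s[0] == t[0] and len(s) == len(t):
            return (s[0],) + tuple(go(a, b) for a, b in zip(s[1:], t[1:]))
        # Otherwise we need a hole; reuse one for a repeated pair (rule 2).
        if (s, t) not in holes:
            h = f"h{len(holes)}"
            holes[(s, t)] = h
            subs_s[h], subs_t[h] = s, t
        return holes[(s, t)]

    return go(s, t), subs_s, subs_t
```

Running it on the example from last time — `s` is `cons(cons(1, 2), cons(cons(1, 2), nil))` and `t` is `cons(3, cons(3, nil))` — produces the generalization `cons(h0, cons(h0, nil))` with the substitutions `cons(1, 2) / h0` and `3 / h0`.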

That’s it! Increase specificity and remove redundancy; keep doing that until you cannot do it any more, and you’re done. Easy peasy. That this algorithm converges on the best possible solution, and that you can apply the rules in any order and still get the right result, are facts I’m not going to prove in this space.

Next time on FAIC: We’ll work an example out by hand, just to make sure that it’s all clear.

# Anti-unification, part 1

Last time on FAIC we noted that there was a simple, recursive, linear-time first-order unification algorithm. We didn’t give the algorithm, but hopefully you see how it would go. The point is, we start with two expressions that contain some holes, and we deduce hole-free expressions that, when filled into the holes, cause the two expressions to become identical.

I got into this series of posts because I wanted to talk about anti-unification, and it is hard to say what anti-unification is if you don’t know what unification is. Anti-unification is kinda, sorta, but not exactly the opposite problem. The anti-unification problem is:

Given two input expressions (often, but not necessarily, without holes), call them `s` and `t`, find a generalizing expression `g` that does have holes, and two substitutions; the first substitution makes the result expression equal to the first input, and the second substitution makes it equal to the second input.

Let’s look at an example. Suppose our inputs are

```
s is  (1 :: 2) :: (1 :: 2) :: nil
t is  3 :: 3 :: nil
```

If you don’t like the `::` operator, we could equivalently think of these as function calls:

```
s is  cons(cons(1, 2), cons(cons(1, 2), nil))
t is  cons(3, cons(3, nil))
```

Or, if you prefer, as trees:

```
s is        cons                 t is      cons
           /    \                         /    \
        cons     cons                    3     cons
        /  \     /   \                        /    \
       1    2  cons   nil                    3      nil
               /  \
              1    2
```

and the question is: what is an expression with holes such that there are substitutions for the holes that produce both `s` and `t`? Plainly the generalizing expression is

`cons(h0, cons(h0, nil))`

and the substitutions are `cons(1, 2) / h0` to make `s` and `3 / h0` to make `t`. (Recall that our standard notation for substitutions is to separate the value being substituted and the hole to substitute by a slash.)
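We can check that mechanically. Here’s a small sketch in Python — assuming, purely for illustration, that expressions are encoded as nested tuples whose first element is the operator name, with holes as bare strings like `"h0"` — of applying a substitution to a generalization:

```python
# Apply a substitution (a dict mapping hole names to expressions) to an
# expression tree. A hole is a bare string; a call is ("op", arg1, ...).

def substitute(expr, subs):
    if isinstance(expr, str):
        # A hole: replace it if the substitution maps it.
        return subs.get(expr, expr)
    # A call: rebuild it, substituting into each argument.
    return (expr[0],) + tuple(substitute(a, subs) for a in expr[1:])

g = ("cons", "h0", ("cons", "h0", ("nil",)))
s = substitute(g, {"h0": ("cons", ("1",), ("2",))})   # back to s
t = substitute(g, {"h0": ("3",)})                     # back to t
```

Substituting `cons(1, 2)` for `h0` recovers `s`, and substituting `3` recovers `t`, just as claimed.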

You might have noticed that the unification problem might have no solution; there might be no possible set of substitutions for all the variables that make the statements all true. But the anti-unification problem always has at least one solution: the generalized expression `h0` always works, and of course the substitutions are `s/h0` and `t/h0`.

Thus we need to make the anti-unification problem a little bit harder for it to be interesting: we want the most specific generalization, not just any generalization.

It was pretty clear that unification on equations could be generalized to multiple equations. Similarly, I hope it is clear that anti-unification on two expressions can be generalized to any number of expressions; we want the generalizing expression and a substitution for each input expression.

And of course just as there was higher-order unification for unification problems involving lambdas, and unification modulo theory for problems involving arithmetic or other mathematical ideas, there are similar variations on anti-unification. For our purposes we’ll consider only first-order anti-unification, which is the easy one.

Why is anti-unification useful? I’ll get into that in a later post. Now that we know what first-order anti-unification is, can we devise and implement an algorithm that takes in expressions and gives us the most specific anti-unifying expression?

Next time on FAIC: we’ll sketch out the algorithm.

# Unification, part 2

Last time on FAIC I defined unification as a fancy word for taking a system of statements that contain placeholder variables and finding values for those variables which make the statements all true. We’ve seen that we can do that with mathematical statements (also known as “equations”) and that the type unification algorithm described in Milner’s paper does so for type assumptions. Last time I posed the question: what about arbitrary expressions in a programming language?

Sure, why not? Let’s look at an example typical to functional languages.

In many functional languages we have the binary operator `head :: tail`, which takes two values; the head of a list, say, a number, and the tail, say, a list of numbers, possibly empty. The result is a new list of numbers. `nil` is an expression that is used for the empty list. Suppose we make this equation

`h0 :: h0 :: nil`  =  `2 :: h1`

Again, `h0` and `h1` are variables in the mathematical sense, not in the sense of elements of a programming language, and similarly, the equality is logical equality, not an operator in any programming language.  From now on I’m going to label these “holes” with the letter h and a number to disambiguate them, so that it is clear when I am talking about a hole that needs something filled into it.

What we’re looking for here is “what expressions could we put into `h0` and `h1` that would make these two program fragments mean the same thing?” The only solution is to substitute `2` for `h0` and `2 :: nil` for `h1`, which gives us `2 :: 2 :: nil` on both sides.

Maybe the answer was immediately obvious to you, and maybe it was not, but regardless it is not 100% clear whether there is a straightforward algorithm that gives you values for the holes when presented with such a unification problem.

The syntax might be throwing us off a bit, and it’s not very general. What if instead of an operator `::` we just called a function like we do in C#, a function that took a head and a tail? Such a function is traditionally called “cons”, because it constructs a list:

`cons(h0, cons(h0, nil))` = `cons(2, h1)`

Again, remember that we are looking for expressions that we can sub in for the holes that make the equality true.

In this form it is a bit easier to see the algorithm. The first argument to `cons` on the left is `h0`, the first argument on the right is `2`, so `h0` must be `2`, and perhaps you see how it goes from there.

We could also model this as a parse tree.

```
      cons                           cons
     /    \                         /    \
   h0      cons          =        2       h1
           /   \
         h0     nil
```

I find that with the tree diagram it becomes much easier to see what the solution is. I’m not going to go through the whole algorithm here, but maybe you see how it goes. Basically, we recurse through the tree, solving unification problems on the children, and then we check the results to see if there were any contradictions.

This algorithm that I’ve failed to adequately describe is called the first-order unification algorithm, and it can solve any simple unification problem in linear time; it produces a substitution if one exists or fails if it does not.
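Here’s a rough sketch of that algorithm in Python, assuming the same illustrative encoding as before: expressions are nested tuples whose first element is the operator name, like `("cons", "h0", ("nil",))`, with holes as bare strings. This naive version omits the occurs check and other refinements a production implementation would want, and note that a returned substitution may itself contain holes that are bound elsewhere in the same substitution:

```python
# Naive first-order unification over tuple-encoded expressions.
# Returns a substitution dict, or None if unification fails.

def unify(a, b, subs=None):
    subs = {} if subs is None else subs

    def resolve(e):
        # Chase a hole through the substitutions found so far.
        while isinstance(e, str) and e in subs:
            e = subs[e]
        return e

    a, b = resolve(a), resolve(b)
    if a == b:
        return subs
    if isinstance(a, str):            # a is an unbound hole; bind it
        subs[a] = b
        return subs
    if isinstance(b, str):            # b is an unbound hole; bind it
        subs[b] = a
        return subs
    if a[0] != b[0] or len(a) != len(b):
        return None                   # operator mismatch: contradiction
    for x, y in zip(a[1:], b[1:]):    # recurse on the children
        if unify(x, y, subs) is None:
            return None
    return subs
```

Running it on `cons(h0, cons(h0, nil)) = cons(2, h1)` binds `h0` to `2` and `h1` to `cons(h0, nil)`, which is `cons(2, nil)` once `h0` is resolved; and, as discussed below, it correctly fails on problems that need facts about arithmetic.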

What do I mean by “simple” unification problem? Well, notice that this simple algorithm would fail if given the problem

`add(h0, 10)` = `add(13, 1)`

Because `h0` would unify with `13`, but `10` would not unify with `1`, and we’d get a unification failure, even though obviously there is a value for `h0` that makes the equation true. Unification algorithms that understand additional facts about arithmetic or other domains are “unification modulo theory”.

Similarly, unification algorithms that involve the lambda abstraction are “higher order unification” algorithms. The higher-order unification problem is in general not solvable, but that’s a topic for another day.

Similarly, it is I hope easy to see how unifying one equation generalizes to multiple equations; you just unify on each equation, and then check to see if you deduced contradictions, and then combine the solutions together.

All of this was meant to introduce what I really want to talk about, which is anti-unification. If unification is the process of finding values of variables that induce certain statements to be all true, what on earth could the opposite of that be?

Next time on FAIC: We’ll find out.

# Unification, part 1

My long and not-yet-finished series where I did a line-by-line exegesis of the seminal paper on ML type inference ended with a discussion of the type unification algorithm. In reading back over that episode I realize to my horror that I never defined “unification”! And this in a series intended to demystify the jargon inherent in papers about type systems. My bad. Apologies for that, and let’s fix that problem right now.

“Unification” is just a fancy word for “we have a set of statements that contain placeholder variables; find values for those variables that make every statement true”. That is to say, unification is a generalization of the problem of “solve this equation for x”.

The classic example from high school algebra is the system of linear equations, where we have something like

```
X + Y = 10
X - Y = 2
```

(As I might have noted before, we call these placeholders “variables”, though they bear only a small similarity to what we call “variables” in C# — that is, typed, mutable storage locations. As we did in the previous series, where we discussed “free” and “bound” type variables, I’ll continue to use “variable” in this traditional mathematical sense of a placeholder for a value to be supplied later via substitution, and not in the C# programming sense.)

The unification problem is to find values for X and Y such that the statements are both true; in this case if we substitute 6 for X and 4 for Y, both statements are true. Moreover, this is the unique such substitution; of course there are systems of equations where there are multiple substitutions, infinitely many substitutions, or no substitutions at all; in the latter case we say that unification fails.

A unification algorithm is one which takes in a set of statements with variables and returns substitutions for the variables that make every statement true. As we know, the algorithm for unifying systems of linear equations on real numbers is straightforward, but as we make those equations more complicated — by restricting them to integers, by making them non-linear, and so on — the algorithms get more and more complicated.

In our type system example, the statements which we wish to be true are called “assumptions”. An assumption can contain a type variable, which we denoted with a Greek letter. Our type inference algorithm’s goal was to use type unification to find a set of substitutions for those variables which makes the assumptions all true. An example that I gave was an assumption:

`plus : int -> int -> int`

And we have an expression we wish to assign a type to:

`λx.(plus x x)`

We add to our assumptions an assumption that contains a variable:

`x:β`

And then we run our type inference algorithm on the body of the lambda. It returns a substitution — that β is int, and therefore so is x, and now we have enough information to deduce that our expression is a function from int to int.
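To make that deduction concrete, here’s a toy sketch in Python — emphatically not Milner’s algorithm, just the single unification step this example needs. I’m assuming, for illustration only, that types are encoded as tuples `("int",)` and `("fun", domain, codomain)`, with type variables as bare strings like `"beta"`:

```python
# Toy illustration of the deduction above. Types are ("int",) or
# ("fun", domain, codomain); a type variable is a string like "beta".

# plus : int -> int -> int, curried.
plus_type = ("fun", ("int",), ("fun", ("int",), ("int",)))

def apply_type(fn, arg, subs):
    # To apply a function type, unify the argument type with the domain.
    # In this toy the domain is always concrete, so an unbound type
    # variable argument simply binds to it.
    _, dom, cod = fn
    if isinstance(arg, str):
        subs[arg] = dom
        return cod
    if arg != dom:
        raise TypeError("cannot unify argument type with domain")
    return cod

subs = {}
t_px = apply_type(plus_type, "beta", subs)    # plus x : binds beta to int
t_pxx = apply_type(t_px, subs["beta"], subs)  # plus x x : int
lambda_type = ("fun", subs["beta"], t_pxx)    # λx.(plus x x) : int -> int
```

The unification step binds `beta` to `int`, and from there the type of the whole lambda is `int -> int`, just as the inference algorithm concludes.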

Here’s an interesting question. We’ve seen how to unify systems of linear equations, and we’ve seen how to unify types. Can we unify arbitrary expressions in a programming language?

Next time on FAIC: Yes, yes we can.

# That was a long break

Well that was a much longer break than I intended! In fact I did not intend any break at all.

As I’m sure you’ve deduced, I batched up a whole long series of blog articles when things were relatively calm, knowing that things were about to get busy. I posted them gradually, and intended to finish up those series when things calmed down a little again, and then I stayed super busy!

I’m still super busy (and having a great time doing interesting work) but a large number of people have asked me recently when I’m going to start blogging again. And I do miss it. So I figure that I’ll try to post more often than I have been, but not necessarily more regularly.

Though I do hope to someday finish those long series on Z-machines and type inference that I started way back when, it’s been a very busy year and I have learned a tremendous amount that I’d love to share with you all. Thus I think I’ll start with some posts that are more like the classic days of Fabulous Adventures In Coding, where I’ll talk about some of the data structures and algorithms I’ve had to learn about and implement recently.

Next time on FAIC: My previous long series of blog posts discussed type unification in functional languages, but horrors, I never actually defined the word “unification”. And what on earth is anti-unification? We’ll find out!