Not so fabulous adventures in banking

I’ve banked with First Tech Credit Union since 1994, when they were the only bank that would give a non-resident Microsoft intern like me an account. I had nothing but good service from them for many years, but in the last couple of years things have gone really downhill (starting right around the time they merged with another credit union; probably not a coincidence).

The final straw was this evening; here’s a screenshot of their web site showing my balance on my home equity line of credit:

On July 10th I must have needed some cash for something, so I borrowed $1000 and my balance went from $0 to $1000.  Five days later I paid it back, and it said that my new balance was $10.52.  Now, this seems like exorbitant interest on a five-day loan, but whatever, as you can see, I paid the $10.52, and my balance went back to zero.

A couple months later I was assessed a late fee of $5. I did not notice this, since my balance was still listed on the web site as zero.

This afternoon I did a credit check and discovered that my credit score had gone down 111 points.  111 points!

I called up First Tech and they cheerfully explained to me that they had charged me $0.75 interest on the $10.52 of interest, and that since I had not paid it in 60 days, they reported my account as delinquent.

As you can see, they were good enough to waive the five dollar fee, and I transferred over the seventy-five cents. Hopefully they will remove the delinquency from my credit report, but we’ll see.

This is the third time this has happened in the last two years. The first time it was for 7 cents, and the second time it was for 17 cents. I informed them at the time that I would be taking my business elsewhere if there was a third time, and that was today.

Add to that a number of other recent issues — during the merger they somehow managed to lose about a year’s worth of my online bill payment records. A bank lost my banking records. At tax time. And I recently got what I think was a sales call from a representative that was so confusing, uninformed, vague and generally incompetent that I called them back because I genuinely believed I was probably being phished.

So, long story short: first, do not bank with First Tech. They cannot make a web site that shows you your balance, and they will mark an account as delinquent for a discrepancy that does not show up on the web site.  I’ve just had my credit ruined over the price of a pack of gum.

Second, I need recommendations for a credit union in the Seattle area that does not suck. Anyone have recommendations? Whack ’em down in the comments!


UPDATE: I had a meeting with my local branch manager, who was the first person in this saga to apologize to me for my inconvenience and clearly state that it was a bad idea to ruin people’s credit over the price of a cup of coffee. But the super amusing part was that he said that this happens all the time.

How? Well, the HELOC has an annual fee, and the fee is considered part of the loan, so you end up having to pay a few cents interest on it if you do not pre-pay it, but that interest does not show up in the balance on the web site.  So a great many people do not discover this until it goes to collections or shows up on their credit report.

Apparently, he told me, people getting their credit ruined because of a seven cent charge they didn’t know they had is bad for customer satisfaction metrics.

That’s some high quality Business Intelligence analysis right there.

What kind of clown town operation are these people running? Like I said, this has happened to me three times, over a period of several years, and apparently I am not alone. The only conclusion I can come up with is that the problem is not bad enough to bother fixing.

The branch manager asked me for advice, which seemed to me to be rather backwards, but, hey, I’ll answer the question. First Tech, the suggestions I gave your manager were:

  • Stop marking these accounts as delinquent. Put a WHERE clause in the SQL query that determines which accounts get flagged for collections or delinquency that filters out accounts that are delinquent by less than the price of, say, a Big Mac.
  • Fix your web site to show the actual balance in the “balance” column.
  • I was super irritated when I was on the phone getting this sorted out, but the mortgage specialist I spoke with was most interested in telling me that this was my fault, and that what they did was legal. But the problem is not that I think you’re criminals! The problem is that I think you’re incompetent clowns who are actively making my life worse. Staff who are asked to speak to irate customers should be trained on how to solve the real problem, which is retaining the customer’s business and good will.

I think those are reasonable suggestions that could be implemented in pretty short order, and any one of them would have mitigated the problem. Maybe do all three.


Anti-unification, part 6

Last time I gave a simple C# implementation of the first-order anti-unification algorithm. This is an interesting algorithm, but it’s maybe not so clear why anti-unification is useful at all. It’s pretty clear why unification is interesting: we have equations that need solving! But why would you ever need to have two (or more) trees that need anti-unifying to a more general tree?

Here’s a pair of code changes:

dog.drink();   —becomes—>   if (dog != null) dog.drink();
dog.bark();    —becomes—>   if (dog != null) dog.bark();

Now here’s the trick. Suppose we represent each of these code changes as a tree. The root has two children, “before” and “after”. The child of each side is the abstract syntax tree of the before and after code fragment, so our two trees are:

Now suppose we anti-unify those two trees; what would we get? We’d get this pattern:

dog.h0(); —> if (dog != null) dog.h0();

Take a look at that. We started with two code changes, anti-unified them, and now we have a template for making new code edits! We could take this template and write a tool that transforms all code of the form “an expression statement that calls a method on a variable called dog” into the form “an if statement that checks dog for null and calls the method if it is not null”.

What I’m getting at here is: if we have a pair of small, similar code edits, then we can use anti-unification to deduce a generalization of those two edits, in a form from which we could then build an automatic refactoring.
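To make that concrete, here is a minimal sketch; it is not the tool we actually built. It uses the textbook recursive formulation of anti-unification rather than the two-rule loop from the previous episode, and the names Node, Generalizer and NullCheckEdit are invented for this illustration. It assumes each edit is already a before/after tree and that trees can be compared by their printed form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal syntax tree: a label plus ordered children.
sealed class Node
{
    public string Label { get; }
    public Node[] Kids { get; }
    public Node(string label, params Node[] kids) { Label = label; Kids = kids; }
    public override string ToString() =>
        Kids.Length == 0
            ? Label
            : Label + "(" + string.Join(",", Kids.Select(k => k.ToString())) + ")";
}

// Recursive anti-unification: matching nodes are kept, mismatched pairs become holes.
sealed class Generalizer
{
    // Remembers which hole stands for each (s subtree, t subtree) pair,
    // so that a repeated mismatch reuses the same hole.
    private readonly Dictionary<(string, string), Node> holes =
        new Dictionary<(string, string), Node>();

    public Node Antiunify(Node s, Node t)
    {
        if (s.Label == t.Label && s.Kids.Length == t.Kids.Length)
            return new Node(s.Label,
                s.Kids.Zip(t.Kids, (a, b) => Antiunify(a, b)).ToArray());
        var key = (s.ToString(), t.ToString());
        if (!holes.TryGetValue(key, out var hole))
        {
            hole = new Node("h" + holes.Count);
            holes.Add(key, hole);
        }
        return hole;
    }
}

static class EditDemo
{
    // The edit "v.m();" becomes "if (v != null) v.m();" as a before/after tree.
    public static Node NullCheckEdit(string v, string m) =>
        new Node("edit",
            new Node("call", new Node(v), new Node(m)),
            new Node("if",
                new Node("!=", new Node(v), new Node("null")),
                new Node("call", new Node(v), new Node(m))));

    static void Main()
    {
        var pattern = new Generalizer().Antiunify(
            NullCheckEdit("dog", "drink"), NullCheckEdit("dog", "bark"));
        Console.WriteLine(pattern);
        // edit(call(dog,h0),if(!=(dog,null),call(dog,h0)))
    }
}
```

The dictionary is what makes both occurrences of the drink/bark mismatch share the single hole h0; without it we would get two unrelated holes and a less specific template.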

But what if we have three similar code edits?

edit 1: dog.drink(); —> if (dog != null) dog.drink();
edit 2: dog.bark();  —> if (dog != null) dog.bark();
edit 3: cat.meow();  —> if (cat != null) cat.meow();

Let’s take a look at the pairwise anti-unifications:

1 & 2: dog.h0(); —> if (dog != null) dog.h0();
1 & 3: h1.h0();  —> if (h1 != null) h1.h0();
2 & 3: the same.

Anti-unifying the first two makes a more specific pattern than any anti-unification involving the third. But the really interesting thing to notice here is that the anti-unification of 1&3 (which is the same as that of 2&3) is itself a generalization of the anti-unification of 1&2!

Maybe that is not 100% clear. Let’s put all the anti-unifications into a tree, where the more general “abstract” patterns are at the top, and the individual “concrete” edits are at the leaves:

Each parent node is the result of anti-unifying its children. This kind of tree, where the leaves are specific examples of a thing, and each non-leaf node is a generalization of everything below it, is called a dendrogram, and they are very useful when trying to visualize the output of a hierarchical clustering algorithm.
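As a rough sketch of how such a dendrogram could be built for the three edits above, the code below merges clusters by anti-unifying their patterns and prints the result with indentation. The Ast and Cluster types, the Merge and Leaf helpers, and the fixed merge order are all assumptions made for illustration; a real clustering algorithm has to decide which clusters to merge. Hole names are regenerated on every merge, which is good enough here, but a real tool would want globally unique holes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal syntax tree: a label plus ordered children.
sealed class Ast
{
    public string Label { get; }
    public Ast[] Kids { get; }
    public Ast(string label, params Ast[] kids) { Label = label; Kids = kids; }
    public override string ToString() =>
        Kids.Length == 0
            ? Label
            : Label + "(" + string.Join(",", Kids.Select(k => k.ToString())) + ")";
}

// A dendrogram node: a pattern plus the clusters that it generalizes.
sealed class Cluster
{
    public Ast Pattern { get; }
    public Cluster[] Members { get; }
    public Cluster(Ast pattern, params Cluster[] members)
    {
        Pattern = pattern;
        Members = members;
    }

    // Merging two clusters anti-unifies their patterns.
    public static Cluster Merge(Cluster a, Cluster b) =>
        new Cluster(
            Generalize(a.Pattern, b.Pattern, new Dictionary<(string, string), Ast>()),
            a, b);

    // Recursive anti-unification; each mismatched pair of subtrees gets one hole.
    static Ast Generalize(Ast s, Ast t, Dictionary<(string, string), Ast> holes)
    {
        if (s.Label == t.Label && s.Kids.Length == t.Kids.Length)
            return new Ast(s.Label,
                s.Kids.Zip(t.Kids, (x, y) => Generalize(x, y, holes)).ToArray());
        var key = (s.ToString(), t.ToString());
        if (!holes.TryGetValue(key, out var hole))
            holes[key] = hole = new Ast("h" + holes.Count);
        return hole;
    }

    // One line per node, indented by depth, most general pattern first.
    public IEnumerable<string> Lines(int depth = 0) =>
        new[] { new string(' ', 2 * depth) + Pattern }
            .Concat(Members.SelectMany(m => m.Lines(depth + 1)));
}

static class DendrogramDemo
{
    // A leaf cluster holding the edit "v.m();" becomes "if (v != null) v.m();".
    public static Cluster Leaf(string v, string m) => new Cluster(
        new Ast("edit",
            new Ast("call", new Ast(v), new Ast(m)),
            new Ast("if",
                new Ast("!=", new Ast(v), new Ast("null")),
                new Ast("call", new Ast(v), new Ast(m)))));

    static void Main()
    {
        var dogs = Cluster.Merge(Leaf("dog", "drink"), Leaf("dog", "bark"));
        var root = Cluster.Merge(dogs, Leaf("cat", "meow"));
        foreach (var line in root.Lines())
            Console.WriteLine(line);
    }
}
```

The first line printed is the root pattern, edit(call(h0,h1),if(!=(h0,null),call(h0,h1))), followed by the dog-only pattern and then the concrete edits at the deepest indentation.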

Now imagine that we took hundreds, or thousands, or hundreds of thousands of code edits, and somehow managed to work out a useful dendrogram of anti-unification steps for all of them. This is a computationally difficult problem, and in a future episode, I might describe some of the principled techniques and unprincipled hacks that you might try to make it computationally feasible. But just suppose for the moment we could.  Imagine what that dendrogram would look like.  At the root we’d have the most general anti-unification of our before-to-after pattern:

h0 —> h1

Which is plainly useless.  At the leaves, we’d have all of the hundreds of thousands of edits, which are not useful in themselves. But the nodes in the middle are pure gold! They are all the common patterns of code edits that get made in this code base, in a form that you could turn into a refactoring or automatic fix template. The higher in the tree they are, the more general they are.

You’ve probably deduced by now that this is not a mere flight of fancy; I spent eight months working on a tiny research team to explore the question of whether this sort of analysis is possible at the scale of a modern large software repository, and I am pleased to announce that indeed it is!

We started with a small corpus of code changes that were made in response to a static analysis tool (Infer) flagging Java code as possibly containing a null dereference, built tools to extract the portions of the AST which changed, and then did a clustering anti-unification on the corpus to find patterns. (How the AST extraction works is also very interesting; we use a variation on the Gumtree algorithm. I might do a blog series about that later.) It was quite delightful that the first things that popped out of the clustering algorithm were patterns like:

h0.h1(); —> if (h0 != null) h0.h1();
h0.h1(); —> if (h0 == null) return; h0.h1();
h0.h1(); —> if (h0 == null) throw …; h0.h1();
if (h0.h1()) h2; —> if (h0 != null && h0.h1()) h2;

and a dozen more variations. You might naively think that removing a null dereference is easy, but there are a great many ways to do it, and we found most of them in the first attempt.

I am super excited that this tool works at scale, and we are just scratching the surface of what we can do with it. Just a few thoughts:

  • Can it find patterns in bug fixes more complex than null-dereference fixes?
  • Imagine for example if you could query your code repository and ask “what was the most common novel code change pattern last month?” This could tell you if there was a large-scale code modification but the developer missed an example of it. Most static analysis tools are of the form “find code which fails to match a pattern”; this is a tool for finding new patterns and the AST operations that apply the pattern!
  • You could use it as a signal to determine if there are new bug fix patterns emerging in the code base, and use them to drive better developer education.
  • And many more; if you had such a tool, what would you do with it? Leave comments please!

The possibilities of this sort of “big code” analysis are fascinating, and I was very happy to have played a part in this investigation.

The team has recently written a public-facing post on Facebook’s coding blog describing the high-level architecture of our pipeline, with much better graphics and figures than I’ve thrown together here. Please check it out and let me know what you think.


I have a lot of people to thank: our team leader Satish, who knows everyone in the code analysis community, our management Erik and Joe who are willing to take big bets on unproven technology, my colleagues Andrew and Johannes, who hit the ground running and implemented some hard algorithms and beautiful visualizations in very little time, our interns Austin and Waruna, and last but certainly not least, the authors of the enormous stack of academic papers I had to read to figure out what combination of techniques might work at FB scale. I’ll put some links to some of those papers below.


Anti-unification, part 5

Last time we wrote all the boring boilerplate code for substitutions and trees. Now let’s implement the algorithm. As I noted a couple of episodes back, we can reduce the algorithm to repeated application of two rules that mutate three pieces of state: the current generalization, the current substitutions on s, and the current substitutions on t.

The function returns those three things, and they do not have any particular semantic connection to each other aside from being the solution to this problem, so let’s try returning them as a tuple.

This seems like a good place to try out nested functions in C# 7, since each rule is logically its own function, but also only useful in the context of the algorithm; there’s no real reason to make these private methods of the class since no other code calls them.  Also, they’re logically manipulating the local state of their containing function.

We’ll start by setting up the initial state as being the most general generalization:


public static (Tree, Substitutions, Substitutions)
  Antiunify(Tree s, Tree t)
{
  var h = MakeHole();
  var generalization = h;
  var sSubstitutions = Substitutions.Empty.Add(h, s);
  var tSubstitutions = Substitutions.Empty.Add(h, t);

Recall the first rule seeks situations where there is a substitution that is insufficiently specific. We want to go until no more rules apply, so we’ll have this return a Boolean indicating whether the rule was applied or not.

  bool RuleOne()
  {
    var holes = from subst in sSubstitutions
                let cs = subst.Value
                let ct = tSubstitutions[subst.Key]
                where cs.Kind == ct.Kind
                where cs.Value == ct.Value
                where cs.ChildCount == ct.ChildCount
                select subst.Key;
    var hole = holes.FirstOrDefault();
    if (hole == null)
      return false;
    var sTree = sSubstitutions[hole];
    var tTree = tSubstitutions[hole];
    sSubstitutions = sSubstitutions.Remove(hole);
    tSubstitutions = tSubstitutions.Remove(hole);
    var newHoles =
      sTree.Children.Select(c => MakeHole()).ToList();
    foreach (var (newHole, child) in newHoles.Zip(
        sTree.Children, (newHole, child) => (newHole, child)))
      sSubstitutions = sSubstitutions.Add(newHole, child);
    foreach (var (newHole, child) in newHoles.Zip(
        tTree.Children, (newHole, child) => (newHole, child)))
      tSubstitutions = tSubstitutions.Add(newHole, child);
    generalization = generalization.Substitute(
      hole, new Tree(sTree.Kind, sTree.Value, newHoles));
    return true;
  }

There is a small code smell in this style that shows up in the second rule below: tuples are value types, so when no pair of holes matches, the “default” result is (null, null), and checking a component of the tuple for null is how we detect that the rule does not apply.

Notice that we’re using tuples to iterate over two sequences of equal size via zip. The code seems inelegant to me in a subtle way. The fundamental issue here is that C# has always had mutable tuples ever since version 1.0; it just called them “argument lists”, and that’s weird. It has always struck me as bizarre that C# requires you to pass an argument tuple, but that it gives you no syntax for manipulating that tuple in any way other than extracting the arguments from it or mutating them. You cannot treat what is logically a tuple as a tuple; instead you have to write code that explicitly constructs a real tuple out of the logical tuple, and end up writing what looks like it ought to be an identity:

(newHole, child) => (newHole, child)

For that matter, why do we need to zip at all? In this particular example it would be nice if the tuple syntax carried over into foreach loops; imagine if instead of that ugly zip code we could just write

foreach (var newHole, var child in newHoles, sTree.Children)

Zipping is only necessary here because the language lacks the feature of treating tuples as values consistently across the language. I’m hoping there will be further improvements in this area in C# 8.

But I digress. We’ve implemented the first rule, and the second is even more straightforward. Here we are looking for redundant holes and removing them:

  bool RuleTwo()
  {
    var pairs =
      from s1 in sSubstitutions
      from s2 in sSubstitutions
      where s1.Key != s2.Key
      where s1.Value == s2.Value
      where tSubstitutions[s1.Key] == tSubstitutions[s2.Key]
      select (s1.Key, s2.Key);
    var (hole1, hole2) = pairs.FirstOrDefault();
    if (hole1 == null)
      return false;
    sSubstitutions = sSubstitutions.Remove(hole1);
    tSubstitutions = tSubstitutions.Remove(hole1);
    generalization = generalization.Substitute(hole1, hole2);
    return true;
  }

Quite fine. And now the outer loop of the algorithm is trivial. We keep applying rules until we are in a situation where neither applies.

  while (RuleOne() || RuleTwo())
  { /* do nothing */ }
  return (generalization, sSubstitutions, tSubstitutions);
}

It’s slightly distasteful to have RuleOne and RuleTwo useful for both their side effects and their values, but really their values are only being used for control flow, not for the value that was computed, so I feel OK about this.

Let’s try it out! Again we’ll make a couple of local helper functions:

static void Main()
{
  Tree Cons(params Tree[] children) =>
    new Tree("call", "cons", children);
  Tree Literal(string value) =>
    new Tree("literal", value);
  var one = Literal("1");
  var two = Literal("2");
  var three = Literal("3");
  var nil = Literal("nil");
  var s = Cons(Cons(one, two), Cons(Cons(one, two), nil));
  var t = Cons(three, Cons(three, nil));
  var (generalization, sSubstitutions, tSubstitutions) =
    Tree.Antiunify(s, t);
  Console.WriteLine(generalization);
  Console.WriteLine(sSubstitutions.LineSeparated());
  Console.WriteLine(tSubstitutions.LineSeparated());
}

And when we run it, we get the right answer:

cons(h1,cons(h1,nil))
cons(1,2)/h1
3/h1

Nice!

Next time on FAIC: Why is this useful?

Anti-unification, part 4

All right, let’s implement this thing. We’ll start with a few caveats:

  • In the previous post I worked an example on function calls; in this code, we’ll do the algorithm on syntax trees. Hopefully it is obvious that they’re equivalent.
  • As I prefer, I’ll work with immutable data structures whenever possible.
  • This code is intended to illustrate the concepts; there are numerous places where it could be made faster or more memory-efficient. Those are left as exercises.
  • There’s a small amount of boilerplate code because I want value equality on immutable trees. It’s irritating to write, but we’ll do it.
  • WordPress turns quotation marks into “smart quotes” automatically and I don’t remember how to turn it off. VEXING.

Let’s get through the boring code quickly in this episode, and then we can look in more detail at the algorithm proper in the next episode. As often is the case, if we get the boring boilerplate infrastructure right, then the algorithm reads very clearly.

I want to be able to make new, unique “holes”; a class that is purpose-built to count off numbers is useful for that. I’m unlikely to have two billion holes, so the fact that it wraps around is irrelevant; I could always swap it out for longs if I had to.

internal sealed class Counter
{
    private int count = 0;
    public int Next()
    {
        int current = count;
        count += 1;
        return current;
    }
}

Yes I know that ++ exists. I do not like that thing.

I originally thought that I’d make a “substitution” type that is logically a Tree, Tree tuple, but then I realized that the only time I use substitutions is when looking them up in a collection of substitutions. I’ll therefore just use an immutable dictionary from trees to trees as my collection of substitutions, and the key-value pair as my substitution. 

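// Namespaces used by the code in this episode.
using System;
using System.Collections.Generic;
using System.Linq;
using Substitutions =
  System.Collections.Immutable.ImmutableDictionary<Tree, Tree>;
internal static class Extensions
{
  // Prints one substitution per line, in the value/hole notation.
  public static string LineSeparated(this Substitutions s)
    => string.Join("\n",
      s.Select(kv => $"{kv.Value}/{kv.Key}"));
}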

All right. Let’s get through the boring parts of making an immutable syntax tree that has value equality. We’ll say that a tree is characterized by three things: it has a kind, it has a value, and it has any number of ordered children. We’ll store the children in an array but ensure that it is never exposed and hence never mutated.

internal sealed class Tree
{
  public string Kind { get; }
  public string Value { get; }
  private readonly Tree[] children;
  public IEnumerable<Tree> Children =>
    this.children.Select(c => c);
  public int ChildCount => this.children.Length;
  public Tree(string kind, string value, params Tree[] children)
    : this(kind, value, (IEnumerable<Tree>)children)
  { }
  public Tree(
    string kind, string value, IEnumerable<Tree> children)
  {
    this.Kind = kind;
    this.Value = value;
    this.children = children.ToArray();
  }
  public static bool operator ==(Tree a, Tree b) =>
    ReferenceEquals(a, b) || !(a is null) && a.Equals(b);
  public static bool operator !=(Tree a, Tree b) => !(a == b);
  public override bool Equals(object obj) =>
    obj is Tree t &&
      t.Kind == this.Kind &&
      t.Value == this.Value &&
      t.Children.SequenceEqual(this.Children);
  public override int GetHashCode() =>
    HashCode.Combine(this.Kind, this.Value,
      this.children.Aggregate(0, HashCode.Combine));

I am loving the “is” patterns but C# really needs a !is null or a is not null or something like that. This !(a is null) is ugly. Of course I cannot use a != null — do you see why? I’d have to use ReferenceEquals. There is an opportunity here for a more general feature of “match the negation of this pattern”.

Printing out trees is straightforward; we’ll just print them out in their function call form:

public override string ToString() =>
  this.ChildCount == 0 ?
    this.Value :
    $"{this.Value}({string.Join<Tree>(',', this.children)})";

Given a substitution, what is the tree after the substitution is applied?

public Tree Substitute(Tree original, Tree replacement) =>
  this == original ? 
    replacement : 
    new Tree(this.Kind, this.Value, this.Children.Select(
      e => e.Substitute(original, replacement)));

Easy peasy. Finally, I want a factory for holes:

private static readonly Counter counter = new Counter();
public static Tree MakeHole() =>
  new Tree("hole", $"h{counter.Next()}");

And that does it for the boring boilerplate code.

Next time on FAIC: let’s implement anti-unification for real.

Anti-unification, part 3

Last time we described the classic first-order anti-unification algorithm, and reduced it from three rules to only two. Let’s work an example, the same example that we gave a while back.

s is cons(cons(1, 2), cons(cons(1, 2), nil)) 
t is cons(3, cons(3, nil))

So our initial condition is:

g = h0
ss = { cons(cons(1, 2), cons(cons(1, 2), nil)) / h0 }
st = { cons(3, cons(3, nil)) / h0 }

Now we notice that rule 1 can be applied; there are two cons expressions both substituted for h0, so we move the cons into g and make the substitutions on the arguments to cons:

g = cons(h1, h2)
ss = { cons(1,2) / h1,  cons(cons(1, 2), nil) / h2 }
st = { 3 / h1, cons(3, nil) / h2 }

Super. Now we notice that rule 1 applies again: we have cons expressions both substituted for h2, so we move the cons into g:

g = cons(h1, cons(h3, h4))
ss = { cons(1, 2) / h1, cons(1, 2) / h3, nil / h4 }
st = { 3 / h1, 3 / h3, nil / h4 }

We are now in a situation where both rules apply.

Rule 1 applies because we can think of nil as being nil() — that is, a call that has exactly zero children. Thus there are two nil expressions both substituted for h4, so we can move the nil into g, and introduce zero new holes for the zero arguments.

Rule 2 applies because h1 and h3 are redundant.

One of the nice things about this algorithm is that it doesn’t matter what order you apply the rules in; you always make progress towards the eventual goal. Let’s apply rule 1:

g = cons(h1, cons(h3, nil))
ss = { cons(1, 2) / h1, cons(1, 2) / h3 }
st = { 3 / h1, 3 / h3 }

Rule 1 no longer applies, but rule 2 does. h1 and h3 are still redundant. Get rid of h1:

g = cons(h3, cons(h3, nil))
ss = { cons(1, 2) / h3 }
st = { 3 / h3 }

No more rules apply, and we’re done; we’ve successfully deduced that the most specific generalization of s and t is cons(h3, cons(h3, nil)) and given substitutions that produce s and t.

Next time on FAIC: Let’s implement it! And maybe we’ll take a look at a few new features of C# 7 while we’re at it.

Anti-unification, part 2

Last time on FAIC we learned what the first-order anti-unification problem is: given two expressions, s and t, either in the form of operators, or method calls, or syntax trees, doesn’t matter, find the most specific generalizing expression g, and substitutions for its holes that gives back s and t. Today I’ll sketch the algorithm for that in terms of function calls, and next time we’ll implement it on syntax trees.

There are papers describing the first-order anti-unification algorithm that go back to the 1970s, and they are surprisingly difficult to follow considering what a simple, straightforward algorithm it is. Rather than go through those papers in detail, here’s a sketch of the algorithm:

Basically the idea of the algorithm is that we start with the most general possible anti-unification, and then we gradually refine it by repeatedly applying rules that make it more specific and less redundant. When we can no longer improve things, we’re done.

So, our initial state is:

g = h0
ss = { s / h0 }
st = { t / h0 }

That is, our generalization is just a hole, and the substitutions which turn it back into s and t are just substituting each of those for the hole.

We now apply three rules that transform this state; we keep on applying rules until no more rules apply:

Rule 1: If ss contains f(s1, ... sn)/h0 and st contains f(t1, ... tn)/h0 then h0 is not specific enough.

  • Remove the substitution from ss and add substitutions s1/h1, ... sn/hn, where h1 through hn are new holes.
  • Remove the substitution from st and add substitutions t1/h1, ... tn/hn.
  • Replace all the h0 in g with f(h1, ... hn)

Rule 2: If ss contains foo/h0 and foo/h1, and st has bar/h0 and bar/h1, then h0 and h1 are redundant. (For arbitrary expressions foo and bar.)

  • Remove foo/h0 and bar/h0 from their substitution sets.
  • Replace h0 in g with h1.

Rule 3: If ss and st both contain foo/h0 then h0 is unnecessary.

  • Remove the substitution from both sets.
  • Replace all h0 in g with foo.

These rules are pretty straightforward, but if we squint a little, we can simplify these even more! Rule 3 is just an optimization of repeated applications of rule 1, provided that we consider an expression like “nil” to be equivalent to “nil()”. This should make sense; a value is logically the same as a function call that takes no arguments and always returns that value. So in practice, we can eliminate rule three and just apply rules one and two.

That’s it! Increase specificity and remove redundancy; keep doing that until you cannot do it any more, and you’re done. Easy peasy. The fact that this algorithm converges on the best possible solution, and the fact that you can apply the rules in any order and still get the right result, are facts that I’m not going to prove in this space.

Next time on FAIC: We’ll work an example out by hand, just to make sure that it’s all clear.

Anti-unification, part 1

Last time on FAIC we noted that there was a simple, recursive, linear-time first-order unification algorithm. We didn’t give the algorithm, but hopefully you see how it would go. The point is, we start with two expressions that contain some holes, and we deduce hole-free expressions that, when filled into the holes, cause the two expressions to become identical.

I got into this series of posts because I wanted to talk about anti-unification, and it is hard to say what anti-unification is if you don’t know what unification is. Anti-unification is kinda, sorta, but not exactly the opposite problem. The anti-unification problem is:

Given two input expressions (often, but not necessarily, without holes), call them s and t, find a generalizing expression g that does have holes, and two substitutions; the first substitution makes the generalizing expression equal to the first input, and the second substitution makes it equal to the second input.

Let’s look at an example. Suppose our inputs are

s is  (1 :: 2) :: (1 :: 2) :: nil
t is  3 :: 3 :: nil

If you don’t like the :: operator, we could equivalently think of these as function calls:

s is  cons(cons(1, 2), cons(cons(1, 2), nil))
t is  cons(3, cons(3, nil))

Or, if you prefer, as trees:

s is     cons             t is       cons
       /      \                     /    \
    cons      cons                 3     cons
   /    \     /   \                     /    \
  1      2  cons   nil                 3      nil
           /    \
          1      2

and the question is: what is an expression with holes such that there are substitutions for the holes that make both s and t? Plainly the generalizing expression is

cons(h0, cons(h0, nil))

and the substitutions are cons(1, 2) / h0 to make s and 3 / h0 to make t. (Recall that our standard notation for substitutions is to separate the value being substituted and the hole to substitute by a slash.)
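If you would like to check a claim like that mechanically, here is a minimal sketch that applies each substitution to the generalizing expression and prints the result. The Expr type, its Apply method, and the Cons and Atom helpers are hypothetical names invented for this illustration, and the sketch assumes that any leaf whose name appears in the substitution is a hole.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal expression type: a name plus ordered arguments.
sealed class Expr
{
    public string Name { get; }
    public Expr[] Args { get; }
    public Expr(string name, params Expr[] args) { Name = name; Args = args; }

    // Replace every hole named in the map with the expression substituted for it.
    public Expr Apply(IReadOnlyDictionary<string, Expr> substitution) =>
        Args.Length == 0 && substitution.TryGetValue(Name, out var replacement)
            ? replacement
            : new Expr(Name, Args.Select(a => a.Apply(substitution)).ToArray());

    public override string ToString() =>
        Args.Length == 0
            ? Name
            : Name + "(" + string.Join(", ", Args.Select(a => a.ToString())) + ")";
}

static class SubstitutionDemo
{
    public static Expr Cons(Expr head, Expr tail) => new Expr("cons", head, tail);
    public static Expr Atom(string name) => new Expr(name);

    static void Main()
    {
        // g is cons(h0, cons(h0, nil))
        var g = Cons(Atom("h0"), Cons(Atom("h0"), Atom("nil")));
        var forS = new Dictionary<string, Expr> { ["h0"] = Cons(Atom("1"), Atom("2")) };
        var forT = new Dictionary<string, Expr> { ["h0"] = Atom("3") };
        Console.WriteLine(g.Apply(forS)); // cons(cons(1, 2), cons(cons(1, 2), nil))
        Console.WriteLine(g.Apply(forT)); // cons(3, cons(3, nil))
    }
}
```

Both printed expressions match s and t as written above, which is exactly what it means for g and its two substitutions to be a solution.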

You might have noticed that the unification problem might have no solution; there might be no possible set of substitutions for all the variables that make the statements all true. But the anti-unification problem always has at least one solution: the generalized expression h0 always works, and of course the substitutions are s/h0 and t/h0.

Thus we need to make the anti-unification problem a little bit harder for it to be interesting: we want the most specific generalization, not just any generalization.

It was pretty clear that unification on equations could be generalized to multiple equations. Similarly, I hope it is clear that anti-unification on two expressions can be generalized to any number of expressions; we want the generalizing expression and a substitution for each input expression.

And of course just as there was higher-order unification for unification problems involving lambdas, and unification modulo theory for problems involving arithmetic or other mathematical ideas, there are similar variations on anti-unification. For our purposes we’ll consider only first-order anti-unification, which is the easy one.

Why is anti-unification useful? I’ll get into that in a later post. Now that we know what first-order anti-unification is, can we devise and implement an algorithm that takes in expressions and gives us the most specific anti-unifying expression?

Next time on FAIC: we’ll sketch out the algorithm.