The names of birds, part 2

Reader “Joel” had an insightful comment on the first part of this series which I thought deserved a short episode of its own. Recall that we proved the theorem “if a compositional forest contains a mockingbird then every bird in the forest has a fixpoint“. An equivalent way to state that theorem is: “if a forest contains a bird without a fixpoint then either the forest is not compositional or it contains no mockingbird“. Joel’s question was whether there is a more “intuitive” way to understand why the mockingbird is so interesting that the presence of a bird without a fixpoint should prevent a mockingbird from existing in the same (compositional) forest.

The short version is: we’ll say that two birds agree on an input if their outputs are the same for that input. The mockingbird is a bird that agrees with every other bird on at least one input. A bird without a fixpoint is a bird that can always produce a different output than its input. Their composition is a bird which disagrees with every bird in the forest on one input and therefore is not in the forest at all.

The long version is:

I find it easiest to get an intuition for a general rule by looking at some simple examples.

The first thing I’d note is that it is quite easy to construct a non-compositional forest containing a bird without a fixpoint. A simple example would be a non-compositional forest containing exactly two birds, A and M. Define them such that:

AA = M
AM = A
MA = M
MM = M

Plainly these are two different birds, since their responses disagree on input M. Plainly M is a mockingbird, since Mx=xx for all x in the forest. And plainly A has no fixpoint. This forest is not compositional because there is no bird C in the forest such that Cx=A(Mx) for all x. With such a small number of birds in the forest it seems quite reasonable that one might have no fixpoint.

Let’s look at a more interesting example, this one in a compositional forest. Let’s suppose there is a countable infinity of birds in the forest. By “countable infinity” I mean that we can number each bird with a natural number; we’ve got bird 0, bird 1, bird 2, and so on. In fact, let’s just use natural numbers as the names of the birds in this forest.

Since each bird needs a response to hearing the name of every other bird in the forest, we can imagine an infinite lookup table that says what each bird responds when it hears the name of another bird in the forest. We’ll put the “called to” bird on the vertical axis and the “name called” on the horizontal axis:

      0 1 2 3 ...
    ------------
  0 | 9 5 2 5 ...
  1 | 8 3 6 7 ...
  2 | 0 1 2 3 ...
  3 | 3 1 4 1 ...
... |
  S | 1 2 3 4 ...
... |

So in our notation 0 1=5, 2 3=3, and so on.

Let’s suppose this forest is compositional and contains a bird, call it S, which has no fixpoint. We could choose any such S, but without loss of generality we shall pick a specific S, the bird where S 0=1, S 1=2, S 2=3, S 3=4, … and so on; the “successor” bird. Of course S has some “number” name but we don’t really care what it is, so we’ll just “alias” that one as S.

Now let’s suppose there’s a mockingbird in the forest and deduce a contradiction. The mockingbird tells us what is on the highlighted diagonal! That is, M 0=9, M 1=3, M 2=2, M 3=1, and so on. (As with S, M is an alias for some “number” name but we don’t know or care what it is.)

Now comes the fun bit. The forest is compositional, so therefore there exists a bird C such that Cx=S(Mx) for all x. We do care what the number name of C is. So, which is it?

C 0=10, and therefore C is not 0, because 0 0=9. But similarly, C 1=4, so C is not 1. And similarly C is not 2, 3, 4, 5, … because C definitely disagrees with each of those functions I mean birds at the diagonal. But if C has no number name, then it is not in the forest at all, contradicting the statement that the forest is compositional. We now have deduced a contradiction, and therefore our assumption that M is in the forest is shown to be false.

By looking at a specific example — a countable forest with a successor bird — perhaps you have a better intuition about why Mx=xx and any bird S that has no fixpoint cannot coexist in a compositional forest: if they did then the bird C such that Cx=S(Mx) definitely disagrees with every bird in the forest on at least one input, and therefore the forest would not be closed under composition; there would always be a missing bird.


Next time on FAIC: We’ll look at some more fun examples and then get back to compilation.

The names of birds, part 1

For the next part in my Bean Machine retrospective to make sense I’ll need to make a short digression. In looking back on the almost 20 years I’ve been blogging, it is surprising to me that I’ve only briefly alluded to my appreciation of combinatory logic. In the next couple of episodes, I’ll do a quick introduction based on the delightful book that introduced it to me: To Mock A Mockingbird, by the late Raymond Smullyan.

Imagine a forest containing some birds, possibly finitely or infinitely many. These are unusual birds. When you call the species name of a bird in the forest to a bird in the forest, it calls one back to you. Maybe the same, maybe different, but when you tell a bird the name of a bird, it names a bird back to you. There might be a Red Capped Cardinal in the forest and when you call out Great Blue Heron to it, it calls back Belted Kingfisher.

(Photos by me; click for higher resolution.)

We will notate “I called Q to P and got response R” as PQ = R. If I then called out S to R and R responded with T, we’ll notate that as PQS = RS = T. We’ll use parentheses in the obvious way: PQS = (PQ)S and this might be different from P(QS). The latter is “I called S to Q, and then called Q‘s response to P“. I’ll use capital letters to represent specific bird names, and small letters to represent variables.

The question we’re considering here is: under what circumstances will a bird call back the same name you called to it? That is, for a given bird y, under what circumstances does yx = x?

Smullyan calls birds with this relationship “fondness” — that is, “y is fond of x” means that yx = x. If y is fond of x then x is said to be a “fixpoint” of y.


A forest is said to be “compositional” if for every pair of birds (a, b) — a and b can be the same or different — there is a bird c in the forest such that cx = b(ax) for all x. That is, if we call any name x to a, and then call its response to b, we get the same result as simply calling x to c. c is the composition of “call x to a, and then call that response to b“.

A mockingbird is the bird M with the property that Mx = xx. That is, for any bird name x, M tells you what x‘s response is to its own name. (The attentive reader will note that we have not said what MM is, but at least we know from the definition that MM = MM, so give me some credit for consistency at least.)

Theorem: if a compositional forest contains a mockingbird then every bird in the forest is fond of at least one bird.

Proof: Try to prove it yourself; scroll down for the solution.

.

.

.

.

.

.

.

Let p be any bird. Let c be the bird that composes p with M. Therefore cx = p(Mx) for all x.

In particular, that’s true for c, so cc = p(Mc). But Mc = cc, thus cc = p(cc), and we’ve found a bird that p is fond of. Since p was any bird, every bird in the forest is fond of at least one bird. Or, put another way, every bird in a compositional forest with a mockingbird has at least one fixpoint.


Next time on FAIC: we’ll take a look at a few more interesting birds, and then discuss why this whimsy is relevant to compilation before getting back to Bean Machine.

Bean Machine Retrospective, part 6

Happy New Year all!

Last time I briefly described the basic strategy of the Beanstalk compiler: transform the source code of each queried or observed function (and transitively their callees) into an equivalent program which partially evaluates the model, accumulating a graph as it goes. In this post I’ll go through the first step of the transformation.

The idea of the first source-to-source transformation is to take a relatively complicated language — Python — and replace it with what I think of as “simplified Python”. The Zen of Python famously says “there should be one — and preferably only one — obvious way to do it” but that is not the goal of this simplification step. The Zen of Python emphasizes obviousness. It is about making the code clear and the language discoverable. My goal with simplified Python is “there should be only one way to do it” — because it is easier to instrument a program to capture its meaning as it runs if there’s only one way to do everything.

Consider for example how to call a function foo with three arguments:

foo(1, 2, 3)
x = [1, 2, 3]
foo(*x)
foo(1, 2, bar=3)
x = {"bar":3}
foo(1, 2, **x)

… and so on — there are a great many obvious ways to call a function in Python, but I’d like there to be exactly one in simplified Python.

The first thing we do when analyzing a queried or observed function is to obtain its source code, parse that into an abstract syntax tree (AST), and then do a series of AST transformations that reduce the function into an equivalent but simpler Python program. This is the most traditional-compiler-ish step in the pipeline.

(How do we obtain the source code? Remember, we have an RVID for each observation and query; the RVID contains a reference to the original function object. Python’s runtime can provide the source code for a given function reference.)

Let’s look at an only slightly more complicated version of our “hello world” model:

@random_variable
def fairness():
  return Beta(2,2)

@random_variable
def flip(n):
  return Bernoulli(0.5 * fairness())

In our simplified Python:

  • Every value gets its own variable
  • Every computation gets its own statement
  • Every call has the same syntax: a = b(*c, **d)

Our two methods in the simplified form would be:

def fairness():
  _t1 = 2
  _t2 = 2
  _t3 = [_t1, _t2]
  _t4 = {}
  _t5 = Beta(*_t3, **_t4)
  return _t5

def flip(n):
  _t1 = 0.5
  _t2 = []
  _t3 = {}
  _t4 = fairness(*_t2, **_t3)
  _t5 = _t1 * t4
  _t6 = [_t5]
  _t7 = {}
  _t8 = Bernoulli(*_t6, **_t7)
  return _t8

ASIDE: Readers with some experience writing compilers might ask “is this static-single-assignment form?” SSA form is a transformation whereby every variable is assigned exactly once; it’s used for control flow analysis and other program analyses. I did not write a full-on SSA transformation but when I was doing initial prototyping I knew that I might need SSA form in the future. Think of it as SSA-light if that makes sense.


We can go further; in our simplified Python:

  • logical operators and and or are eliminated
x = y and z

can be simplified to

x = y
if x:
  x = z

and now the graph accumulator does not need to worry at all about logical operators, because there aren’t any in the simplified program.

  • Every while loop is while True.
  • No loop has an else clause. (Did you know that Python loops have else clauses? It’s true!)
  • There are no compound comparisons:
x = a < b < c

can be simplified to

x = a < b
if x:
  x = b < c

and now the graph accumulator does not need to worry about compound comparisons, because there aren’t any.

  • Similarly there are no lambdas, no function annotations, and so on. All the convenient syntactic sugars are desugared into their more fundamental form.

The simplified language is definitely not a language I’d like to write a long program in, but it is a language that is easy to analyze programmatically. Thus far, the transformations make the program longer and simpler, but do not change its meaning; if we compiled and ran the simplified version of the program it should do the same thing as the normal version.


Next time on FAIC: let’s get into the details of how the AST-to-AST transformation code works. I had a lot of fun writing it.