# Excessive explanation, part nineteen

We’ve just seen that we can use our deductive logic system with its rules TAUT, GEN, and so on, to start from knowing nothing, and end with a correct derivation of the type scheme of a complex expression. But here’s an important question.

We know that `A ⊢ e:σ` means that there exists a derivation in our deductive system that allows us to conclude that the given expression is of the given type scheme when the given assumptions are true.

And we know that `A ⊨ e:σ` means that no matter what environment with values we assign to the identifiers named in assumptions `A`, the value computed for expression `e` will be a value of type scheme `σ`.

What we would really like to know is that `A ⊢ e:σ` logically implies `A ⊨ e:σ`. That is, if we have a derivation at compile time that the type scheme is correct in our deduction system, that there will be no type violation at runtime! Recall that earlier we defined the “soundness” of a logical deduction as a logically correct derivation where we start with true statements and end with true statements; what we want is to know that our logical system of derivation of types is “sound” with respect to the semantics of the language.

The paper of course provides a detailed proof of this fact:

```The following proposition, stating the semantic soundness
of inference, can be proved by induction on e.

Proposition 1 (Soundness of inference). If A ⊢ e:σ then A ⊨ e:σ.
```

That’s the entire text of the proof; don’t blink or you’ll miss it. It can be proved, and that’s as good as proving it, right? And there’s even a big hint in there: it can be proved by induction.

Wait, what?

Inductive proofs are a proof technique for properties of natural numbers but `e` is an expression of language Exp, not a number at all!

Let’s consider for a moment the fundamental nature of an inductive proof. Inductive proofs on naturals depend on the following axioms: (by “number” throughout, I mean “non-negative integer”.)

• Zero is a number.
• Every number k has a successor number called k + 1.
• If something is true for zero, and it being true for k implies it is true its successor, then it is true of all numbers.

So the idea is that we prove a property for zero, and then we prove that if the property holds for k, then it also holds for k+1. That implies that it holds for 0+1, which implies that it holds for 1+1, which implies that it holds for 2+1, and so on, so we deduce that it applies to all numbers.

This is the standard form in which inductive reasoning is explained for numbers, but this is not the only form that induction can take; in fact this kind of induction is properly called “simple induction”. Here’s another form of induction:

Prove a property for zero, and then prove that if the property holds for k and every number smaller than k, then it also holds for k+1.

This form of induction is called “strong induction”. Simple and strong induction are equivalent; some proofs are easier to do with simple induction and some are easier with strong, but I hope it is clear that they are just variations on the technique.

Well, we can make the same reasoning for Exp:

• An identifier is an expression.
• Every other kind of expression — lambdas, lets, and function calls — is created by combining together a finite number of smaller expressions.
• Suppose we can prove that a property is true for the base case: identifiers. And suppose that we assume that a property is true for a given expression and all of the finite number of smaller expressions that it is composed of. And suppose we can prove from the assumption that then the property holds on every larger expression formed from this expression. Then the property holds on all expressions.

That’s just a sketch to get the idea across; actually making a formally correct description of this kind of induction, and showing that it is logically justified, would take us very far afield indeed. But suffice to say, we could do so, and are justified in using inductive reasoning on expressions.

Before we go on I want to consider a few more interesting facts about induction. Pause for a moment and think about why induction works. Because I was sloppy before; I left some important facts off the list of stuff you need to make induction work.

Suppose for example I make up a new number. Call it Bob. Bob is not the successor of any number, and it is not zero either. Bob has a successor — Bob+1, obviously. Now suppose we have a proof by induction: we prove that something is true for 0, we prove that if it is true for k then it is true for k+1. Have we proved it for all numbers? Nope! We never proved it for Bob, which is not zero and not the successor of any number!

Induction requires that every number be “reachable” by some finite number of applications of the successor relationship to the smallest number, zero. One of the axioms we need to add to our set is that zero is the only number that is the successor of no number.

The property that expressions in Exp share with integers is: every expression in Exp is either an identifier, or is formed from one or more strictly smaller expressions. There is no “Bob” expression in Exp; there’s no expression that we can’t get to by starting from identifiers and combining them together via lets, lambdas and function calls. And therefore if we can start from identifiers and work our way out, breadth-first, to all the successor expressions, then we will eventually get all expressions. Just as if we start from zero and keep adding one, we eventually get to all numbers.

Induction is really a very powerful technique; it applies in any case where there are irreducible base cases, and where application of successorship rules eventually gets to all possible things.

Back to the paper.

Basically the way the proof would proceed is: we’d start with the base case: does our deductive system guarantee that if `A ⊢ x:σ` for an identifier x then `A ⊨ x:σ`? That’s our base case, and it’s pretty obviously true. Suppose we have the assumption `A = {x:σ}`. By assumption, the value of x at runtime must be from type scheme `σ`. By the TAUT rule, we can deduce that `A ⊢ x:σ`. And plainly `A ⊨ x:σ` is true; by assumption `x` will have a value from `σ`! Our base case holds.

Next we would assume that `A ⊢ k:σ` implies `A ⊨ k:σ`, and then ask the question “can we show that for every possible successor k’ of k, `A ⊢ k':σ'` implies `A ⊨ k':σ'`“? If we can show that, then by the inductive property of expressions, it must be true for all expressions that `A ⊢ e:σ` implies `A ⊨ e:σ`.

The actual proof would be a bit trickier than that because of course some of the successors require two expressions, and so we’d have to reason about that carefully.

These proofs are extraordinarily tedious, and you learn very little from them. Thus the proof is omitted from the paper.

There are two more sketch proofs in this section. We’ll cover them next time, and then that will finish off the section on the deductive system.