Excessive explanation, part fifteen

All right, we are getting through section four of this paper. Today: what is an “environment”?

Now let Env = Id → V be the domain of environments η. 

Just to be clear, this is not a function in the Exp language. This is a formal description of part of the runtime engine of the Exp language. Basically we are saying here that an “environment” is simply the part of the runtime of an implementation of the language that takes identifiers and maps them onto their values. And that just as we’ll use σ for type schemes and α for type variables, we’ll use η for environments.

You’re so accustomed to using environments that they are like the air you breathe; you don’t even think about it. When you’re writing a program in some language and you say x = y + z;, you’re saying “obtain the values associated with identifiers y and z from the environment, add them, and modify the environment so that next time we ask, x has that value”. We can characterize the accesses to y and z as the environment being a function that takes “y” or “z” and returns their values. We can characterize the assignment to x as being the construction of a new environment based on the previous environment.

The semantic function ε : Exp → Env → V is given in [5]. 

Again this is not an function in the Exp language; this is again a formalization of the runtime engine.

Remember that in our notation, a function of two arguments is notated as a function of one argument that returns a function of one argument. So what we’re saying here is that the “semantic function” is a function that takes an expression and an environment and produces a value.

Again, this is making formal the basic operation of a runtime. The runtime has a bunch of expressions and the values of all the variables, and it evaluates the expressions in the context of the environment.

Using it, we wish to attach meaning to assertions of the form 

                           A ⊨ e:σ

where e ∈ Exp and A is a set of assumptions of the form x:σ, x∈Id.

This right here is the mission statement of this paper.

The double-turnstile means “the thing on the right is a logical consequence of the thing on the left”.

What we’re saying here is: we have (on the left) a set of judgments (called “assumptions” here) where all the expressions are identifiers. Say:

square:int → int
identity:∀β(β → β)

We have on the right an arbitrary expression with a typing judgement, say

λy.identity (square x) : ∀β(β → int)

I hope you agree that if we assume that x is an int, square is a function from int to int, and identity is an identity function, that the lambda given is a function that takes any type and returns an int. The assumptions on the left are sufficient to deduce that the judgment on the right is correct (if the assumptions are correct, of course).

The goal of this paper is to provide an algorithm which, given the facts on the left and the expression on the right, deduces the type schema on the right in a manner that allows us to confidently say that yes, we have sufficient evidence on the left to make this judgment on the right.

(Aside: is it possible to have a function with type scheme ∀β(β → β) that is not an identity function? Ponder this!)

If the assertion is closed, i.e. if A and σ contain no free type
variables, then the sentence is said to hold iff, for every 
environment η, whenever η ⟦x⟧:σ', for each member x:σ' of A,
it follows that ε ⟦e⟧ η:σ. 

OK, we’ve got some jargon and some notation here used without introduction.

First of all, “the assertion” and “the sentence” are being used interchangeably here. We are “asserting” that the assumptions about typed identifiers on the left are sufficient evidence to deduce the type of the expression on the right.

Second, remember when we abused the Exp function notation to say that an environment was a function Id → V? You would think then that we would continue to abuse that notation and say that given an environment η and an identifier x we would say that the value of x was η x. And is what we do, but to call out that x is being used here as an identifier rather than standing in for a value, we “quote” it with funny brackets. So the value of identifier x in environment η is notated η ⟦x⟧.

Similarly for the semantic function ε. It takes two things: an expression e and an environment η, and to emphasize that the expression is an expression of the Exp language, we bracket it as well. So ε ⟦e⟧ η is “the value of expression e when evaluated in environment η”.

So what the heck is this jargon-filled paragraph actually saying? We have a bunch of identifiers with types on one side, and the statement that this implies that expression on the other side has a particular type. The paragraph is saying that this means: choose any possible environment that has those identifiers having some values of the given types. No matter what environment you chose, evaluating the expression in that environment gives a value that matches the expression’s type.

That might be too abstract still. In our example:

x:int, square:int → int, identity:∀β(β → β) ⊨ 
  λy.identity (square x) : ∀β(β → int)

We’re saying that no matter what integer we choose for x, no matter what function from int to int “square” really is, and no matter what function matching its type scheme “identity” really is, we know that evaluating the expression on the right will produce a function from something to int. There is never any combination of values for the environment that would cause that expression to evaluate to something else.

Further, an assertion holds iff all its closed instances hold.

That’s maybe not so clear but fortunately there is an example in the paper:

Thus, to verify the assertion

                x:α, f: ∀β(β → β) ⊨ (f x): α

it is enough to verify it for every monotype μ in place of α.
This example illustrates that free type variables in an assertion
are implicitly quantified over the whole assertion, while explicit
quantification in a type scheme has restricted scope.

What we’re getting at here is that this is not the same as saying

                x:∀α α, f: ∀β(β → β) ⊨ (f x): α

Because that scopes the alpha to just the type of x, but we want that alpha to be still “in scope” on the right hand side of the double turnstile. Basically it is saying that the assertion is logically:

              ∀α  ( x:α, f: ∀β(β → β) ⊨ (f x): α  )

All right, that gets us through section four! The last paragraph in section four lays out what is going to happen in the rest of the paper:

The remainder of this paper proceeds as follows. First we present
an inference system for inferring valid assertions. Next we present 
an algorithm W for computing a type scheme for any expression, under
assumptions A. We then show that W is sound, in the sense that any 
type-scheme it derives is derivable in the inference system. Finally
we show that W is complete, in the sense that any derivable 
type-scheme is an instance of that computed by W.

Next time: A notation for making chains of logical inferences, and another kind of turnstile.

14 thoughts on “Excessive explanation, part fifteen

  1. > (Aside: is it possible to have a function with type scheme ∀β(β → β) that is not an identity function? Ponder this!)

    It is, if the function in question is not required to be total. For example, the function that diverges for every possible argument matches this type scheme.

    In the real world, a function can diverge by entering an infinite loop, or terminating the process, or (in some frameworks) by throwing an exception, or by invoking another function that diverges.

    • Am I missing something? Aren’t there lots of functions ∀β(β → β) that are not identities? Presumably the mentioned “square” function is not an identity, for example. Unless the problem is that Exp is so basic that it’s not really possible to define anything else, because there are no operators *, +, etc?

    • I don’t think I can edit or delete my above post. I think I see that it’s because it’s generic. It has to be something applicable for all types, not just some arithmetic type or whatever. But we don’t really have anything defined at the most base level (something of type “object” in a normal language I guess), so we can’t really do anything with a parameter of generic type β.

      • Even if you could do something with “object”, how would that help you produce a value of type β?

        In C#, there is at least one other function that fits: one that returns default(β). But there is no default operator in Exp.

        • By “do something with”, I meant either modify or combine or duplicate, etc. But we can’t add βs, we can’t increment them, we can’t make new ones, and as you said we can’t even get a ‘default’ or ‘blank’ one.

  2. Pingback: Excessive explanation, part fifteen | Ace Infoway

  3. > (Aside: is it possible to have a function with type scheme ∀β(β → β) that is not an identity function? Ponder this!)

    I propose a function which returns 3 if β is an int and the function parameter otherwise.

  4. > (Aside: is it possible to have a function with type scheme ∀β(β → β) that is not an identity function? Ponder this!)

    In Haskell, I could write:

    let f a = f a

    Which I thought would be a non-identity function from ∀β(β → β), however ghci tells me the type for this function is:

    f :: t1 -> t

    So it seems my assumption is off. Can you explain this and/or give the solution to your question?


    • Your supposition is that your f is ∀β(β → β) but not an identity.

      Suppose we call your function with a randomly chosen integer, say: (f 123)

      Is the value that is returned always an integer?

      If the answer is “yes it is always an integer” then we must demonstrate that it could be an integer other than 123 to show that f is a function ∀β(β → β) that is not an identity. Can you demonstrate that? Which integer is it?

      If you cannot demonstrate that then we have no reason to suppose that it is not an identity.

      If the answer is “no”, it is not always an integer, then it is not a function ∀β(β → β).

      Of course, that’s a trick question because it is not really a “yes or no” question. The real answer is “this function never returns, so asking about the properties of the returned value is nonsensical”. You’ll note in a comment above that a commenter calls functions like this “diverging” functions.

      That gives us our answer: since (f 123) does not return an integer I see no reason to suppose that f is ∀β(β → β).

      This makes me think of a related question for you to ponder. Is there any function that gets typed as ∀α∀β(α → β) that does not diverge?

      • There are lots of fun problems like this. Suppose we extend our type system to include generic lists, using the standard definition of lists: a list-of-α is either empty, or is an α followed by a list-of-α. Let’s define the function len as the function list-of-α → int that gives the length of the list: zero for an empty list, and so on, you know how this goes.

        Suppose we have a function f whose type schema is ∀α(list-of-α → list-of-α). Can you make an argument that for any list-of-α x, that (len (f x)) is equal to (len x) ?

      • Thanks for the explanation, that makes a lot of sense! As for your second question, I think it is clearer in this case that there exists no function which does not diverge such that ∀α∀β(α → β). You don’t know what α is, so there seems no way to reason about what you should return for β. It seems as though you would have to pick a random value of a random type to return, but I doubt that concept would fit within this functional framework.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s