How do we ensure that method type inference terminates?

Here's a question I got from a coworker recently:

It is obviously important that the C# compiler not go into infinite loops. How do we ensure that the method type inference algorithm terminates?

The answer is quite straightforward actually, but if you are not familiar with method type inference then this article is going to make no sense. You might want to watch this video if you need a refresher.

Method type inference since C# 3.0 basically works like this: we create a set of bounds on each method type parameter. We then "fix" each type parameter to a member of its bounds set. That might enable new bounds to be computed, so this algorithm is a loop. Once every type parameter is fixed, method type inference has succeeded. If any type parameter cannot be fixed for some reason then type inference fails.

We ensure termination like this: if we manage to make it through the body of the loop without fixing at least one type parameter then type inference fails. Therefore, the type inference loop can run at most n times if the method has n type parameters. If we make it to n times through the loop then type inference must have fixed a type parameter on every iteration, and type inference has succeeded. If we failed before n times through the loop then obviously type inference did not run forever.

That's a bit highfalutin; let me flesh that out a bit. A "bound" is nothing more than a type, and a bound can be "upper", "lower" or "exact". For example, suppose we have a type parameter T with three bounds: a lower bound of Giraffe, an exact bound of Mammal, and an upper bound of Animal. Let's say that Animal is a "larger" type than Mammal (because all mammals are animals but not all animals are mammals, thus Animal must be the larger type), and Giraffe is a "smaller" type than Mammal. Given this set of bounds we know that T must be inferred to be first, either Giraffe or a type larger than Giraffe, because Giraffe is a "lower bound"; you can't infer a type smaller than Giraffe. Second, we know that T must be Mammal, exactly. And third, we know that T must be either Animal or a type smaller than Animal, because Animal is an upper bound. We cannot infer a type larger than Animal. The C# compiler deduces that Mammal is the only type in the set that meets all three requirements, and so T would be fixed to Mammal. If there are multiple types in the set that meet all the requirements (which of course cannot happen if there are any exact bounds!) then we pick the largest such type.1

The interesting part of method type inference is how we deal with lambdas. Suppose we have a method Select<A, R>(I<A>, Func<A, R>) where the second argument is c=>c.Name. We say that A is an "input" type parameter and R is an "output" type parameter. (It is of course possible for a type parameter to be both an input and output type parameter!) Furthermore, we say that R "depends on" A, because the type of A could possibly determine the type of R. (Of course the "depends" relationship can be cyclic.)

The type inference algorithm, at a high level, goes like this:

  • Add bounds to type parameters based on all non-lambda arguments, and all lambda arguments where the delegate type has no type parameters in its inputs.
  • Loop
    • Is every type parameter fixed?
      • Type inference has succeeded. Terminate the algorithm.
    • Is there any lambda argument converted to a delegate type where the inputs of the delegate type are all known and the output type involves an unfixed type parameter?
      • Deduce the return type of all such lambdas and make inferences that add bounds to the corresponding delegate's output types.
    • Is there any unfixed, bounded type parameter that does not appear in an output type of a delegate that has unfixed input types?
      • Fix all such type parameters and go back to the top of the loop.
    • Is there any unfixed, bounded type parameter such that an unfixed type parameter depends on it, directly or indirectly?
      • Fix all such type parameters and go back to the top of the loop.
  • If we make it here then we failed to make progress; we have just as many fixed type parameters as we started with. Type inference fails. Terminate the algorithm.

So, for example, if we had Select(customers, c=>c.Name); where customers implements I<Customer> then we start by inferring that A has a lower bound of Customer.2 We have no lambda arguments that correspond to formal parameters where the delegate type has no type parameters in its inputs, so we enter the loop.

Is every type parameter fixed? No.

Is there any lambda argument converted to a delegate type where the inputs are known and the output involves an unfixed type parameter? No. There is a lambda argument converted to a delegate type, and the output involves unfixed type parameter R, but the input type is A and A is not fixed. So we have no inferences to make.

Is there an unfixed type parameter that has bounds and does not appear in an output type of a delegate that has unfixed input types? Yes. A has bounds and does not appear as an output type, period.

Therefore we fix A. It has only one bound, Customer, so we fix it to Customer. We have made progress, so we go back to the top of the loop.

Is every type parameter fixed? No.

Is there any lambda argument converted to a delegate type where the inputs are known and the output involves an unfixed type parameter? Yes! Now we make an inference. A is fixed to Customer, so we add the type of Customer.Name, say, string, as a lower bound to R. Now we must fix something.

Is there an unfixed type parameter that has bounds and does not appear in an output type of a delegate that has unfixed input types? Yes. R is unfixed, it has bounds, and it appears as an output type of a delegate that has fixed input types, so it is a candidate for fixing. We fix R to its only bound, string, and start the loop again.

Is every type parameter fixed? Yes. We're done.

This technique of preventing infinite loops by requiring that each loop iteration make progress is quite useful, and clearly in this case it guarantees that the algorithm executes the loop no more times than there are type parameters to fix.

You might wonder if it is therefore the case that method type inference is O(n) in the number of type parameters. It turns out that it is not, for several reasons. First, as a practical matter it only makes sense to determine the asymptotic order of an algorithm if the size of the problem is likely to become large. I've never seen a method with more than five type parameters in the wild (aside from the tuple constructors, which are straightforward), and even that is pretty unusual. Most generic methods have one or two type parameters. Second, doing the analysis of the lambdas is the expensive part, and it only really makes sense to analyze the behaviour of the most expensive part. We already know that analyzing lambdas is, worst case, an NP-HARD problem so whether or not method type inference is O(some polynomial) is possibly not that relevant. Third, you'll notice that in my sketch of the algorithm we have to answer questions like "is there any unfixed type parameter that has an unfixed type parameter that depends on it?" This requires solving a graph-traversal problem, whose asymptotic cost we have not analyzed! I won't take you through the boring analysis, but suffice to say there could be O(n2) dependency relationships that each cost O(n) to analyze, and we could go through the loop n times, for an extremely unlikely worst case of O(n4). The implementation of this algorithm is actually O(n2) in the common case; because n is likely to be small, as I said, we have not put the effort into more sophisticated algorithms that can solve these graph problems even faster in the asymptotic case.


Next time on FAIC: There are functions on integers that grow faster than any function you can write a program to produce. Demonstrating this surprising fact involves beavers. We'll see why!

  1. Note that this algorithm is consistent with other type inference features in C# in two ways. First, when asked to infer a best type from a set of types, we always choose a type from the set. We never say "well we have Dog and Cat in the set so let's infer Mammal" unless Mammal is itself in the set. Second, when faced with multiple possible "best" types, we pick the largest. There is an argument to be made for picking the smallest, but picking the largest seems to match more people's intuitions of what the right choice is.
  2. Assuming that the type I<T> is covariant in T. If it were contravariant then we would deduce an upper bound, and if it were invariant then we would deduce an exact bound. See my series on variance if that is not clear.

Representation and identity

(Note: not to be confused with Inheritance and Representation.)

I get a fair number of questions about the C# cast operator. The most frequent question I get is:

short sss = 123;
object ooo = sss;            // Box the short.
int iii = (int) sss;         // Perfectly legal.
int jjj = (int) (short) ooo; // Perfectly legal
int kkk = (int) ooo;         // Invalid cast exception?! Why?

Why? Because a boxed T can only be unboxed to T.1 Once it is unboxed, it’s just a value that can be cast as usual, so the double cast works just fine.

Many people find this restriction grating; they expect to be able to cast a boxed thing to anything that the unboxed thing could have been cast to. There are ways to do that, as we’ll see, but there are good reasons why the cast operator does what it does.

To understand why this design works this way it’s necessary to first wrap your head around the contradiction that is the cast operator. There are two2 basic usages of the cast operator in C#:

  • My code has an expression of type B, but I happen to have more information than the compiler does. I claim to know for certain that at runtime, this object of type B will actually always be of derived type D. I will inform the compiler of this claim by inserting a cast to D on the expression. Since the compiler probably cannot verify my claim, the compiler might ensure its veracity by inserting a run-time check at the point where I make the claim. If my claim turns out to be inaccurate, the CLR will throw an exception.
  • I have an expression of some type T which I know for certain is not of type U. However, I have a well-known way of associating some or all values of T with an “equivalent” value of U. I will instruct the compiler to generate code that implements this operation by inserting a cast to U. (And if at runtime there turns out to be no equivalent value of U for the particular T I’ve got, again we throw an exception.)

The attentive reader will have noticed that these are opposites. A neat trick, to have an operator which means two contradictory things, don’t you think?

This dichotomy motivates yet another classification scheme for conversions.3 We can divide conversions intorepresentation-preserving conversions (B to D) and representation-changing conversions (T to U).4 We can think of representation-preserving conversions on reference types as those conversions which preserve the identity of the object. When you cast a B to a D, you’re not doing anything to the existing object; you’re merely verifying that it is actually the type you say it is, and moving on. The identity of the object and the bits which represent the reference stay the same. But when you cast an int to a double, the resulting bits are very different.

All the built-in reference conversions are identity-preserving.5 Obviously trivial “conversions” such as converting from int to int are also representation-preserving conversions. All user-defined conversions6 and non-trivial value type conversions (such as converting from int to double) are representation-changing conversions. Boxing and unboxing conversions are all representation-changing conversions.

The representation-preserving conversions that are known to never fail often result in no codegen at all.7 If a representation-preserving conversion could fail then a castclass instruction is emitted, which does a runtime check and throws if the check fails.

But each representation-changing conversion is handled in its own special way. User-defined conversions are resolved using a special version of the overload resolution algorithm, and generated as a call to the appropriate static method. Boxing and unboxing conversions are generated as box and unbox instructions. All the other built-in conversions (int to double, and so on) are generated as custom sequences of instructions that do the right conversion.

So now that you know that, consider what the compiler would have to do to make this work the way some people expect:

int kkk = (int) ooo;

All that the compiler knows is that ooo is of type object. It could be anything. Suppose it is a boxed int – then the compiler should generate an unboxing instruction. Suppose it is a boxed short. Then the compiler should unbox the short and then generate the custom sequence of instructions that convert a short to an int. Suppose it is a boxed double – same thing, but different instructions. And so on, for all the built-in conversions that go to int.

This would be a huge amount of code to generate, and it would be very slow. The code is of course so large that you would want to put it in its own method and just generate a call to it. Rather than do that by default, and always generate code that is slow, large and fragile, instead we’ve decided that unboxing can only unbox to the exact type. If you want to call the slow method that does all that goo, it’s available – you can always call Convert.ToInt32, which does all that analysis at runtime for you. We give you the choice between “fast and precise” or “slow and lax”, and the sensible default is the former. If you want the latter then call the method.

That’s just the built-in conversions. Let’s continue imagining what would have to happen if we wanted all possible conversions to int to just work out correctly at runtime, instead of just bailing out early if the boxed thing is not an int.

Suppose the object is a Foo where there is a user-defined conversion from Foo (or one of its base classes) to int (or a type that int is explicitly convertible from, like, say, Nullable<int>). Then the compiler would need to generate a call to that conversion method, just as it would if the type had been known at compile time, and then possibly also generate the conversion from the return type of the method to int.

Remember, there could be arbitrarily many such conversion methods on arbitrarily many types. The type Foo and its conversion method might not even be defined in the assembly currently being compiled or any assembly referenced. Therefore the compiler would have to generate code to interrogate Foo at runtime, do the overload resolution analysis, and then dynamically spit the code to do the call.

Which is exactly what the compiler does in C# 4.0 if the argument to the cast is of type dynamic instead of objectThe compiler actually generates code which starts a mini version of the compiler up again at runtime, does all that analysis, and spits fresh code. This is not fast, but it is accurate, if that’s what you really need. (And the spit code is then cached so that the next time this call site is hit, it is much faster.)

I don’t think people really expect the compiler to start up again at runtime every time they cast an object to int; I think they just haven’t thought through carefully exactly how much analysis solving the problem would take. Rather a lot, it turns out.

  1. Or Nullable<T>.
  2. There are others that are not germane to this discussion. For example, a third usage is “Everyone knows that this D is also of base type B; I want the compiler to treat this expression of type D as a B for overload resolution purposes.” That would clearly be an identity-preserving conversion.
  3. There are many ways to classify conversions; we already divide conversions into implicit/explicit, built-in/user-defined, and so on. For the purposes of this discussion we’ll gloss over the details of those other classifications.
  4. I’m glossing over here that certain conversions that the C# compiler thinks of as representation-changing are actually seen by the CLR verifier as representation-preserving. For example, the conversion from int to uint is seen by the CLR as representation-preserving because the 32 bits of a signed integer can be reinterpreted as an unsigned integer without changing the bits. These cases can be subtle and complex, and often have an impact on covariance-related issues; see next footnote. I’m also ignoring conversions involving generic type parameters which are not known at compile time to be reference or value types. There are special rules for classifying those which would be major digressions to get into.
  5. This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
  6. The rules of C# prohibit all user-defined conversions that could possibly be identity-preserving coercions. More generally, all user-defined conversions that could possibly be any "standard" conversion are illegal.
  7. Again, I’m ignoring irksome generic issues here. There are situations where humans can prove mathematically that two generic type parameters must be identical at runtime, but the verifier is not smart enough to make those same deductions and requires the compiler to emit type checks.