January | 2012 | Fabulous adventures in coding

If you ask a dozen C# developers what a “local variable” is, you might get a dozen different answers. A common answer is of course that a local is “a storage location on the stack”. But that is describing a local in terms of its implementation details; there is nothing in the C# language that requires that locals be stored on a data structure called “the stack”, or that there be one stack per thread. (And of course, locals are often stored in registers, and registers are not the stack.)

A less implementation-detail-laden answer might be that a local variable is a variable whose storage location is “allocated from the temporary store”. That is, a local variable is a variable whose lifetime is known to be short; the local’s lifetime ends when control leaves the code associated with the local’s declaration space.

That too, however, is a lie. The C# specification is surprisingly vague about the lifetime of an “ordinary” local variable, noting that its lifetime is only kinda-sorta that length. The jitter’s optimizer is permitted broad latitude in its determination of local lifetime; a local can be cleaned up early or late. The specification also notes that the lifetimes of some local variables are necessarily extended beyond the point where control leaves the method body containing the local declaration. Locals declared in an iterator block, for instance, live on even after control has left the iterator block; they might die only when the iterator is itself collected. Locals that are closed-over outer variables of a lambda are the same way; they live at least as long as the delegate that closes over them. And in the upcoming version of C#, locals declared in async blocks will also have extended lifetimes; when the async method returns to its caller upon encountering an “await”, the locals live on and are still there when the method resumes. (And since it might not resume on the same thread, in some bizarre situations, the locals had better not be stored on the stack!)

So if locals are not “variables on the stack” and locals are not “short lifetime variables” then what are locals?

The answer is of course staring you in the face. The defining characteristic of a local is that it can only be accessed by name in the block which declares it; it is local to a block. What makes a local truly unique is that it can only be a private implementation detail of a method body. The name of that local is never of any use to code lexically outside of the method body.

This article was inspired by a question from StackOverflow.

Here’s an inconvenient truth: just about every “public surface area” change you make to your code is a potential breaking change.

First off, I should clarify what I mean by a “breaking change” for the purposes of this article. If you provide a component to a third party, then a “breaking change” is a change such that the third party’s code compiled correctly with the previous version, but the change causes a recompilation to fail. (A more strict definition would be that a breaking change is one where the code recompiles successfully but has a different meaning; for today we will just consider actual “build breaks”.)

A “potential” breaking change is a change which might cause a break, if the third party happens to have consumed your component in a particular way. By a “public surface area” change, I mean a change to the “public metadata” surface of a component, like adding a new method, rather than changing the behaviour of an existing method by editing its body. (Such a change would typically cause a difference in runtime behaviour, rather than a build break.)

Some public surface area breaking changes are obvious: making a public method into a private method, sealing an unsealed class, and so on. Third-party code that called the method, or extended the class, will break. But a lot of changes seem a lot safer; adding a new public method, for example, or making a read-only property into a read-write property. As it turns out, almost any change you make to the public surface area of a component is a potential breaking change. Let’s look at some examples. Suppose you add a new overload:

// old component code:
public interface IFoo {...}
public interface IBar { ... }
public class Component
{
    public void M(IFoo x) {...}
}

Suppose you then later add

    public void M(IBar x) {...}

to Component. Suppose the consumer code is:

// consumer code:
class Consumer : IFoo, IBar
{
   ...
   component.M(this);
   ...
}

The consumer code compiles successfully against the original component, but recompiling it with the new component suddenly the build breaks with an overload resolution ambiguity error. Oops.

What about adding an entirely new method?

// old component code:
...
public class Component
{
  public void MFoo(IFoo x) {...}
}

and now you add

public void MBar(IBar x) {...}

No problem now, right? The consumer could not possibly have been consuming MBar. Surely adding it could not be a build break on the consumer, right?

class Consumer
{
    class Blah
    {
        public void MBar(IBar x) {}
    }
    static void N(Action<Blah> a) {}
    static void N(Action<Component> a) {}
    static void D(IBar bar)
    {
        N(x=>{ x.MBar(bar); });
    }
}

Oh, the pain.

In the original version, overload resolution has two overloads of N to choose from. The lambda is not convertible to Action<Component> because typing formal parameter x as Component causes the body of the lambda to have an error. That overload is therefore discarded. The remaining overload is the sole applicable candidate; its body binds without error with x typed as Blah.

In the new version of Component the body of the lambda does not have an error; therefore overload resolution has two candidates to choose from and neither is better than the other; this produces an ambiguity error.

This particular “flavour” of breaking change is an odd one in that it makes almost every possible change to the surface area of a type into a potential breaking change, while at the same time being such an obviously contrived and unlikely scenario that no “real world” developers are likely to run into it. When we are evaluating the impact of potential breaking changes on our customers, we now explicitly discount this flavour of breaking change as so unlikely as to be unimportant. Still, I think its important to make that decision with eyes open, rather than being unaware of the problem.

This article elicited many reader comments:

You pretty much have to ignore these possible problems, the same way you can’t guess what extension methods we’ve made that may be silently overridden by new methods in a class.

That was our conclusion, yes.

Could you give an example where making a read-only property into a read-write property would result in a breaking change? I can’t think of any…

Sure:

class C { public int P { get; set; } } 
class D { public int P { get; private set; } } 
class E { 
  static void M(Action<C> ac){} 
  static void M(Action<D> ad) {} 
  static void X() { M(q=>{q.P = 123; }); } 
}

The body of X binds without error as long as D.P‘s setter is private. If it becomes public then the call to M is ambiguous.

I’m curious to see how unsealing a class can cause code to stop compiling.

Same trick. This trick is surprisingly versatile.

class B {}
sealed class C {}
class D : B, I {}
interface I {}
class P {
  static void M(Func<C, I> fci){}
  static void M(Func<B, I> fbi){} // special agent!
  static void Main() { M(q=>q as I); }
}

That compiles successfully because q as I is illegal if q is of type C. The compiler knows that C does not implement I, and because it is sealed, it knows that no possible type that is compatible with C could implement I. Therefore overload resolution chooses the func from B to I, because a derived type B, say, D, could implement I. When C is unsealed then overload resolution has no basis to choose one over the other, so it becomes an ambiguity error.

Seems to me you would be much better off in this regard avoiding the new functional features and sticking to the C# 2.0 specifications. That way you’re pretty much protected against breakages like these right?

At what cost? Is it worthwhile to eschew the benefits of LINQ in order to avoid the drawbacks of some obscure, unlikely and contrived breaking changes? The benefits of LINQ outweigh the costs by a huge margin, in my opinion.

You know, all these examples make me think overload resolution is just too clever for its own good — these kinds of algorithms are more the kind you’d expect in the optimizing part, where cleverness abounds, not in the basic semantic part. Clearly, the solution is to abolish overload resolution, assign every method an unambiguous canonical name (that’s stable in the face of changes) and force the developer to specify this name on every call. Alternatively, make overload resolution unnecessary by forbidding overloads. Not sure how to handle generics in such a world — abolishing them seems too harsh. (Mandatory Internet disclaimer: the above is facetious, but slightly ha-ha-only-serious.)

There are languages that do not have overload resolution. A .NET language could, for instance, specify the unique metadata token associated with the method definition. But in a world without overload resolution, how would you do query comprehensions?

Fabulous adventures in coding

Eric Lippert's blog

Monthly Archives: January 2012

What is the defining characteristic of a local variable?

Every public change is a breaking change