
Covariance and Contravariance in C#, Part 2: Array Covariance

C# implements variance in two ways. Today, the broken way.

Ever since C# 1.0, arrays where the element type is a reference type are covariant. This is perfectly legal:

Animal[] animals = new Giraffe[10];

Since Giraffe is smaller than Animal, and “make an array of” is a covariant operation on types, Giraffe[] is smaller than Animal[], so an instance fits into that variable.

Unfortunately, this particular kind of covariance is broken. It was added to the CLR because Java requires it and the CLR designers wanted to be able to support Java-like languages. We then up and added it to C# because it was in the CLR. This decision was quite controversial at the time and I am not very happy about it, but there’s nothing we can do about it now.

Why is this broken? Because it should always be legal to put a Turtle into an array of Animals. With array covariance in the language and runtime you cannot guarantee that an array of Animals can accept a Turtle because the backing store might actually be an array of Giraffes.
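
Here is a minimal sketch of the failure, assuming a Turtle class that, like Giraffe, derives from Animal:

Animal[] animals = new Giraffe[10]; // legal: array covariance
animals[0] = new Turtle();          // compiles, but throws
                                    // ArrayTypeMismatchException at runtime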

This means that we have turned a bug which could be caught by the compiler into one that can only be caught at runtime. This also means that every time you put an object into an array we have to do a run-time check to ensure that the type works out and throw an exception if it doesn’t. That’s potentially expensive if you’re putting a zillion of these things into the array.

Yuck.

Unfortunately, we’re stuck with this now. Giraffe[] is smaller than Animal[], and that’s just the way it goes.


I would like to take this opportunity to clarify some points brought up in comments to Part One.

First, by “subtype” and “supertype” I mean “is on the chain of base classes” for classes and “is on the tree of base interfaces” for interfaces. I do not mean the more general notion of “is substitutable for”. 

By “bigger than” and “smaller than” I explicitly do not mean “is a base class of” and “is a derived type of”. It is the case that every subclass is smaller than its superclass, yes. But it is not the case that every smaller type is derived from its larger type. 

Giraffe[] is smaller than both Animal[] and System.Array. Clearly Giraffe[] is derived from System.Array, but it is not derived from Animal[].

The “is smaller than” relationship I am defining is more general than the “is a kind of” relationship. I want to draw a distinction between assignment compatibility (smaller than) and inheritance (subtype of).
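
A small sketch makes the distinction concrete (Animal and Giraffe as above; the reflection calls are my own illustration):

Giraffe[] giraffes = new Giraffe[10];
Animal[] animals = giraffes;   // legal: assignment compatible
System.Array array = giraffes; // legal: also assignment compatible
// Inheritance, however, is a different relation:
bool b1 = typeof(Giraffe[]).IsSubclassOf(typeof(System.Array)); // true
bool b2 = typeof(Giraffe[]).IsSubclassOf(typeof(Animal[]));     // false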


Next time on FAIC: we’ll discuss a kind of variance that we added to C# 2.0 which is not broken.



C++ and the pit of despair

Raymond has an interesting post today about two subtle aspects of C#: how order of evaluation in an expression is specified as strictly left-to-right, and how the rules regarding local shadowing ensure that an identifier has exactly one meaning in a local scope. He makes an educated guess that the reason for these sorts of rules is to “reduce the frequency of a category of subtle bugs”.

I’d like to take this opportunity to both confirm that guess and to expand upon it a bit. Continue reading
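
A tiny sketch of the first rule (my example, not from the post):

int x = 0;
// C# guarantees strict left-to-right evaluation of the operands, so this
// always computes 0 + 1; in C++ this expression is not even well defined.
int y = x++ + x; // y is 1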

An inheritance puzzle, part two

Today, the answer to Friday’s puzzle. It prints Int32. But why?

Some readers hypothesized that M would print out Int32 because the declaration B : A<int> somehow tells B that T is to be treated as int, now and forever.

Though the answer is right, the explanation is not quite right. One can illustrate this by taking C out of the picture. If you say (new A<string>.B()).M(), you’ll get String. The fact that B is an A<int> doesn’t make T always int inside B!

The always-keen Stuart Ballard was the first to put his finger upon the real crux of the problem — but he’s seen this problem before, so he had an advantage. Eamon Nerbonne was the first to post a complete and correct explanation.

The really thorny issue here is that the declaration class C : B is, upon close inspection, somewhat ambiguous. Is that equivalent to class C : A<T>.B or class C : A<int>.B?

Clearly it matters which we choose. A method of a base class is a member of the derived class, but a method of an outer class is not a member of the inner class:

public class X 
{ 
  public void M() { } 
}
public class Y 
{
  public void N() { }
  public class Z : X { }
}
...
// legal, from base class
(new Y.Z()).M(); 
// illegal -- outer class members are not members of inner classes.
(new Y.Z()).N(); 

In our example, when we called M on an instance of A<string>.B.C, it was calling a method of the base class of C. If the base class of C is A<T>.B, then that should call A<T>.B.M, and it should print out whatever the current value of T is — in this case, String. If the base class of C is A<int>.B, then that should call A<int>.B.M, so it should print out Int32.

We choose the latter as the base class. That was certainly a surprise to me. And Stuart Ballard. And, amusingly enough, when I sprang this one upon Anders and didn’t give him time to think about it carefully, it was a surprise to him as well. When I sprang it on Cyrus, he cheerfully pointed out that he had already posted a harder version of this problem back in 2005, the solution of which Stuart characterized back then as “Insanely complex, but it makes perfect sense.” I couldn’t agree more, though at least my version of the puzzle is somewhat simpler.

Anyway, why on earth ought that to be the case? Surely the B in class C : B means the immediately containing class, which is A<T>.B, not A<int>.B! And yet it does not.

These generics are screwing up our intuitions. Let’s look at an example which has no generics at all:

public class D 
{
  public class E {}
}
public class F 
{
  public class E { }
  public class G 
  {
    public E e; // clearly F.E
  }
}
public class H : D 
{
  public E e; // clearly D.E
}

This is all legal so far, and should be pretty clear. When we are binding a name to a type, the type we get is allowed to be a member of any base class or a member of any outer class. But what if we have both to choose from?

public class J 
{
  public class E { }
  public class K : D 
  {
    public E e; // Is this J.E or D.E?
  }
}

We could just throw up our hands and say that this is ambiguous and therefore illegal, but we’d rather not do that if we can avoid it. We have to prefer one of them, and we’ve decided that we will give priority to base classes over outer classes. Derived classes have an “is a kind of” relationship with their base classes, and that is logically a “tighter” binding than the “is contained in” relationship that inner classes have with outer classes.

Another way to think about it is that all the members you get from your base class are all “in the current scope”; therefore all the members you get from outer scopes are given lower priority, since stuff inside inner scopes takes priority over stuff in outer scopes.

The algorithm we use to search for a name used in the context of a type S is as follows:

  • search S’s type parameters
  • search S’s accessible inner classes
  • search accessible inner classes of all of S’s base classes, going in order from most to least derived
  • S ← S’s outer class, start over

(And if that fails then we invoke the whole mechanism of searching the namespaces that are in scope, checking alias clauses, etc.)

With that in mind, now the solution should finally make some sense. At the point where we are resolving the base class of C we know that C has no type parameters. We do not know what the base class or inner classes of C are — that’s what we’re trying to figure out — so we skip checking them.

The next thing we check is the outer class, which is A<T>.B, but we do NOT say, aha, the outer class is called B, we’re done. That is not at all what the algorithm above says. Instead, it says check A<T>.B to see if it has a type parameter called B or an inner type called B. It does not, so we keep searching.

The base type of A<T>.B is A<int>. The outer type of A<T>.B is A<T>. Both have an accessible inner class called B. Which do we pick? The base type gets searched first, so B resolves to A<int>.B. Obviously.

Having members of base classes bind tighter than members from outer scopes can lead to bizarre situations but they are generally pretty contrived. For example:

public class K 
{ 
  public class L { } 
}
public class L : K 
{
  L myL; // this is K.L!
}

And of course, you can always get around these problems by eliminating the ambiguity:

public class A<T> 
{
  public class B : A<int> 
  {
    public void M() { ... }
    // no longer ambiguous which B is picked.
    public class C : A<T>.B { } 
  }
}

Finally, the specification of this behaviour is a bit tricky to understand. The spec says:

Otherwise, if T contains a nested accessible type having name I and K type parameters, then the namespace-or-type-name refers to that type constructed with the given type arguments. If there is more than one such type, the type declared within the more derived type is selected.

By “if T contains a nested accessible type”, it means “if T, or any of its base classes, contains a nested accessible type”. I completely failed to comprehend that the first n times I read that section. I’ll see if I can get that clarified in the next version of the standard.

An inheritance puzzle, part one

Once more I have returned from my ancestral homeland, after some weeks of sun, rain, storms, wind, calm, friends and family. I could certainly use another few weeks, but it is good to be back too.

Well, enough chit-chat; back to programming language design. Here’s an interesting combination of subclassing with nesting. Before trying it, what do you think this program should output?

public class A<T> 
{
  public class B : A<int> 
  {
    public void M() 
    {
      System.Console.WriteLine(typeof(T).ToString());
    }
    public class C : B { }
  }
}
class MainClass 
{
  static void Main() 
  {
    A<string>.B.C c = new A<string>.B.C();
    c.M();
  }
}

Should this say that T is int, string or something else? Or should this program not compile in the first place?

It turned out that the actual result was not what I was expecting, at least. I learn something new about this language every day.

Can you predict the behaviour of the code? Can you justify it according to the specification? (The specification is really quite difficult to understand on this point, but in fact it does all make sense.)

The answer is in the next episode!

Why are overloaded operators always static in C#?

A language design question was posted to the Microsoft internal C# discussion group this morning: “Why must overloaded operators be static in C#? In C++ an overloaded operator can be implemented by a static, instance or virtual method. Is there some reason for this constraint in C#?”

Before I get into the specifics, there is a larger point here worth delving into, or at least linking to. Raymond Chen immediately pointed out that the questioner had it backwards. The design of the C# language is not a subtractive process; though I take Bob Cringely’s rather backhanded compliments from 2001 in the best possible way, C# is not Java/C++/whatever with the kludgy parts removed. Former C# team member Eric Gunnerson wrote a great article about how the process actually works.

Rather, the question we should be asking ourselves when faced with a potential language feature is “does the compelling benefit of the feature justify all the costs?” And costs are considerably more than just the mundane dollar costs of designing, developing, testing, documenting and maintaining a feature. There are more subtle costs, like, will this feature make it more difficult to change the type inferencing algorithm in the future? Does this lead us into a world where we will be unable to make changes without introducing backwards compatibility breaks? And so on.

In this specific case, the compelling benefit is small. If you want to have a virtual dispatched overloaded operator in C# you can build one out of static parts very easily. For example:

public class B 
{
    public static B operator+(B b1, B b2) { return b1.Add(b2); }
    // The body of Add was elided in the original; returning something
    // here just so the fragment compiles.
    protected virtual B Add(B b2) { return new B(); }
}
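
A derived class then overrides Add to get virtual dispatch through the static operator; a minimal sketch (the class D and the method bodies are mine, for illustration):

public class D : B
{
    // operator+ on B forwards to the virtual Add, so this override is
    // chosen whenever the left operand is at runtime a D.
    protected override B Add(B b2) { return new D(); }
}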

And there you have it. So, the benefits are small. But the costs are large. C++-style instance operators are weird. For example, they break symmetry. If you define an operator+ that takes a C and an int, then c+2 is legal but 2+c is not, and that badly breaks our intuition about how the addition operator should behave.

Similarly, with virtual operators in C++, the left-hand argument is the thing which parameterizes the virtual dispatch. So again, we get this weird asymmetry between the right and left sides. Really what you want for most binary operators is double dispatch — you want the operator to be virtualized on the types of both arguments, not just the left-hand one. But neither C# nor C++ supports double dispatch natively. (Many real-world problems would be solved if we had double dispatch; for one thing, the visitor pattern becomes trivial. My colleague Wes is fond of pointing out that most design patterns are in fact necessary only insofar as the language has failed to provide a needed feature natively.)
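
To sketch what double dispatch built out of static parts might look like (all of these names are mine, purely for illustration):

public abstract class Num
{
    public static Num operator+(Num left, Num right) { return left.AddTo(right); }
    // First dispatch: virtual on the left operand.
    protected abstract Num AddTo(Num right);
    // Second dispatch: virtual on the right operand, once the left is known.
    protected internal abstract Num AddSmall(Small left);
    protected internal abstract Num AddBig(Big left);
}
public class Small : Num
{
    protected override Num AddTo(Num right) { return right.AddSmall(this); }
    protected internal override Num AddSmall(Small left) { return new Small(); }
    protected internal override Num AddBig(Big left) { return new Big(); }
}
public class Big : Num
{
    protected override Num AddTo(Num right) { return right.AddBig(this); }
    protected internal override Num AddSmall(Small left) { return new Big(); }
    protected internal override Num AddBig(Big left) { return new Big(); }
}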

And finally, in C++ you can only define an overloaded operator on a non-pointer type. This means that when you see c+2 and rewrite it as c.operator+(2), you are guaranteed that c is not a null pointer because it is not a pointer! C# also makes a distinction between values and references, but it would be very strange if instance operator overloads were only definable on non-nullable value types, and it would also be strange if c+2 could throw a null reference exception.

These and other difficulties, along with the ease of building your own single (or double!) virtual dispatch mechanisms out of static mechanisms, make it easy to decide not to add instance or virtual operator overloading to C#.

Chained user-defined explicit conversions in C#

Reader Niall asked me why the following code compiles but produces an exception at runtime:

class Base {}
class Derived : Base {}
class Castable 
{
    public static explicit operator Base(Castable c) 
    {
      return new Base(); 
    }
}
// ...
Derived d = (Derived)(new Castable());

It should be clear why this produces an exception at runtime; the user-defined operator returns a Base and that is not assignable to a variable of type Derived. But why does the compiler allow it in the first place?

First off, let’s define the difference between an implicit and an explicit conversion. An implicit conversion is one which the compiler knows can always be done without incurring the risk of a runtime exception. When you have a method int Foo(int i){...} and call it with long l = Foo(myshort);, the compiler inserts implicit conversions from int to long on the return side and from short to int on the call side. There is no int which doesn’t fit into a long and there is no short which doesn’t fit into an int, so we know that the conversions will always succeed, so we just up and do them for you.

There are also conversions which we know at compile time will never succeed. If there is no user-defined conversion from Giraffe to int, then Foo(new Giraffe()) is always going to fail at runtime, so this fails at compile time.

An explicit conversion is a conversion which might succeed sometimes but might also fail. We cannot disallow it, because it might succeed, but we can’t go silently inserting one either, since it might fail unexpectedly. We need to force the developer to acknowledge that risk explicitly. If you called ulong ul = Foo(mynullableint); then that might fail, so the compiler requires you to spell out that the conversions are explicit. The assignment could be written ulong ul = (ulong)Foo((int)mynullableint);.

There are two times that the compiler will insert an explicit cast for you without producing a warning or error. The first is the case above. When a user-defined explicit cast requires an explicit conversion on either the call side or the return side, the compiler will insert the explicit conversions as needed. The compiler figures that if the developer put the explicit cast in the code in the first place then the developer knew what they were doing and took the risk that any of the conversions might fail. That’s what the cast means: this conversion might fail, I will deal with it.

I understand that this puts a burden upon the developer to fully understand the implications of a cast, but the alternative is to make you spell it out even further, and it just gets to be too much. The logical extreme of this would be a case such as

public struct S
{
    public static explicit operator decimal?(S s) { return 1.0m; }
}
//...
S? s = new S();
int i = (int) s;

Here we do first an explicit conversion from S? to S, then a user-defined explicit conversion from S to decimal?, then an explicit conversion from decimal? to decimal, and then an explicit conversion from decimal to int. That’s four explicit conversions for the price of one cast, which I think is pretty good value for your money.
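
Spelled out step by step, that one cast behaves roughly like this (the temporaries are mine, for illustration):

S tmp1 = (S)s;                  // explicit: S? to S
decimal? tmp2 = (decimal?)tmp1; // user-defined explicit: S to decimal?
decimal tmp3 = (decimal)tmp2;   // explicit: decimal? to decimal
int i = (int)tmp3;              // explicit: decimal to int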

I want to note at this point that this is as long as the chain gets. A user-defined conversion can have built-in conversions inserted automatically on the call and return sides, but we never automatically insert other user-defined conversions. We never say that there’s a user-defined conversion from Alpha to Bravo, and a user-defined conversion from Bravo to Charlie, and therefore casting an Alpha to a Charlie is legal. That doesn’t fly.

And a built-in conversion can be lifted to nullable, which may introduce additional conversions as in the case above. But again, these are never user-defined.

The second is in the foreach loop. If you have foreach(Giraffe g in myAnimals) then we generate code which fetches each member of the collection and does an “explicit” conversion to Giraffe. If there happens to be a Snail or a WaterBuffalo in myAnimals, that’s a runtime exception. I considered adding a warning to the compiler for this case to say hey, be aware that your collection is of a more general type, this could fail at runtime. It turns out that there are so many programs which use this programming style, and so many of them have “compile with warnings as errors” turned on in their builds, that this would be a huge breaking change. So we opted to not do it.
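
A sketch of the situation, assuming Giraffe and Snail both derive from Animal:

System.Collections.Generic.List<Animal> myAnimals = new System.Collections.Generic.List<Animal>();
myAnimals.Add(new Giraffe());
myAnimals.Add(new Snail());
// Compiles without warning; the compiler inserts a conversion to Giraffe
// on every element fetched from the collection.
foreach (Giraffe g in myAnimals)
{
    // Throws InvalidCastException when the loop reaches the Snail.
}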

A face made for email, part three

It has happened again: someone has put me on video talking about programming languages.

This time our fabulous C# Community Program Manager Charlie Calvert was good enough to put together a little half-hour-long video of me talking about the scenarios which justify changes to the type inference algorithm for C# 3.0.

The video is here. Enjoy!

Odious ambiguous overloads, part two

There were a number of ideas in the comments for what we should do about the unfortunate situation I described yesterday. We considered all of these options and more.  I thought I might talk a bit about the pros and cons of each.
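
For reference, here is the situation from yesterday, reconstructed as a sketch from the warning text quoted at the end of this post (the original declarations may have differed):

interface I1<T>
{
    void M(int x);
    void M(T x);
}
class C1 : I1<int>
{
    // When T is int, the two interface methods have identical signatures;
    // which of the two slots this explicit implementation fills is
    // exactly the problem discussed below.
    void I1<int>.M(int x) { }
    public void M(int x) { }
}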

I suggested doing nothing. We shipped with this behaviour in C# 2.0, the world didn’t come to an end, we can keep on shipping with it. The pro side is that this is the cheapest and easiest thing to do. The con side is that, as a reader pointed out, this is a gross “gotcha” just waiting for someone to stumble upon it.

DrPizza suggested that we come up with a syntax that disambiguates between the two ambiguous cases.  On the pro side, the problem would be clearly solved.  On the con side, this might be a breaking change (because suddenly one of the methods that used to bind one way might start binding the other way, depending on what the choice of syntax was.)

Also on the con side, adding new language syntax is something that we do not do lightly.  There is considerable work involved in changing the specification, and this is kind of an edge case.

Ayende Rahien suggested that we add an attribute that hints to the compiler which is the right binding. This has the nice property of not introducing new language syntax, but is kind of ugly.

These are both good ideas but unfortunately, as it turns out, neither of these works, for technical reasons which I will go into in detail below.

Chris D suggested that the constructed interface should be treated as an interface with only one method. That’s a really nice idea, but unfortunately it doesn’t work with the CLR definition of generics.  An interface with n methods has n vtable slots that need filling, period, and n is invariant under construction. We’d have to change both the semantics of constructed interfaces and their implementation in the CLR, which means changing both the C# spec and the CLR spec, and that’s way too much work for this edge case.

Kyle Lahnakoski suggested that it simply be an error.  There are several ways this could be an error.

It could be an error to have an unconstructed generic type which could possibly be constructed to produce a duplicated signature.  This was how generics originally worked in C#, but we decided that we could allow these cases and we can’t really go back on that decision now – that would be a huge breaking change. There are many serializable generic objects which use a pattern of having a method that takes a type parameter to overload a method that takes a serialization object parameter, and all of those would suddenly be errors.

It could be an error to construct a generic type so as to actually produce a duplicate signature, but again, we decided to allow that in C# 2.0 and it would be a large breaking change to go back now.

It could be an error to have any explicit implementation of an interface method which matches ambiguously.  This would be a smaller breaking change, but still breaking.

What do any of these breaking changes buy the user though?  Not more functionality, but less.  Basically we’d be saying “you could do this before, and now you can’t, too bad.” That’s a difficult sale to make.

Carlos suggested unification – that is, have an explicit implementation map to every method that it matches, not just the first one.  This is a breaking change, but it seems like the most natural thing to do.  Since an implicit implementation maps to every interface slot that it matches, why shouldn’t an explicit implementation do the same? This means adding new language to the spec, but no new syntax.

This is what we decided to do, and I tried to implement it, only to discover that, uh, actually that stuff I told you yesterday about the explicit implementation matching the first one in source code order is a bit of a mistruth. That’s what I thought when I first looked at the problem, but upon closer examination, something deeper was wrong.

In fact what is going on here is that when the CLR loads the class for the first time, it does a verification check to ensure that the class does in fact implement every method of every interface that it says it does. It is the CLR, not C#, which matches the explicit implementation to the first matching slot in the constructed interface type! Since metadata for the interface is emitted in source-code order, the CLR appears to do the match in source code order. It actually does it in a CLR-implementation-defined order that just happens to be source code order.

To understand why we can’t fix this, you need to understand how explicit implementations are represented in metadata. In the CLR metadata for a class there is a table called the MethodImpl table which says “method blah of class foo explicitly implements method bar of interface IFred”. The “method bar of interface IFred” portion of that mapping may be either a token to a method of the unconstructed interface (a “MethodDef”), or it may be a token representing a method on the constructed interface (a “MemberRef”).

We can’t use a MethodDef because there is a signature mismatch between the class’s implementation and the method on the unconstructed interface.  One takes an int, the other takes a type parameter, and the CLR flags that as illegal. So that’s out.

But we can’t use the latter because the MemberRef identifies the constructed interface method by – you guessed it – its signature! And now we’re in the same hideous bind all over again. We have no way to tell the CLR “no, really, use the generic-substituted version of the method”, because it can’t tell the two signatures apart either.

To fix this problem by either unifying, or adding a syntax/attribute to disambiguate, we need some way to represent that unification/disambiguation in metadata. But the CLR simply lacks the concept of “method token for constructed method identified by something more unique than signature”.

Though we can certainly suggest that this is a weakness in the CLR metadata design, waiting around for that to be addressed to fix this edge-case problem seems like a non-starter. Given that we can’t fix it in the current CLR the way we’d like, that making it an error seems like an egregious overreaction which buys the user nothing, and that doing nothing leaves a gotcha in the code waiting to happen, I took the only course of action left:

Warning: Explicit interface implementation 'I1<int>.M(int)' matches 
more than one interface member. Which interface member is actually 
chosen is implementation-dependent. Consider using a non-explicit 
implementation instead.

Unfortunately, sometimes that’s the best you can do.


Update from 2020: C# language designer Neal Gafter informs me that the bug in the CLR metadata processing which led to this situation was fixed in .NET 5.0, which is at the time of this writing in preview release. See his comment below for details.

Microsoft understandably gets some flak when it takes fifteen years for a bug to go from reported to fixed, but as always, you’ve got to consider the opportunity costs. There is a finite amount of effort available to fix bugs, and fixing bugs often introduces new problems in the form of subtle compatibility issues, so you’ve got to ruthlessly prioritize. This is a rare scenario for realistic code, and when it does arise in realistic code it indicates that there’s an opportunity to refactor that code into something less ambiguous.