Suppose we have my usual hierarchy of types, `Animal`

, `Giraffe`

, etc, with the obvious type relationships. An `IEqualityComparer<T>`

is contravariant in its type parameter; if we have a device which can compare two `Animal`

s for equality then it can compare two `Giraffe`

s for equality as well. So why does this code fail to compile?

IEqualityComparer<Animal> animalComparer = whatever; IEnumerable<Giraffe> giraffes = whatever; IEnumerable<Giraffe> distinct = giraffes.Distinct(animalComparer);

This illustrates a subtle and slightly unfortunate design choice in the method type inference algorithm, which of course was designed long before covariance and contravariance were added to the language.

Let’s take a look at the signature of the `Distinct `

extension method that was used above:

public static IEnumerable<T> Distinct<T>( this IEnumerable<T> source, IEqualityComparer<T> comparer)

Type inference computes a set of bounds on what `T`

can be based on an analysis of each argument and its corresponding formal parameter type. For the first argument, since `IEnumerable<T>`

is a covariant interface, the type inference algorithm computes that `T`

must be `Giraffe `

or any larger type.

(Aside: Recall that when I say that one type is larger than another, I mean that there is an implicit conversion from the smaller type to the larger type. `Animal `

is a larger type than `Giraffe `

because all `Giraffe`

s are `Animal`

s, but some `Animal`

s are not `Giraffe`

s. There are more `Animal`

s than `Giraffe`

s, so `Animal`

is the larger type.)

The specification expresses this notion by saying that `Giraffe `

is a “lower bound” — whatever is finally chosen has to be a type that is *not smaller* than `Giraffe`

.

For the second argument, since `IEqualityComparer<T> `

is a contravariant interface, the type inference algorithm concludes that `T`

must be `Animal `

or any smaller type. This is an “upper bound” — `T `

has to be a type that is *not larger *than `Animal`

.

Given these two bounds, what type should actually be inferred for `T`

? One of the design rules of C# that I have mentioned many times on this blog is that when given a set of types to choose from, C# always chooses one of those types rather than making up a type that is not in the set. So we know that we are going to choose either `Animal `

or `Giraffe`

and not some third type which satisfies the constraints, like `Mammal`

. But which? Both types satisfy the bounds! Both `Animal `

and `Giraffe `

are not *smaller* than `Giraffe`

, and both are* not larger* than `Animal`

.

When faced with this situation the type inference algorithm chooses the larger of the types, so `T`

is inferred to be `Animal`

. Thus we have:

public static IEnumerable<Animal> Distinct<Animal>( this IEnumerable<Animal> source, IEqualityComparer<Animal> comparer)

and since the method returns `IEnumerable<Animal>`

, it cannot be assigned to a variable of type `IEnumerable<Giraffe>`

.

You would be forgiven for expecting `Giraffe `

to be chosen because that’s what the spec says:

If among the remaining candidate types U

_{j}there is a unique type V from which there is an implicit conversion to all the other candidate types, then X_{i}is fixed to V.

That’s a long-standing typo that has never been fixed in the spec; the intended wording was “to which”, not “from which”.

It might seem strange to choose the larger type here; why not choose the more specific type if you can? We designed the type inference algorithm for C# 3, which did not have covariance or contravariance (except for array type covariance) and so it was a bit of a moot point; the spec said that in the unlikely event of a conflict, choose the larger type. When C# 4 came along and added contravariance I made the case that we ought to choose the more specific type, because now there were scenarios where it actually might work, but we ultimately decided to stick with the existing algorithm. We then promptly wrote that algorithm down wrong in the spec. Whoops.

In any event, this is a small problem. If it hurts when you do that then simply change the type of the comparer:

IEqualityComparer<Animal> animalComparer = whatever; // Legal thanks to contravariance: IEqualityComparer<Giraffe> giraffeComparer = animalComparer; IEnumerable<Giraffe> giraffes = whatever; IEnumerable<Giraffe> distinct = giraffes.Distinct(giraffeComparer);

And now there is no conflict; the bounds inferred are an upper bound of `Giraffe `

and a lower bound of `Giraffe`

, so there’s only one to choose from.

While it would obviously be too late now to implement such a thing, what would have been the consequences of using two type parameters in `Distinct`, or of having the generic EqualityComparer inherit a non-generic one, and having an overload of Distinct that uses that? While most methods should be covariant with respect to their input parameters, equality comparison is logically both covariant and contravariant: a Cat may be equal to a SiameseCat or to an Animal; a Cat could never be equal to a Dog, but the comparison is be well-defined (reporting “not equal”).

@John, why would it be too late? You can easily create an overload of Distinct with 2 type parameters:

With this extension method in scope, the initial code from the article compiles and runs fine.

Looks like WordPress ate my angled brackets… here’s the code on Gist:

Distinct2.cs

hosted with ❤ by GitHub

Indeed, that is a good way to do it. One small problem is that it does not work if you have a comparer of objects and a sequence of ints, because contravariance does not work in that case. It doesn’t work in the one-type-argument case either, but in that situation the bug is caught at compile time. In the two-type-argument case the bug is not caught until runtime.

Good point; adding a

`class`

constraint on T should solve the problemRight; so then you’d use the one-type-parameter version for structs. Unfortunately I suspect that doesn’t work well either. C# doesn’t take constraint satisfaction into account until overload resolution has found the best match, so doing this could lead to an ambiguity between the one-type-parameter and two-type-parameter versions.

You’re right, it prevents the use of the one-type-parameter overload. Of course I can always call Enumerable.Distinct as a static method, but it’s less convenient…

What would you think of using a couple of dummy interfaces IRequireClass(Of T) where T:class and IRequireStruct(Of T) where T:struct and adding to the two-types class-constrained “Distinct” method a dummy parameter of type IRequireClass(Of T) with a default null value, and having a generic “Distinct” method which is constrained to structure types and has a dummy parameter IRequireStruct(Of T) which also defaults to null? It seems icky to require the caller to waste time passing a parameter which isn’t used for any purpose, but it also seems to work.

My original thought was focused on the fact that equality comparison is both contravariant and covariant [one can check whether any two references target equal objects, regardless of their types, but both the covariant and contravariant cases are “special”]; adding `Object` overloads to `IEqualityComparer.Equals()` and `IEqualityComparer.GetHashCode()` would obviously be a breaking change at this point, but might have been an interesting thing to consider for the aforementioned reason. As for using a two-parameter `Distinct` method, that has the disadvantage of requiring a `class` constraint, which makes use with structures awkward.

Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1409

Eric, I think I’m missing something near the start – how do we know

`IEnumerable<T>`

is Covariant but`IEqualityComparer<T>`

is Contra-variant? Or are you just stating that as fact?Their signatures are defined as such (notice the “in” and “out” generic modifiers):

`public interface IEqualityComparer<in T>`

http://msdn.microsoft.com/en-us/library/ms132151.aspx

`public interface IEnumerable<out T> : IEnumerable`

http://msdn.microsoft.com/en-us/library/9eekhta0.aspx

Ah – Thanks Chris. Implicit knowledge that I was just missing 🙂

If you like rock/metal music, take a break from Lake Huron on the 10th and check out Dirt Fest. Big annual rock/metal concert in Michigan.

“If it hurts when you do that then simply change the type of the comparer:”

There’s an easier way, just specify the type argument to Distinct explicitly:

IEqualityComparer animalComparer = whatever;

IEnumerable giraffes = whatever;

IEnumerable distinct = giraffes.Distinct(animalComparer);

Bah, of course I meant

IEnumerable distinct = giraffes.Distinct<Giraffe>(animalComparer);

Your blog software should stop eating things that look like HTML 😉

Bah, of course I ACTUALLY meant

IEqualityComparer<Animal> animalComparer = whatever;

IEnumerable<Giraffe> giraffes = whatever;

IEnumerable<Giraffe> distinct = giraffes.Distinct<Giraffe>(animalComparer);

Your blog software REALLY should stop eating things that look like HTML 😉