Curiouser and curiouser

Here’s a pattern you see all the time in C#:

class Frob : IComparable<Frob>

At first glance you might ask yourself why this is not a “circular” definition; after all, you’re not allowed to say class Frob : Frob(*). However, upon deeper reflection that makes perfect sense; a Frob is something that can be compared to another Frob. There’s not actually a real circularity there.

This pattern can be genericized further:

class SortedList<T> where T : IComparable<T>

Again, it might seem a bit circular to say that T is constrained to something that is in terms of T, but actually this is just the same as before. T is constrained to be something that can be compared to T. Frob is a legal type argument for a SortedList because one Frob can be compared to another Frob.

But this really hurts my brain:

class Blah<T> where T : Blah<T>

That appears to be circular in (at least) two ways. Is this really legal?

Yes it is legal, and it does have some legitimate uses. I see this pattern rather a lot(**). However, I personally don’t like it and I discourage its use.

This is a C# variation on what’s called the Curiously Recurring Template Pattern in C++, and I will leave it to my betters to explain its uses in that language. Essentially the pattern in C# is an attempt to enforce the usage of the CRTP.

So, why would you want to do that, and why do I object to it?

One reason why people want to do this is to enforce a particular constraint in a type hierarchy. Suppose we have

abstract class Animal
{
  public virtual void MakeFriends(Animal animal);
}

But that means that a Cat can make friends with a Dog, and that would be a crisis of Biblical proportions! (***) What we want to say is

abstract class Animal
{
  public virtual void MakeFriends(THISTYPE animal);
}

so that when Cat overrides MakeFriends, it can only override it with Cat.

Now, that immediately presents a problem in that we’ve just violated the Liskov Substitution Principle. We can no longer call a method on a variable of the abstract base type and have any confidence that type safety is maintained. Variance on formal parameter types has to be contravariance, not covariance, for it to be typesafe. And moreover, we simply don’t have that feature in the CLR type system.

But you can get close with the curious pattern:

abstract class Animal<T> where T : Animal<T>
{
  public virtual void MakeFriends(T animal);
}
class Cat : Animal<Cat>
{
  public override void MakeFriends(Cat cat) {}
}

and hey, we haven’t violated the LSP and we have guaranteed that a Cat can only make friends with a Cat. Beautiful.

Wait a minute… did we really guarantee that?

class EvilDog : Animal<Cat>
{
  public override void MakeFriends(Cat cat) { }
}

We have not guaranteed that a Cat can only make friends with a Cat; an EvilDog can make friends with a Cat too. The constraint only enforces that the type argument to Animal be good; how you use the resulting valid type is entirely up to you. You can use it for a base type of something else if you wish.

So that’s one good reason to avoid this pattern: because it doesn’t actually enforce the constraint you think it does. Everyone has to play along and agree that they’ll use the curiously recurring pattern the way it was intended to be used, rather than the evil dog way that it can be used.

The second reason to avoid this is simply because it bakes the noodle of anyone who reads the code. When I see List<Giraffe> I have a very clear idea of what the relationship is between the List<> part — it means that there are going to be operations that add and remove things — and the Giraffe part — those operations are going to be on giraffes. When I see FuturesContract<T> where T : LegalPolicy I understand that this type is intended to model a legal contract about a transaction in the future which has some particular controlling legal policy. But when I read Blah<T> where T : Blah I have no intuitive idea of what the intended relationship is between Blah<T> and any particular TIt seems like an abuse of a mechanism rather than the modeling of a concept from the program’s “business domain”.

All that said, in practice there are times when using this pattern really does pragmatically solve problems in ways that are hard to model otherwise in C#; it allows you to do a bit of an end-run around the fact that we don’t have covariant return types on virtual methods, and other shortcomings of the type system. That it does so in a manner that does not, strictly speaking, enforce every constraint you might like is unfortunate, but in realistic code, usually not a problem that prevents shipping the product.

My advice is to think very hard before you implement this sort of curious pattern in C#; do the benefits to the customer really outweigh the costs associated with the mental burden you’re placing on the code maintainers?


(*) Due to an unintentional omission, some past editions of the C# specification actually did not say that this was illegal! However, the compiler has always enforced it. In fact, the compiler has over-enforced it, sometimes accidentally catching non-cycles and marking them as cycles.

(**) Most frequently in emails asking “is this really legal?”

(***) Mass hysteria!

12 thoughts on “Curiouser and curiouser

  1. Nice post, but after reading I don’t understand if there is a “better” solution to restrict a Cat to only MakeFriend with other Cats or not?

    Im currently writing a tree node class with some derived node types, that’s why I made a abstract generic Node “base” class for the other node types, the generic Node class has some T properties like Parent and Children. The generic constrain are like in your example Node<T> where T : Node<T> and dervied Nodes like MyNode : Node<MyNode>.

    One “Problem” is that the Node can’t set itself as Parent of other nodes, because the parent is T, so I have to cast this to T inside the Node class, which works but feels “weird”.

    I did some google searches and didn’t find a “better” way to implement a generic tree node in C#. I found similar solutions on stackoverflow. Some of them use a generic node with a T Value property, but for my use case I want T itself to be a node that’s why I used the solution with the derived Nodes.

    Some “improvments” after reading this post would be to add a constructor in the base Node check if this is T and throw otherwise, to prevent the “EvilDog” case and maybe instead of just using T I could rename it to something like TDerivedNode to make it more obvious.

    Or is there a better solution that im missing?

    • Looks like some of the angle brackets were removed after I posted the comment. I hope that you can still understand my examples. The “Node” base class mentiond aboved is supposed to be generic like so Node T where T : Node T

    • I fixed the angle brackets; WordPress is aggressive about preventing injections.

      Your proposal is common, but I have to wonder a few things:

      If you are willing to put an exception in the constructor that throws when the node type is wrong, then why make Node generic at all? Basically what you are saying when you do a runtime type check that throws is “the type system is not strong enough to catch my problem at compile time”. That’s OK, the type system is not required to be able to express any possible constraint. But if you are in that position, you could just write every constructor or mutating method to check whether the node you gave it is of the correct type, and now you don’t need to have the additional complexity of generics *when that complexity does not buy you safety*. In other words, if you’re going to do a runtime check anyways, then why pay the price of a compile-time check that doesn’t work?

      Similarly, you are putting casts in your code — again, the point of a cast is the same as the point of an exception. It is saying “the type system is not strong enough to prove that this operation is safe at compile time but I know it is”. A cast is, again, a runtime type check that verifies a typing constraint that is not known to be met at compile time. So again, why bother going through the trouble of making a generic type if you already know that the generic type cannot provide a proof of the type safety property you desire at compile time?

      A generic node type has other consequences as well. Being a class type, it’s not covariant, so you cannot treat nodes polymorphically (unless you add a non-generic interface or base class — and then again, if you need to add a non-generic interface in order to get work done, why not just use it in the first place and avoid the generic code entirely?) The first thing I usually do when I define a tree data type is write a bunch of code that does stuff like “visit every node in this tree” or “display this tree as a string for debugging purposes” and so on. Having to then make all *that* code generic is taxes that do not actually pay for any benefit.

      I’m not saying that you’re wrong to take the path you’re taking, but I am saying that it has costs and benefits, and that I typically do not find the benefits to outweigh the costs. I like my code to be simple and typesafe; it is often easier to make code typesafe if it is simple.

      • Thank you for your long answer. I think was way too focused on writing the “perfect” generic solution (which isn’t even “perfect” because the type system doesn’t allow it).
        After reading your answer and thinking more about it. It makes sense to make a non generic node, since I have many derived node classes anyways and I want to store them all in a list and traverse them all and all of that with the generic types sounds like a nightmare, because like you said I would have to make a non generic base class / interface to do these things properly.

        The main reason why I wanted the type checks is that I have two categories of nodes “condition nodes” and “action nodes” and some specific (derived) nodes are only allowed to have “action nodes”. With the non generic version it would be possible to add “condition nodes” to these nodes. But I think it would be easier to just add checks to the add node command(s) to make sure that the selected node can be added to the current node and some additional checks to the save/load system to verify that the tree structure is valid and remove the generic nodes.

        I was in a bubble and thought these checks are “bad” and was too focused on a generic “fix” but I changed my mind, because even if a property is a string type it doesn’t mean that all strings are valid in all cases and additional validation logic is needed in some cases, so I think a validation check which child nodes can be added as child valid non generic solution too in this case 😉

        • You’re very welcome. If you have not read my series “Wizards and Warriors”, you might want to check that out. It is about the pattern of thought that many people naturally fall into, myself included, which is “I have a business domain rule, and if I capture it in the type system, it will never be violated”. Sometimes that works out great, but I give examples in that series about times when it certainly does not.

          In particular, the business rule I consider in that series is “a player can hold a weapon, a wizard is a kind of player, a wizard cannot hold a sword”, which is the same situation you’re in, just with “sword” replaced by “condition node”.

          • I didn’t read it yet, but it sounds interesting, I will do that now, thanks 🙂

            I just realized that the .NET framework itself has a similar situation with the System.Xml.Linq XML classes. The abstract XContainer base class has a virtual AddChild but the XDocument can’t contain XAttribute directly and will throw a exception if XDocument.AddChild is called with a XAttribute parameter. (That’s just one example, im sure there are other invalid argument cases)

          • I just saw a news post about C# 9 “Covariant returns”. I didn’t find many informations about it yet, but does that mean that C# 9 will make this pattern easier to implement without the need of generic?

          • I should write a blog post about this one.

            The short answer to your question is no; this is a different kind of covariance that Mads is talking about.

            There’s a feature that has been requested for C# for literally 20 years and it has always been turned down because (1) it’s just not very compelling, (2) it can lead to versioning problems (3) the CLR doesn’t support it directly and (4) there are easy ways to work around it. Apparently it only took 20 years of asking and it will finally be in C# 9.

            The feature is “virtual return type covariance”. The name is complicated but the idea is straightforward: a virtual method on a base class that returns a Mammal can be overridden by a method on a derived class that returns a Giraffe.

            This is type safe because anyone calling from the base class gets a Mammal, as required, because a Giraffe is a mammal, but anyone calling from the derived class gets a more specific type.

            This kind of covariance is supported by C++ but not C# before C# 9.

  2. Would it be possible and typesystem-sound in terms of C# to have `self` as a generic type constraint: `class Animal where T : self` and `class Cat: Animal`. The compiler would then prevent the EvilDog situation without any runtime checks needed.

Leave a comment