Nullable micro-optimizations, part one

Which is faster, Nullable<T>.Value or Nullable<T>.GetValueOrDefault()?

Before I answer that question, my standard response to "which horse is faster?" questions applies. Read that first.

.
.
.

Welcome back. But again, before I answer the question I need to point out that the potential performance difference between these two mechanisms for obtaining the non-nullable value of a nullable value type is a consequence of the fact that these two mechanisms are not semantically equivalent. The former may legally only be called if you are sure that the nullable value is non-null;1 the latter may be called on any nullable value. A glance at a simplified version of the source code illustrates the difference.

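struct Nullable<T> where T : struct
{
  private bool hasValue; // false when the nullable is "null"
  private T value;       // default(T) when hasValue is false
  public Nullable(T value)
  {
    this.hasValue = true;
    this.value = value;
  }
  public bool HasValue { get { return this.hasValue; } }
  public T Value
  {
    get
    {
      // Simplified; the real implementation throws an InvalidOperationException.
      if (!this.HasValue) throw new System.InvalidOperationException();
      return this.value;
    }
  }
  public T GetValueOrDefault()
  {
    return this.value; // no flag check needed
  }
  // ... and then all the other conversion gear and so on ...
}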

The first thing to notice is that a nullable value type's ability to represent a "null" integer or decimal or whatever is not magical.2 A nullable value type is nothing more than an instance of the value type plus a bool saying whether it's null or not.

If a variable of nullable value type is initialized with the default constructor then the hasValue field will have its default value, false, and the value field will be default(T). If it is initialized with the declared constructor then of course the hasValue field is true and the value field is any legal value, including possibly T's default value. Thus, the implementation of GetValueOrDefault() need not check the flag; if the flag is true then the value field is set correctly, and if it is false, then it is set to the default value of T.
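To make the two initialization paths concrete, here is a minimal sketch against the real System.Nullable<T>. The class name NullableDemo and its members ValueThrows and Run are made up for illustration; only HasValue, Value and GetValueOrDefault() come from the framework.

```csharp
using System;

static class NullableDemo
{
    // Hypothetical helper: reports whether reading .Value throws.
    public static bool ValueThrows(int? n)
    {
        try
        {
            int unused = n.Value; // throws when n has no value
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Run()
    {
        int? empty = new int?();  // default constructor: flag is false
        int? zero = new int?(0);  // declared constructor: flag is true

        Console.WriteLine(empty.HasValue);            // False
        Console.WriteLine(empty.GetValueOrDefault()); // 0
        Console.WriteLine(zero.HasValue);             // True
        Console.WriteLine(zero.GetValueOrDefault());  // 0
        Console.WriteLine(ValueThrows(empty));        // True
        Console.WriteLine(ValueThrows(zero));         // False
    }
}
```

Note that GetValueOrDefault() gives the same answer, 0, for both variables; only HasValue and Value tell them apart.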

Looking at the code it should be clear that Value is almost certainly not faster than GetValueOrDefault() because obviously the former does exactly the same work as the latter in the success case, plus the additional work of the flag check. Moreover, because GetValueOrDefault() is so brain-dead simple, the jitter is highly likely to perform an inlining optimization.3 How the jitter chooses to inline or not is an implementation detail, but it is reasonable to assume that it is less likely to perform an inlining optimization on code that contains more than one "basic block"4 and explicitly throws.

It should also be clear that though the relative performance difference might be large, the absolute difference is small. In the typical case a call, a field fetch, a conditional jump and a return make up the difference, and each of those costs only nanoseconds.

Now, this is of course not to say that you should willy-nilly change all your calls to Value to GetValueOrDefault() for performance reasons. Read my rant again if you have the urge to do that! Don't go changing working, debugged, tested code in order to obtain a performance benefit that is (1) highly unlikely to be a real bottleneck, and (2) highly unlikely to be your worst performance problem.

And besides, using Value has the nice property that if you have made a mistake and fetched the value of a null, you'll get an exception that informs you of where your bug is! Code that draws attention to its faults is a good thing.5


Next time on FAIC: How does the C# compiler use its knowledge of the facts discussed today to your advantage? Have a great Christmas everyone; we'll pick up this subject again in a week.

  1. Put another way, calling Value without knowing that HasValue is true is a boneheaded exception.
  2. Nullable value types are magical in other ways; for example, there's no way to write your own struct that has the strange boxing behaviour of a nullable value type; an int? boxes to either an int or null, never to a boxed int?.
  3. An inlining optimization is where the jitter eliminates an unnecessary "call" and "return" instruction by simply generating the code of the method body "inline" in the caller. This is a great optimization because doing so can make code both smaller and faster in some cases, though it does make it harder to debug because the debugger has no good way to generate breakpoints inside the inlined method.
  4. A "basic block" is a region of code where you know that the code will execute from the top of the block to the bottom without any "normal" branches in or out of the middle of the block. (A basic block may of course have exceptions thrown out of it.) Many optimizing compilers use "basic blocks" as an abstraction because it abstracts away the unnecessary details of what the block actually does, and treats it solely as a node in a flow control graph.
  5. Note that here we have one of those rare cases where the frameworks design guidelines have been deliberately bent. We have a "Get" method that is actually faster than a property getter, and the property getter throws! Normally you expect the opposite: the "Get" method is usually the one that is slow and can throw, and the property is the one that is fast and never throws. Though this is somewhat unfortunate, remember, the design guidelines are our servants, not our masters, and they are guidelines, not rules.
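
The boxing behaviour described in footnote 2 is easy to check for yourself. The following is a small sketch; BoxingDemo and its two methods are hypothetical names chosen for this example.

```csharp
using System;

static class BoxingDemo
{
    // Boxing a nullable with no value produces a null reference.
    public static bool BoxesToNull(int? n)
    {
        object boxed = n;
        return boxed == null;
    }

    // Boxing a nullable that has a value produces a boxed int, not a boxed int?.
    public static Type BoxedType(int? n)
    {
        object boxed = n;
        return boxed == null ? null : boxed.GetType();
    }
}
```

Here BoxingDemo.BoxesToNull(null) is true, and BoxingDemo.BoxedType(5) is typeof(int).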

Inheritance and representation

(Note: Not to be confused with Representation and Identity.)

Here's a question I got this morning:

class Alpha<X>
  where X : class
{}
class Bravo<T, U>
  where T : class
  where U : T
{
  Alpha<U> alpha;
}

This gives a compilation error stating that U cannot be used as a type argument for Alpha's type parameter X because U is not known to be a reference type. But surely U is known to be a reference type because U is constrained to be T, and T is constrained to be a reference type. Is the compiler wrong?

Of course not. Bravo<object, int> is perfectly legal and gives a type argument for U which is not a reference type. All the constraint on U says is that U must inherit from T.1 int inherits from object, so it meets the constraint. All struct types inherit from at least two reference types, and some of them inherit from many more.2

The right thing for the developer to do here is of course to add the reference type constraint to U as well.
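
As a sketch of both halves of that answer: the class Loose below is a hypothetical name that carries only the original constraints, to show that a value type really is accepted for U, while Bravo carries the added reference type constraint. The UIsValueType property and the public initialized field are additions made so that the example can be exercised.

```csharp
using System;

class Alpha<X> where X : class { }

// Only the original constraints: U must convert to T, nothing more.
class Loose<T, U>
    where T : class
    where U : T
{
    public static bool UIsValueType
    {
        get { return typeof(U).IsValueType; }
    }
}

// With "class" added to U's constraints, Alpha<U> is legal.
class Bravo<T, U>
    where T : class
    where U : class, T
{
    public Alpha<U> alpha = new Alpha<U>();
}
```

Loose<object, int> compiles, and its UIsValueType is true. Bravo<object, string> compiles, whereas Bravo<object, int> is now rejected at compile time.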

That easily-solved problem got me thinking a bit more deeply about the issue. I think a lot of people don't have a really solid understanding of what "inheritance" means in C#. It is really quite simple: a derived type which inherits from a base type implicitly has all inheritable members of the base type. That's it! If a base type has a member M then a type that inherits from it has a member M as well.3

People sometimes ask me if private members are inherited; surely not! What would that even mean? But yes, private members are inherited, though most of the time it makes no difference because the private member cannot be accessed outside of its accessibility domain. However, if the derived class is inside the accessibility domain then it becomes clear that yes, private members are inherited:

class B
{
  private int x;
  private class D : B
  {
    // Code in here can refer to x.
  }
}

D inherits x from B, and since D is inside the accessibility domain of x, it can use x no problem.
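
A runnable variation on the same idea, as a sketch: the initial value 42 and the members ReadInheritedX and Demo are invented for this example so that there is something to observe.

```csharp
class B
{
    private int x = 42;

    private class D : B
    {
        // "this" is a D; the x it reads is the private field inherited from B.
        public int ReadInheritedX()
        {
            return this.x;
        }
    }

    // D is private, so the demonstration has to start inside B.
    public static int Demo()
    {
        return new D().ReadInheritedX();
    }
}
```

B.Demo() returns 42. Move D outside of B and the same access to x stops compiling, even though D still inherits the field.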

I am occasionally asked "but how can a value type, like int, which is 32 bits of memory, no more, no less, possibly inherit from object? An object laid out in memory is way bigger than 32 bits; it's got a sync block and a virtual function table and all kinds of stuff in there." Apparently lots of people think that inheritance has something to do with how a value is laid out in memory. But how a value is laid out in memory is an implementation detail, not a contractual obligation of the inheritance relationship!

When we say that int inherits from object, what we mean is that if object has a member -- say, ToString -- then int has that member as well. When you call ToString on something of compile-time type object, the compiler generates code which goes and looks up that method in the object's virtual function table at runtime. When you call ToString on something of compile-time type int, the compiler knows that int is a sealed value type that overrides ToString, and generates code which calls that function directly. And when you box an int, then at runtime we do lay out an int the same way that any reference-typed object is laid out in memory.

But there is no requirement that int and object be always laid out the same in memory just because one inherits from the other; all that is required is that there be some way for the compiler to generate code that honours the inheritance relationship.
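
Here is a small sketch of what honouring the relationship looks like from the outside. InheritanceDemo and its method names are hypothetical; the reflection members used (BaseType, IsAssignableFrom) are standard System.Type members.

```csharp
using System;

static class InheritanceDemo
{
    // Compile-time type int: the call can be bound directly to int's override.
    public static string ViaInt(int i)
    {
        return i.ToString();
    }

    // Compile-time type object: the call is dispatched virtually at runtime.
    public static string ViaObject(object o)
    {
        return o.ToString();
    }

    // int inherits from object by way of System.ValueType.
    public static bool IntInheritsFromObject()
    {
        return typeof(object).IsAssignableFrom(typeof(int))
            && typeof(int).BaseType == typeof(ValueType)
            && typeof(ValueType).BaseType == typeof(object);
    }

    // Enum types have one more reference type in the chain: System.Enum.
    public static bool EnumInheritsFromSystemEnum()
    {
        return typeof(DayOfWeek).BaseType == typeof(Enum);
    }
}
```

Passing 42 to ViaObject boxes it first; both methods then produce the same string, and both reflection checks return true.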

  1. More specifically, it must inherit from T or be identical to T, or inherit from a type related to T by some variant conversion. Consult the specification for details.
  2. Enum types inherit from System.Enum, many struct types implement interface types, and so on.
  3. Of course that's not quite it; there are some odd corner cases. For example, a class which "inherits" from an interface must have an implementation of every member of that interface, but it could do an explicit interface implementation rather than exposing the interface's members as its own members. This is yet another reason why I'm not thrilled that we chose the word "inherits" over "implements" to describe interface implementations. Also, certain members like destructors and constructors are not inheritable.
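
The explicit interface implementation corner case from footnote 3 can be sketched like this. IGreeter, Quiet and ExplicitDemo are hypothetical names invented for the example.

```csharp
using System;

interface IGreeter
{
    string Greet();
}

class Quiet : IGreeter
{
    // Explicit implementation: reachable through IGreeter only.
    string IGreeter.Greet()
    {
        return "hello";
    }
}

static class ExplicitDemo
{
    public static string CallThroughInterface()
    {
        IGreeter g = new Quiet();
        return g.Greet();
    }

    // Reflection finds no public member named Greet on Quiet itself.
    public static bool QuietHasPublicGreet()
    {
        return typeof(Quiet).GetMethod("Greet") != null;
    }
}
```

CallThroughInterface() returns "hello" while QuietHasPublicGreet() returns false; writing new Quiet().Greet() does not compile.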