Nullable micro-optimizations, part eight

Today, the answer to the puzzle I posed last week. The answer is no, that optimization is not legal in general, though it is legal in a few specific cases.

As I have discussed several times before, C# has very strict rules about what order operations happen in. A quick rule of thumb is: operands run left-to-right. In our example we had X() * Y() + Z(), so let’s see what conclusions we reach by applying this rule:

  • X() is an operand of the * to the left of Y(), so the call to X() must happen before the call to Y().
  • X()*Y() is an operand of the + to the left of Z(), so the multiplication must happen before the call to Z().

The compiler must generate code that computes X(), then Y(), then the multiplication, then Z(), and then the addition. But our proposed optimization was

int? r;
int? tempX = X();
int? tempY = Y();
int tempZ = Z();
r = tempX.HasValue & tempY.HasValue ?
  new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + tempZ) :
  new int?();

which computes Z() before the multiplication.

So what’s the big deal? Multiplication of integers can throw an exception in a checked context. That exception should prevent Z() from being called in the first place should X() * Y() throw on the multiplication. This optimization is only valid in an unchecked context.

And of course it just gets worse if we start to consider lifted arithmetic on types other than int?. It’s a bad practice to write a user-defined operator with a side effect, but it is legal and the C# compiler must ensure that side effects of user-defined operators are observed to happen in the prescribed order.

Rather than try to figure out all the types for which this optimization is legal in various contexts, Roslyn makes no attempt to optimize this kind of binary operator. It generates a temporary int? for the multiplication, and then does the regular lifted arithmetic for the addition. Another lovely optimization spoiled by conflict with an lovely invariant.

But wait!

The problem is that the side effects of the multiplication must be observed to come before the side effects of the right addend. What if the right addend has no side effects? Then we can do this optimization!

The most common situation in which the right addend has no side effects is when it is a constant:

int? r = X() * Y() + 1;

This can legally be optimized to

int? r;
int? tempX = X();
int? tempY = Y();
r = tempX.HasValue & tempY.HasValue ?
  new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + 1) :
  new int?();

And in fact I did add this optimization to Roslyn; just as unary operations and conversions can be “distributed”, so can binary operations where the right operand is a constant. Moreover, as a pleasant side effect, doing so allowed for an easy implementation of various optimizations that improve the quality of lifted x += 1 and x++ expressions.

Well, that took rather more episodes than I thought it would when I started! I could talk a bit more about how Roslyn optimizes other more exotic nullable arithmetic, like equality, inequality, and logical operations. I could also talk a bit about the stuff I didn’t get to implement before I left; Roslyn does a slightly worse job than the original recipe compiler when optimizing expressions where we know that the result is going to be null, but also must compute a side effect. But rather than go down those lengthy rabbit holes, I think I’ll call this series done at eight episodes.


Next time on FAIC: Some of these optimizations we’ve been talking about are elisions; I’ll talk a bit about how computer programmers have borrowed this term from linguistics, and how we use it in two different senses.

17 thoughts on “Nullable micro-optimizations, part eight

  1. This would work too, wouldn’t it?

    int? r;
    int? tempX = X();
    int? tempY = Y();
    r = tempX.HasValue & tempY.HasValue ?
    new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + Z()) :
    new int?();

    Now, if you insisit on “everything in temps for optimized form”, that should become:
    int? r;
    int? tempX = X();
    int? tempY = Y();
    if (tempX.HasValue & tempY.HasValue)
    {
    int tempXY = tempX.GetValueOrDefault() * tempY.GetValueOrDefault();
    int tempZ = Z();

    r = new int?(tempXY + tempZ);
    }
    else
    r = new int?();

    Either of those should yield the same optimization as the faulty version, but without the problem.

    • No, none of the operators in the expression short-circuit in the non-exception case, so you would need:

      int? r;
      int? tempX = X();
      int? tempY = Y();
      if(tempX.HasValue & tempY.HasValue) {
      r = new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + Z())
      } else {
      Z();
      r = new int?();
      }

  2. Does the C# compiler optimize “.Value” to “.GetValueOrDefault” if the developer explicitly wrote “.Value”? I like to write “.Value” to assert the non-nullness of my value. But if csc can prove that it cannot be null it could optimize this out.

    int? x = f();
    if(x != null) { cw(x.Value); }

    Should become:

    int? x = f();
    if(x != null) { cw(x.GetValueOrDefault()); }

    Great series, btw.

  3. Very interesting, as always. I’d like to hear more about the other nullable optimizations as well, if you ever extend this series in the future.

    It’s a shame that you didn’t get to finish your work on the Roslyn project before you left.

    A general blogging question:
    You mentioned a while back that you were going to give your opinion on TypeScript at some point. They just release 0.8.2. As a former language designer yourself, I’m interested to hear your take on it (criticisms, feature suggestions, etc.)

    Also, Anders has mentioned that TypeScript does more type inference than the C# compiler; I’d be interested in hearing about how and what improvements could be made in the C# compiler (and if Roslyn will have any, to your knowledge).

  4. Pingback: Hummingbird PSA | Fabulous adventures in coding

Leave a comment