UPDATE: A commenter points out that today is the 200th anniversary of the birth of George Boole; I had no idea when I scheduled this article that it would be so apropos. Happy birthday George Boole!
Here’s a little-known and seldom-used fact about C# operators: you can apply the & and | operators to bools, not just to integers. The & and | operators on bools differ from && and || in only one way: both operators always “eagerly” evaluate both operands. This is in marked contrast to the “lazily” evaluated && and || operators, which only evaluate their right-hand operand if needed. Why on earth would you ever want to evaluate the right-hand side if you didn’t need to? Why have this operation at all on bools?
A few reasons come to mind. First, sometimes you want to do two operations, and know whether both of them succeeded:
bool totalSuccess = First() & Second();
If you want both operations to happen regardless of whether the first succeeded, then using && would be wrong. (And similarly, if you want to know whether either succeeded, you’d use | instead of ||.)
Though this code is correct, I don’t like it. I don’t like expressions that are useful for their side effects like this; I’d prefer to see one effect per statement:
bool firstSucceeded = First();
bool secondSucceeded = Second();
bool totalSuccess = firstSucceeded & secondSucceeded;
(Also, the original code seems harder to debug; I might want to know when debugging or testing which of the operations succeeded. And of course I am not a super big fan of the “success code” pattern to begin with, but that’s another story.)
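To see the eagerness concretely, here is a minimal sketch; First and Second are hypothetical stand-ins that simply count how many times they were called:

```csharp
using System;

class EagerVsLazyDemo
{
    static int calls = 0;

    // Hypothetical operations standing in for First() and Second().
    static bool First() { calls += 1; return false; }
    static bool Second() { calls += 1; return true; }

    static void Main()
    {
        calls = 0;
        bool eager = First() & Second();   // & evaluates both operands
        Console.WriteLine(calls);          // 2: Second() ran even though First() was false

        calls = 0;
        bool lazy = First() && Second();   // && short-circuits
        Console.WriteLine(calls);          // 1: Second() was skipped

        Console.WriteLine(eager == lazy);  // True: the results agree; only the side effects differ
    }
}
```

The results of the two expressions are identical; the only observable difference is whether Second() executed.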
But still, here we have the & operator instead of the && operator. What’s the compelling benefit of using & here instead of &&?
Think about it this way. Suppose you wish to write this code:
bool totalSuccess = firstSucceeded && secondSucceeded; ...
but you don’t get the && operator. In fact, all you get is:
- if statements of the form if(bool) where the body is a goto
- unconditional gotos
- assignment of literals to variables and variables to variables.
Well, that’s pretty straightforward:
bool totalSuccess;
if (firstSucceeded) goto CONSEQUENCE;
totalSuccess = false;
goto DONE;
CONSEQUENCE: totalSuccess = secondSucceeded;
DONE: ...
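To convince yourself that this lowering really does behave like &&, here is a small self-contained sketch (LoweredAnd is my name for the helper, not anything the compiler produces) that compares the goto version against the real operator over all four input combinations:

```csharp
using System;

class LoweringDemo
{
    // A sketch of lowering `a && b` using only conditional gotos,
    // unconditional gotos, and assignments.
    static bool LoweredAnd(bool firstSucceeded, bool secondSucceeded)
    {
        bool totalSuccess;
        if (firstSucceeded) goto CONSEQUENCE;
        totalSuccess = false;
        goto DONE;
    CONSEQUENCE:
        totalSuccess = secondSucceeded;
    DONE:
        return totalSuccess;
    }

    static void Main()
    {
        foreach (bool a in new[] { false, true })
            foreach (bool b in new[] { false, true })
                Console.WriteLine(LoweredAnd(a, b) == (a && b));
    }
}
```

This prints True four times: the lowered form and the operator agree on every input.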
But this is the situation that C# is actually in; the C# code must be translated into IL, and IL has no && instruction. It has conditional branches, unconditional branches, and assignments, so C# generates the IL equivalent of that code every time you use &&. (And similarly for ||.)
That’s a lot of code! But there is an IL instruction for each of & and |, so the code generation there is very straightforward and very small.
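For the curious, the difference looks roughly like this in IL. This is a hand-written illustrative sketch, not exact compiler output; assume a and b are locals 0 and 1 and the result is local 2:

```
// a & b: one instruction does the work
ldloc.0           // load a
ldloc.1           // load b
and               // boolean and
stloc.2           // store result

// a && b: branches and labels instead
ldloc.0           // load a
brfalse.s FALSE   // if a is false, short-circuit
ldloc.1           // load b
br.s DONE
FALSE: ldc.i4.0   // result is false
DONE:  stloc.2    // store result
```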
What are the consequences of the much larger code generation? First of all, the executable is a few bytes larger. Larger code means that less code fits into the processor cache, which means more cache misses at jit time.
The jitter has an optimizer of course, and many optimizers work by analyzing the “basic blocks” of a method. A “basic block” is a section of IL where control flow always enters at the top and always leaves at the bottom; by knowing where all the basic blocks are, the optimizer can analyze the control flow of the method. The & and | operators introduce no additional basic blocks into a method, but the && operator as you can see above introduces two new basic blocks that were not there before, labeled CONSEQUENCE and DONE. Now the jitter has more work to do.
And remember, the jitter has to work fast; it is jitting code in real time here. As method complexity increases, the number of optimizations that can be successfully performed at runtime at reasonable cost decreases. The jitter is entirely within its rights to say “this method is too long or has too many basic blocks; I’m never going to inline it”, for example. So perhaps the machine code generated is a little worse than it otherwise could have been.
And finally, think about the generated machine code. Again, the code generated from the && version will be larger, which means less program logic fits in the small processor cache, which means more cache evictions. Also, the more branches that are in the code, the more branch prediction the CPU must do, which means more opportunities to predict wrong.
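As a toy illustration of the branchy-versus-branch-free difference, here is a micro-benchmark sketch over unpredictable booleans. The timings vary wildly by machine and this is nowhere near a rigorous benchmark (a tool like BenchmarkDotNet is the right way to measure for real), but note that both loops always compute the same count:

```csharp
using System;
using System.Diagnostics;

class BranchDemo
{
    static void Main()
    {
        var rng = new Random(12345);
        bool[] xs = new bool[1_000_000];
        bool[] ys = new bool[1_000_000];
        for (int i = 0; i < xs.Length; i++)
        {
            xs[i] = rng.Next(2) == 0;  // unpredictable booleans defeat the branch predictor
            ys[i] = rng.Next(2) == 0;
        }

        var sw = Stopwatch.StartNew();
        int eagerCount = 0;
        for (int i = 0; i < xs.Length; i++)
            if (xs[i] & ys[i]) eagerCount++;   // combine without an extra branch
        sw.Stop();
        Console.WriteLine($"eager: {eagerCount} hits in {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        int lazyCount = 0;
        for (int i = 0; i < xs.Length; i++)
            if (xs[i] && ys[i]) lazyCount++;   // extra, unpredictable branch
        sw.Stop();
        Console.WriteLine($"lazy:  {lazyCount} hits in {sw.ElapsedMilliseconds} ms");

        Console.WriteLine(eagerCount == lazyCount);  // always True
    }
}
```

Whichever loop wins on your hardware, the point stands: the two forms are semantically identical here, and only the generated code differs.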
UPDATE: A commenter asks if the C# compiler or jitter can decide to change lazy operators into eager operators if doing so is provably correct and likely faster. Yes, a compiler is allowed to do so; whether the C# or JIT compilers actually do so, I don’t know. I’ll check!
ANOTHER UPDATE: It does! I was unaware of this optimization, and probably should have checked to see if it existed before I wrote this article. 🙂 In C# 6, if the right hand side of an && operation is a local variable then the IL is generated as though it was &. I do not recall having seen this optimization before; perhaps it is new, or perhaps I simply never took a sufficiently close look at the IL generator. (I was aware that if either side of the operator is a compile-time constant true or false then optimizations are performed, but optimizations when operands are known at compile time is a good subject for another day.)
Now, I hasten to point out that these considerations are the very definition of nano-optimizations. No commercial program ever owed its widespread acceptance and profitability in the marketplace to a few &s used judiciously instead of &&. The road to performance still demands good engineering discipline rather than random applications of tips and tricks. Still, I think it is useful to realize that avoiding the evaluation of the right-hand side might, in some cases, be more expensive than simply doing the evaluation. When generating code to lower nullable arithmetic, for example, the C# compiler will generate eager operations instead of lazy operations.