ATBG: null coalescing precedence

In the latest episode of the Coverity Development Testing Blog’s continuing series “Ask the Bug Guys”, I dig into two questions about the null coalescing operator. This handy little operator is probably the least well understood of all the C# operators, but it is quite useful. Unfortunately, it is easy to accidentally use it incorrectly due to precedence issues.
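The trap is easy to demonstrate. `??` binds more loosely than `+`, so a coalescing expression written after a concatenation applies to the whole concatenation, not just the last operand. Here is a minimal sketch in JavaScript, whose `??` has the same loose binding as C#’s (the variable names are invented for illustration):

```javascript
// ?? binds more loosely than +, so the concatenation happens first:
const label = null;
const suffix = "!";

console.log(label + suffix ?? "none");   // "null!"  (parsed as (label + suffix) ?? "none")
console.log((label ?? "none") + suffix); // "none!"  (what was probably intended)
```

The unparenthesized form never sees the `null` at all: by the time `??` runs, the left operand is already the non-null string produced by the concatenation.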

As always, if you have questions about a bug you’ve found in a C, C++, C# or Java program that you think would make a good episode of ATBG, please send your question along with a small reproducer of the problem to TheBugGuys@Coverity.com. We cannot promise to answer every question or solve every problem, but we’ll take a selection of the best questions that we can answer and address them on the dev testing blog every couple of weeks.

16 thoughts on “ATBG: null coalescing precedence”

  1. IMO the whole idea that operator precedence should be a total order is bad. Having precedence for some well understood cases (e.g. multiplication over addition) is useful. In most other cases the compiler should emit an error, requiring the user to add clarifying parentheses.

  2. I agree, and would extend this principle further. I would posit that in a well-designed language, compilers *should* squawk in cases where:

    -1- the behavior of the code may be different from what is intended

    -2- an explicit clarification of programmer intent would, if anything, make code more readable

    -3- such “clarification” would not increase the brittleness of code.

    I would posit that it’s less important to be “consistent” about what compiles than to minimize the astonishment factor of code which does compile. Unfortunately, language designers seem more interested in consistency of what compiles than in the principles given above.

    One of my personal pet peeves is implicit conversion rules, which seem to be designed around the assumption that if X is implicitly convertible to Y in cases where the conversion would be obvious, it should be substitutable for Y even in places where the conversion would not be. I would consider conversions from Decimal to double or float, or from double to float, to be non-astonishing (and say they should be legal) in cases where the destination type is clear, but most implicit conversions should be forbidden in cases where the destination type is unclear.

    Even if the compiler has rules which would, from its point of view, resolve things unambiguously, that doesn’t mean the program represents a clear statement of programmer intent. There’s no ambiguity in how the compiler will process the statement

    int someIntVar = 16777217;
    if (someFloatVar == someIntVar) X();

    It will execute X if `someFloatVar` holds the value 16777216.0f. I would say there’s considerable ambiguity, however, as to what the programmer intended; in a code review, I would ask that it be rewritten as

    if (someFloatVar == (float)someIntVar) X();

    which conveys the intent much more clearly.
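The rounding this comment describes is easy to check. JavaScript’s `Math.fround` rounds a number to the nearest 32-bit float, which makes a convenient stand-in for the C# conversion to `float`:

```javascript
// 16777217 = 2^24 + 1 is the first integer that a 32-bit float cannot represent.
const someIntVar = 16777217;
const someFloatVar = Math.fround(16777216);  // plays the role of 16777216.0f

console.log(Math.fround(someIntVar));                  // 16777216 (the low bit is lost)
console.log(someFloatVar === Math.fround(someIntVar)); // true: the comparison fires
```

The integer loses its low bit in the conversion, so the “equal” comparison succeeds even though the two values the programmer wrote down are different.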

    • In a code review, in that if statement, my “floating point ==” alarm would ring, and I’d ask if they had considered floating-point uncertainty, and would this be better:

      if (abs(someFloatVar - (float)someIntVar) < EPSILON) {
          X();
      }

      Then I'd ask if there is a good reason why they're not just using double.

      All this is completely beside the point you are making, of course. 🙂

      • Floating-point comparisons can be perfectly appropriate in some cases, though if I were designing a language I would fix the worst aspect (e.g. `{ double x = double.NaN; if (x != x) Debug.Print("Languages where == doesn't represent reflexive equivalence are silly"); }`), at least for any type which wasn’t explicitly declared “IEEEdouble”. Fixing that might slow things down slightly, since I don’t think there’s any single instruction to do what in most cases would be the most sensible comparison (one which implements an equivalence relation, but still regards +0 and -0 as equivalent), but probably not too much. On the other hand, I would have liked to see `Object.Equals` use a stricter definition of equivalence, such that any value which could be semantically recognized as different would compare as unequal. I wouldn’t want == to regard +0 and -0 as distinct, but for things like interning caches, it’s important that Equals be defined very narrowly.

        • You’re right; there should be two different varieties of “equals”. We obviously need a mathematical “equals” where NaN != NaN and +0 == -0, but we also need an “equals” that says whether two objects are interchangeable. That way your hash table could tell that NaN is-interchangeable-with NaN but +0 !is-interchangeable-with -0.

          • Even from a mathematical standpoint, I would posit that it would be better to specify that, whether or not all NaN values compare equal to each other, certain cases would be guaranteed to do so (at minimum, make equality reflexive; preferably also specify things so that if repeated passes of successive-approximation code yield NaN, code will decide that nothing has changed rather than looping endlessly). I would regard the statement “sqrt(-1.0) == log(0.0)” to be no less true than “1.0E308*1.0E308 == 1.0E308*10.0” or even “1E7f / 3f == 3333333.25f”. The `==` operator cannot reasonably be expected to imply anything stronger than that the left-hand and right-hand operands are indistinguishable; it may sometimes test a weaker condition [e.g. accepting the numerical equality of +0 and -0] but cannot be expected to test a stronger one.
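As it happens, JavaScript ships both varieties of “equals” this sub-thread asks for: `===` is the IEEE “mathematical” comparison, while `Object.is` tests interchangeability. A quick illustration:

```javascript
const nan = Number.NaN;

console.log(nan === nan);          // false: IEEE equality, NaN != NaN
console.log(Object.is(nan, nan));  // true:  NaN is interchangeable with NaN

console.log(0 === -0);             // true:  numerically equal
console.log(Object.is(0, -0));     // false: distinguishable (1/+0 and 1/-0 differ)
```

This is exactly the split proposed above: a hash table keyed on interchangeability can find a NaN key again, while arithmetic comparisons keep their IEEE meaning.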

  3. Personally, I always add brackets wherever I feel the precedence may potentially be an issue: not even from the compiler’s viewpoint, but from the point of view of someone reading the code (and yes, if you have to switch between multiple languages it’s sometimes easy to forget the simplest things like the operator precedence).
    However, when the syntax becomes too restrictive, I find it more than a little annoying: it’s a bit like being peremptorily ordered to do something, my way or the highway. You might have done it gladly on your own, but once you’ve heard such an order, you resent it — at least, I always do. 🙂
    So how about a compromise: a conditional compilation variable, or some kind of equivalent of the #pragma directive from C++, that would tell the compiler that from now on, the brackets will be mandatory, or the operators will be evaluated always left-to-right, or…insert more options here…?
    That should satisfy both the lazy and the meticulous, the champions of strict order and the syntax freedom fighters (if there are any 🙂 ).

    • The Turbo C Compiler (1980s) had a warning option for whether certain operators should require parentheses. It’s not hard. One general approach would be to build two expression trees from a piece of code using slightly different rules and then check whether they’re equal.

      Another approach is to have a compiler include more than one internal type for certain runtime types, and have the different internal types use different overloads and promotion rules. For example, there’s a semantic difference between the value produced by reading a `float` variable and the value produced by adding two `float` quantities. If a programmer writes `doubleVar = floatVar;`, there’s only one plausible intended meaning, but for `doubleVar = floatVar1+floatVar2;` there are three plausible intended meanings [perform the addition as float, or as double, or whatever the compiler thinks would be fastest]. A language could allow implicit conversions from a “normal float” to “double”, or from a “sum of floats” to a “normal float”, while disallowing implicit conversion from “sum of floats” to “double”.
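The “two expression trees” idea can be sketched in a few lines. This toy checker (entirely hypothetical; the token and tree shapes are invented for illustration) parses a flat chain of `-` operators once left-associatively and once right-associatively, and warns when the two trees disagree:

```javascript
// Parse a flat token list like [1, "-", 2, "-", 3] under two different
// associativity rules; a mismatch means the expression's meaning depends
// on the rule, so a "please parenthesize" warning would be warranted.
function parse(tokens, rightAssoc) {
  if (tokens.length === 1) return tokens[0];
  if (rightAssoc) {
    const [left, op, ...rest] = tokens;
    return { op, left, right: parse(rest, true) };
  }
  let tree = tokens[0];
  for (let i = 1; i < tokens.length; i += 2) {
    tree = { op: tokens[i], left: tree, right: tokens[i + 1] };
  }
  return tree;
}

const toks = [1, "-", 2, "-", 3];
const leftTree  = JSON.stringify(parse(toks, false)); // ((1 - 2) - 3)
const rightTree = JSON.stringify(parse(toks, true));  // (1 - (2 - 3))
console.log(leftTree === rightTree ? "unambiguous" : "warn: add parentheses");
```

A single-operator expression parses identically under both rules and would pass silently; only the genuinely rule-sensitive shapes get flagged, which is the point of the proposal.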

      What’s most important I think, though, are (1) ensuring that any “yes that’s really what I want” constructs demanded by the compiler improve clarity and do not introduce brittleness; (2) recognizing that rather than allowing all constructs of a type, forbidding all constructs, or trying to have the compiler guess which constructs should be allowed, it’s often most helpful to allow programmers to indicate when certain constructs should or should not be allowed. For example, being able to invoke `Add` on a read-only instance of `Drawing.Point` is helpful, but being able to invoke `Offset` on such an instance is not. Rather than allowing or forbidding all method invocations on read-only struct instances, it would be more helpful for a language to allow struct methods to be marked with an attribute that would say whether those particular methods should be usable on read-only struct instances. Allowing people who know what constructs should be usable or not in various cases to specify that is more helpful than trying to formulate rules for such things.

      • Eric clearly has more connections and more pull than I certainly would have in this. I don’t even know where I could specifically request this, other than the boneyard that is Microsoft Connect.

        • More generally helpful, I would think, would be a syntax somewhat like member invocation which would invoke a static method with the left-hand operand as a `ref` parameter. If a dot-colon combination were used, then one could use e.g.

          MyDelegate.:AtomicAdd(someOtherDelegate);

          as a shorthand for:

          SomeClass.AtomicAdd(ref MyDelegate, someOtherDelegate);

          The distinctive token before the method name would make clear that the operation was being done upon the storage location MyDelegate, which would be required to be writable. A set-if-null method could easily be defined as a generic extension upon all class types.

    • That’s an unfair comparison; the better question is whether:
      list ??= new List()
      is really better than:
      list = list ?? new List()

      I don’t see any problem with the idea, but it’s not something I really miss much and the current solution isn’t that ugly anyhow.
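As it happens, JavaScript later added exactly this operator (`??=`), which makes the comparison easy to try. The two spellings agree for a plain variable; the compound form simply skips the assignment when the target is already non-null:

```javascript
let list = null;
list ??= [];          // compound form: assigns only when list is null/undefined
console.log(list);    // logs an empty array

let list2 = null;
list2 = list2 ?? [];  // spelled-out form: same result here
console.log(list2);   // logs an empty array

let filled = [1];
filled ??= [];        // no-op: filled is already non-null
console.log(filled);  // still [1]
```

(C# eventually gained `??=` as well, in C# 8.0, well after this discussion took place.)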

  4. Hi Eric,
    I didn’t find the proper way to ask the bug guys, so here’s a question about weird C# compiler behavior, please.

    Having 2 constructors:
    public Ctor(int i) : this("") {}
    public Ctor(string s) : this(0) {}

    And fields you can write any bullshit you want into:
    private byte f = new object() / new object() ^ 2;

    Why, when the compiler detects a nested constructor loop like this, does it skip further analysis and just compile (or quit?).

