ATBG: de facto and de jure reachability

In the latest episode of the Coverity Development Testing Blog’s continuing series “Ask the Bug Guys”, I dig into an interesting question about the difference between de facto and de jure unreachability — that is, between code that is actually unreachable and the smaller set of code that the C# compiler detects as unreachable. That gap can cause some surprising inconsistencies in the compiler’s behavior. And my colleague Jon ponders the wisdom of fixing fragile, hard-to-understand code even if it is at present correct.

As always, if you have questions about a bug you’ve found in a C, C++, C# or Java program that you think would make a good episode of ATBG, please send your question along with a small reproducer of the problem to TheBugGuys@Coverity.com. We cannot promise to answer every question or solve every problem, but we’ll take a selection of the best questions that we can answer and address them on the dev testing blog every couple of weeks.

17 thoughts on “ATBG: de facto and de jure reachability”

  1. “Alternative() is de facto unreachable but de jure reachable because true || Condition() is not a compile-time constant.”

    Why isn’t true || Condition() a compile-time constant?

      • I expect that if the definition of compile-time constant in the spec were extended to include arithmetic optimisations like you’re suggesting, that would:

        * Make the particular arithmetic optimisations the compiler performs part of the spec, whereas at the moment they aren’t. This would complicate the spec and restrict the freedom of compiler writers to choose the optimisations they wish to implement,
        * Require that arithmetic optimisations can never be turned off, which would make debugging code weirder,
        * Make the code more brittle, in that changing a zero to a one could suddenly change an expression from constant to non-constant, which, because this affects reachability, may change the warnings or errors the compiler produces. In short, such a small change could make a compiling program stop compiling, or vice versa.

        If I’m right, I guess the spec authors didn’t want to go down that path.
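
        To make the asymmetry concrete, here is a minimal sketch (the class and method names are made up; Condition stands in for any method call, since an invocation disqualifies an expression from being a compile-time constant under the spec):

        using System;

        class ReachabilityDemo
        {
            static bool Condition() { return true; }  // stand-in for any non-constant subexpression

            static void Main()
            {
                if (true)
                    Console.WriteLine("taken");
                else
                    Console.WriteLine("never runs");  // warning CS0162: unreachable code detected

                if (true || Condition())
                    Console.WriteLine("taken");
                else
                    Console.WriteLine("never runs");  // no warning: the condition is not a constant expression
            }
        }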

  2. It’s a bit noticeable on that blog that the attractive women use portrait photos for their posts, but the guys for the most part do not.

  3. Now I know it’s a silly question, but I’m just curious: this code below

    if (something/0 > 0)
        whatever();

    is spotted by VS2012 IntelliSense: it immediately says ‘Error: division by constant zero’, even before going into the de jure unreachability. And of course, in this fragment, whatever() is definitely de facto unreachable, but not because the expression never evaluates to true. It’s because the expression never evaluates at all — it just throws a DivideByZeroException. But then how about this:

    for (Something s = new Something() { z = null }; s.z.Length < 100; s.z = s.z + " ")
        whatever();

    I just compiled it in VS2012 without any warning, but of course running it immediately causes a NullReferenceException. And theoretically — just theoretically — it was possible to spot the problem at compile time: first, s.z is explicitly assigned null, then s.z.Length is invoked… What gives? Why does the first case produce a diagnostic so early (even before compilation is attempted), while in the second one never comes?

    • In short, 0 is a constant and z is a variable.
      Some other thread could update z’s value before s.z.Length is invoked and it would be ok.

      PS: Nice post, Eric =)

    • Indeed, that is a kind of analysis that the C# compiler does not perform, though the Coverity analyzer I work on now does! That sort of flow analysis is much harder to get correct and much more time consuming, and therefore the compiler does not go to the trouble of implementing it. The compiler team wants warnings to be fast to compute, easy to explain and therefore relatively shallow bugs; dividing by a constant zero is very easy to notice and very easy to explain. But Coverity’s explanations of why a particular value must be null at a particular place can be complicated, hard to follow, and expensive to compute.
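
      For instance, here is a contrived sketch (names made up) of the kind of case a deeper flow analysis reports but the C# compiler, by design, does not:

      static void Demo()
      {
          string a = null;
          string b = a;      // null flows through one assignment...
          string c = b;      // ...and another
          int n = c.Length;  // NullReferenceException at run time; the compiler issues no diagnostic
      }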

  4. I often find myself thinking, for a variety of reasons, that languages should regard operators as more than just special syntax for something that behaves like a method call, and constant propagation seems “yet another” example of a case where they should do so. When evaluating `x ? y : z`, it would seem that if x is a constant true, the expression should be constant if and only if y is constant; likewise, if x is a constant false, it should depend on z. If y and z are the same constant, the expression should be constant if and only if x has no side effects, regardless of whether its actual value can be determined. Note that if x is constant (whether true or false), x && y is equivalent to x ? y : false, and x || y is equivalent to x ? true : y.
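
    To see the difference under today’s rules (ConstDemo and NonConstant are hypothetical names): C# requires all three operands of ?: to be constant before the whole expression counts as one, whereas the rule proposed above would only require the selected arm to be constant:

    class ConstDemo
    {
        const bool Flag = true;
        const int A = Flag ? 1 : 2;                 // legal today: all three operands are constant
        // const int B = Flag ? 1 : NonConstant();  // error CS0133 today; constant under the proposed
        //                                          // rule, since the selected arm (1) is constant
        static int NonConstant() { return 2; }      // hypothetical non-constant operand
    }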

    • Sure, I take your point, but then the obvious question is: if x is the constant true and y is constant then why did the user write x ? y : z in the first place instead of just “y”? Often when something is the constant true and then used in a context with variables, the reason is because someone is debugging the code and temporarily rewrote a variable to be a constant in order to force a path. Making the assumption that the whole expression is intended to be constant seems like an unwarranted assumption.

      Remember, all code is a living document intended to be written and read by humans. We should be evolving our language and tooling design to take into account how code changes over time, and what the human factors are in those processes. Today, compile-time analysis and heavy-duty static analysis take into account only a “snapshot” of one version of the code in time; it would be interesting to instead slice it the other way and analyze how code changes over time.

      • Analyzing code changes over time may be helpful as a cloud service: some sort of AI (in the serious, engineering sense, not in the Sci-Fi Matrix sense 🙂 ) that constantly analyzes millions of lines of code and then, based on the collected knowledge, suggests possible improvements. Basically, it just tells you how other programmers have solved similar issues under similar circumstances. But that’s too crazy: it’s only the early 21st century out there, after all. 🙂

      • There are times when an expression which is always true or always false is a mistake, and there are other times when it is a result of code leaving room for expansion. Perhaps what’s needed is a means of declaring that something should be regarded as a constant at the JITter level, but will be expected to change between the time a DLL is built and the time it is executed. I’m not sure the JITter should normally go out of its way to try to optimize constants (in many cases the payoff will be limited and the cost significant) but something like:

        // … within a big loop
        {
            if (someLibrary.Version > 12)
                someLibrary.doThisOperation();
            else
                someLibrary.someOtherOperation();
        }

        Having the C# compiler regard someLibrary.Version as the compile-time constant 11 would cause it to build a DLL which could be incompatible with future versions of the someLibrary DLL. On the other hand, it shouldn’t really be a variable, either. What’s needed is for the C# compiler to generate code which the JITter can recognize as performing a constant comparison, but without the C# compiler itself folding it.
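
        Something close to this exists today, as a rough approximation: the C# compiler never folds a static readonly field, but the JITter may treat its value as a constant once the type is initialized, and can then drop the dead branch. A sketch, with a stub standing in for the hypothetical someLibrary above:

        static class someLibrary  // stub mirroring the hypothetical library
        {
            public static int Version { get { return 11; } }
            public static void doThisOperation() { }
            public static void someOtherOperation() { }
        }

        class VersionDispatch
        {
            // The C# compiler emits a real field load here; it never folds this value.
            static readonly int LibraryVersion = someLibrary.Version;

            static void Work()
            {
                // The JITter sees a comparison against an initialized readonly field
                // and may specialize the method, eliminating the dead branch.
                if (LibraryVersion > 12)
                    someLibrary.doThisOperation();
                else
                    someLibrary.someOtherOperation();
            }
        }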

  5. There really should be an extra type of compiler message, which is only in effect when optimization is enabled and may change with every compilation because of different optimization techniques.

    “Optimizer warning CS10162: Unreachable code detected.”

  6. I’m going to be honest: I have never come across a situation where the compiler’s definition of unreachable has caused me any problems at all. Has it for anyone? Really?

    • I haven’t either. But then again, we techies are perfectionists, and most of us don’t like things that don’t make 100% perfect sense. So when something potentially paradoxical, illogical, or downright weird seems to be happening somewhere, the discussion warriors are up in arms. 🙂
      Although personally, I’ve always enjoyed some kind of mystery that’s “just so” or “just because”: I think it keeps me from getting arrogant and thinking I know it all. But I’ve always been more of an artist than a programmer: for me, programming is just a living. 🙂
