The no-lock deadlock

People sometimes ask me if there is a cheap-and-easy way to guarantee thread safety. For example, “if my method only reads and writes local variables and parameters, can I guarantee that my method is threadsafe?” Questions like that are dangerous because they are predicated on an incorrect assumption: that if every method of a program is “threadsafe”, whatever that means, then the entire program is “threadsafe”. I might not be entirely clear on what “threadsafe” means, but I do know one thing about it: thread safety is a property of entire programs, not of individual methods.

To illustrate why these sorts of questions are non-starters, today I present to you the world’s simplest deadlocking C# program:

class C
  static C() 
    // Let's run the initialization on another thread!
    var thread = new System.Threading.Thread(Initialize);
  static void Initialize() { }
  static void Main() { }

(Thanks to my Roslyn compiler team colleague Neal Gafter for this example, which was adapted from his book Java Puzzlers.)

At first glance clearly ever method of this incredibly simple program is “threadsafe”. There is only a single variable anywhere in the program; it is local, is written once, is written before it is read, is read from the same thread it was written on, and is guaranteed to be atomic. There are apparently no locks anywhere in the program, and so there are no lock ordering inversions. Two of the three methods are empty. And yet this program deadlocks with 100% certainty; the program “globally” is clearly not threadsafe, despite all those nice “local” properties. You can build a hollow house out of solid bricks; so too you can build a deadlocking program out of threadsafe methods.

The reason why this deadlocks is a consequence of the rules for static constructors in C#; the important rule is that a static constructor runs exactly zero or one times, and runs before a static method call or instance creation in its type. Therefore the static constructor of C must run to completion before Main starts. The CLR notes that C‘s static constructor is “in flight” on the main thread and calls it. The static constructor then starts up a new thread. When that thread starts, the CLR sees that a static method is about to be called on a type whose static constructor is “in flight” another thread. It immediately blocks the new thread so that the Initialize method will not start until the main thread finishes running the class constructor. The main thread blocks itself waiting for the new thread to complete, and now we have two threads each waiting for the other to complete.

Next time on FAIC: We’re opening up the new Coverity office in Seattle! After which, we’ll take a closer look at the uses and abuses of the static constructor.

Defect spotting, part two

I had a great time hanging out with my colleagues Bob and Amie yesterday at the HUB, talking with students, spotting defects and handing out yo-yos. Thanks to all who came out, and to the SWE for putting on a great event.

To follow up on the puzzle I posted yesterday: the terrible flaw, which most people spotted right away, was that the expression geteuid != 0 was of course intended to be geteuid() != 0. The code as written compares the function pointer to null, which it never is, and therefore the right side is always true, and therefore the conditional falls into the “fail with an error” branch more often than it ought to. The program succeeds if the user really is root, or if they are “sudo” root. It is intended to succeed also if the user is “effectively” root, but it does not. Thank goodness in this case the program fails to a secure mode! It is not at all difficult to imagine a situation where such an accidental function pointer usage causes the program to fail into the insecure mode. In any event, Coverity’s checker catches this one. (And of course more modern languages like C# do not allow you to use methods in a context other than a call or delegate conversion.)

There are of course any number of other flaws in this fragment. First, it’s now considered bad form to check for root like this; rather, check to see if the user is granted an appropriate permission. Second, the code is hard to read if you do not know the convention that the root user gets magical id zero by default; the code could be much more self-documenting. And so on; several people made good observations in the comments.

Next time on FAIC: You can build a hollow house out of solid bricks, and you can build a deadlocking program out of threadsafe methods too.

Defect spotting at the HUB

The University of Washington chapter of the Society of Women Engineers is putting on their quarterly career fair today at the Husky Union Building, and I’ll be there with a couple of my fabulous Coverity colleagues. If you happen to be a UW student reading this and want to stop on by for a round of my favourite game, “Spot the Defect” please do. We’ll be presenting a whole pile of buggy code; some of the defects will be quite straightforward and some of them will be subtle, but they all have a lesson.

If you want to play along at home, here’s one of the easy ones; this code was in a real used-by-customers product when Coverity’s static analysis engine discovered it:

  if (!strcmp(argv[i], "-configure"))
    if (getuid() != 0 && geteuid != 0)
      puts("Only root can use -configure.n");
  xf86DoConfigure = TRUE;
  xf86AllowMouseOpenFail = TRUE;
  return 1;

Can you spot the defect and describe its consequence?

Anyways, like I said, if you’re a UW student then stop on by the booth and we’ll chat. The fair is open 12:30 to 5:30; if you’re a SWE member then you get early admission and can come in any time after 12:00. Hope to see you there!

Next time on FAIC: The solution to today’s puzzle.

Integer division that rounds up

Integer arithmetic is surprisingly difficult to get right. There are usually lots of weird corner cases and the temptation to use too-clever bit-twiddling tricks is high. Today I thought I’d give an example of how I approach solving integer arithmetic problems on those rare occasions when I have to. So let’s consider the problem of writing a C# method that divides two 32 bit integers such that the division rounds up if it is inexact. Continue reading

Five-dollar words for programmers: elision

Good writers borrow from other writers. Great writers steal from them outright. — Aaron Sorkin

Sorkin was of course stealing outright from T.S. Eliot, who in 1920 wrote:

Immature poets imitate; mature poets steal; bad poets deface what they take, and good poets make it into something better, or at least something different. The good poet welds his theft into a whole of feeling which is unique, utterly different than that from which it is torn; the bad poet throws it into something which has no cohesion. A good poet will usually borrow from authors remote in time, or alien in language, or diverse in interest.

Sorkin’s pithy summary, or a slight modification of it, is often attributed to Pablo Picasso and Oscar Wilde, but I’ve found no reliable evidence that either man ever said anything like it. See this blog post for details.

Computer programmers have a habit of taking jargon from one area of study and borrowing it for use in their own. Today’s five dollar word for programmers is elision; the verb form is to elide. This term from linguistics is used in two non-standard ways by computer programmers.

In natural language linguistics, elisions are the removal of sounds from spoken words that do not change their meaning. Saying

I’m gonna eat my vegtables.

instead of

I am going to eat my vegetables.

elides a great many sounds but does not change the meaning.

A less common usage of the term “elision” in natural linguistics is the omission of entire words. Many of my friends from Pennsylvania, for example, tend to elide the words “to be” from sentences that would require them in the Queen’s English, like “do you have any towels that need washed?”

It is in this sense that computer language designers have borrowed the term: an elision in a programming language is when the language makes optional what might be thought of as a necessary portion of the grammar.

There are a number of legal elisions in C#. For example, the accessibility modifier private may be elided almost everywhere; you can usually remove private from a declaration without changing the meaning of the program. All versions of C# allow you to elide the array bounds and array type in a local variable initializer. These are all the same:

int[] x = { 10, 20, 30 };
int[] x = new int[] { 10, 20, 30 };
int[] x = new int[3] { 10, 20, 30 };

And C# 3.0 and higher also allow

int[] x = new[] { 10, 20, 30 };

I said there were two ways in which computer programmers co-opt the word; the second is in optimization theory. As we just saw in my long series on optimizing lifted arithmetic, the optimizer is allowed to elide conversions if it knows that they are unnecessary: in an expression where an int? is added to an int, by rights we ought to be implicitly converting the non-nullable integer to a nullable integer. But the compiler knows that doing so is a pointless waste of time because the conversion can be removed without changing the meaning of the program. Or, as I described back in 2010, the compiler will sometimes elide the creation of a temporary when initializing a local variable of value type. This form of elision optimization is called copy elision.

Next time on FAIC: Integer arithmetic can be tricky. We’ll try dividing two numbers and see what happens.

Nullable micro-optimizations, part eight

Today, the answer to the puzzle I posed last week. The answer is no, that optimization is not legal in general, though it is legal in a few specific cases.

As I have discussed several times before, C# has very strict rules about what order operations happen in. A quick rule of thumb is: operands run left-to-right. In our example we had X() * Y() + Z(), so let’s see what conclusions we reach by applying this rule:

  • X() is an operand of the * to the left of Y(), so the call to X() must happen before the call to Y().
  • X()*Y() is an operand of the + to the left of Z(), so the multiplication must happen before the call to Z().

The compiler must generate code that computes X(), then Y(), then the multiplication, then Z(), and then the addition. But our proposed optimization was

int? r;
int? tempX = X();
int? tempY = Y();
int tempZ = Z();
r = tempX.HasValue & tempY.HasValue ?
  new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + tempZ) :
  new int?();

which computes Z() before the multiplication.

So what’s the big deal? Multiplication of integers can throw an exception in a checked context. That exception should prevent Z() from being called in the first place should X() * Y() throw on the multiplication. This optimization is only valid in an unchecked context.

And of course it just gets worse if we start to consider lifted arithmetic on types other than int?. It’s a bad practice to write a user-defined operator with a side effect, but it is legal and the C# compiler must ensure that side effects of user-defined operators are observed to happen in the prescribed order.

Rather than try to figure out all the types for which this optimization is legal in various contexts, Roslyn makes no attempt to optimize this kind of binary operator. It generates a temporary int? for the multiplication, and then does the regular lifted arithmetic for the addition. Another lovely optimization spoiled by conflict with an lovely invariant.

But wait!

The problem is that the side effects of the multiplication must be observed to come before the side effects of the right addend. What if the right addend has no side effects? Then we can do this optimization!

The most common situation in which the right addend has no side effects is when it is a constant:

int? r = X() * Y() + 1;

This can legally be optimized to

int? r;
int? tempX = X();
int? tempY = Y();
r = tempX.HasValue & tempY.HasValue ?
  new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + 1) :
  new int?();

And in fact I did add this optimization to Roslyn; just as unary operations and conversions can be “distributed”, so can binary operations where the right operand is a constant. Moreover, as a pleasant side effect, doing so allowed for an easy implementation of various optimizations that improve the quality of lifted x += 1 and x++ expressions.

Well, that took rather more episodes than I thought it would when I started! I could talk a bit more about how Roslyn optimizes other more exotic nullable arithmetic, like equality, inequality, and logical operations. I could also talk a bit about the stuff I didn’t get to implement before I left; Roslyn does a slightly worse job than the original recipe compiler when optimizing expressions where we know that the result is going to be null, but also must compute a side effect. But rather than go down those lengthy rabbit holes, I think I’ll call this series done at eight episodes.

Next time on FAIC: Some of these optimizations we’ve been talking about are elisions; I’ll talk a bit about how computer programmers have borrowed this term from linguistics, and how we use it in two different senses.

Hummingbird PSA

hummingbirdWe interrupt our adventures in lifted arithmetic optimization for a brief public service announcement on behalf of my little friend to the left there. (Click on the image for a larger version.) I took this photo at my home in Seattle this past Sunday, before I flew down to San Francisco for training at Coverity’s head office. (And it took a number of attempts to freeze the wings, even at 1/640th of a second shutter speed with my Canon DSLR.)  Amazingly enough, male Anna’s Hummingbirds like my friend there will spend winters in Seattle. At the time this photo was taken it was well below freezing. Apparently more and more male hummingbirds are staying in Seattle over the winter, as climate change makes the winters here slightly more survivable; even Rufous and Allen’s hummingbirds are being observed. They migrate out of the hills to the warmer lowlands, but don’t go farther.

Hummingbirds are always within mere moments of starving to death and in the Seattle winter there are far fewer sources of food than other times of the year. (Anna’s Hummingbirds are unusual in that they can eat bugs while in flight, which does help.) If you’ve got hummingbirds coming to a feeder, keep it stocked. Take it in at night; I found my feeder frozen solid and a very vexed hummingbird trying in vain to feed one morning.

A number of people have asked me where they can get my feeder; you can find it here.

You’ll notice that in my feeder the liquid is clear and colourless. It is one part sugar to three or four parts water, and that’s it. Stores will try to sell you hummingbird food at enormously inflated prices, but what they’re selling is sugar mixed with red dye. The red dye does nothing to attract the birds, and it is bad for them. Their little kidneys have to remove the dye from their bloodstream.

I’m in California again next week, but I’ve queued up a couple of articles to go out while I’m away. Have a great weekend, and tune in Monday for the final episode in my excessively long series on lifted arithmetic optimization.

Nullable micro-optimizations, part seven

Today, a puzzle for you.

We’ve been talking about how the Roslyn C# compiler aggressively optimizes nested lifted unary operators and conversions by using a clever technique. The compiler realizes the inner operation as a conditional expression with a non-null nullable value on the consequence branch and a null nullable value on the alternative branch, distributes the outer operation to each branch, and then optimizes the branches independently. That then gives a conditional expression that can itself be the target of further optimizations if the nesting is deeper.

This works great for lifted conversions and unary operators. Does it also work for binary operators? It seems like it would be a lot harder to make this optimization work for a lifted binary operator where both operands are themselves lifted operations. But what if just one of the operands was a lifted operation, and the other operand was guaranteed to be non-null? There might be an opportunity to optimize such an expression. Let’s try it. Suppose X() and Y() are expressions of type int? and that Z() is an expression of type int:

int? r = X() * Y() + Z();

We know from our previous episodes that operator overload resolution is going to choose lifted multiplication for the inner subexpression, and lifted addition for the outer subexpression. We know that the right operand of the lifted addition will be treated as though it was new int?(Z()), but we can optimize away the unnecessary conversion to int?. So the question is can the C# compiler legally code-generate that as though the user had written:

int? r;
int? tempX = X();
int? tempY = Y();
int tempZ = Z();
r = tempX.HasValue & tempY.HasValue ?
  new int?(tempX.GetValueOrDefault() * tempY.GetValueOrDefault() + tempZ) :
  new int?();

If you think the answer is “yes” then the follow-up question is: can the C# compiler legally make such an optimization for all nullable value types that have lifted addition and multiplication operators?

If you think the answer is “no” then the follow-up questions are: why not? and is there any scenario where this sort of optimization is valid?

Next time on FAIC we’ll be kind to our fine feathered friends; after that, we’ll find out the answer to today’s question.

Eric is crazy busy at Coverity’s head office; this posting was pre-recorded.

Nullable micro-optimizations, part six

Previously in this series I said that the fact that the original C# compiler pursues a less aggressive strategy for optimizing away temporaries and branches from nested lifted conversions and unary operators because it suffers from “premature optimization”. That’s a loaded term and I’m not using it in the standard sense, so I want to clarify that a bit.

Donald Knuth, author of the classic four-volume series The Art of Computer Programming, famously said “premature optimization is the root of all evil.“. I think however that it is more instructive to read that quotation with more context:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.

Which is of course what I have echoed in my numerous performance rants over the years: don’t waste your valuable time making risky changes to the 97% of the code that isn’t the slowest thing and that no customer will ever notice. Use a profiler, find the slowest 3%, and spend your optimization budget on that.

That is good advice, but when I say that a compiler suffers from “premature optimization”, that’s not at all what I mean. Rather, I mean that the compiler performs an optimization pass too early in the compilation process.

Back in 2010 I described in considerable detail the various stages, or “passes”, that the original recipe C# compiler performs when going from raw text to IL, so you might want to read that. For the purposes of this discussion we can simplify that all down to four stages:

  1. Lexical and grammatical analysis
  2. Initial “binding” — that is, semantic analysis
  3. “Lowering” — that is, rewriting high-level code into low-level code — and additional error detection
  4. Code generation

You would expect that semantic optimizations like the ones I described back in 2009  such as lifted arithmetic lowering would happen in the third stage. Some optimizations of course happen during the fourth stage, because the code generator itself can identify branches and temporaries that can be eliminated. But most happen in the third stage.

Doing an optimization in the wrong stage can introduce correctness problems; for some examples of how premature optimizations in the initial binding pass led to bugs and breaking changes, see my posts from 2006 on that subject. Part one. Part two.

Today I’m not concerned about correctness; I’m concerned about how complete the optimization is. The implementation decision which is vexing me today is that the original recipe C# compiler’s strategy is that the initial binding pass identifies portions of lifted arithmetic expressions that can be optimized later, and flags them as needing attention during the lowering pass, which is where the optimizations are done.

I am over-simplifying here; it is not as simple as a Boolean flag in most cases. In fact, the amount of information that is stored by the initial binding pass for the use of the optimizer later is quite scary because it is easy to accidentally use the wrong bits when lowering. An example of such a bug is in this StackOverflow question. But we can think of it logically as a flag.

The problem is that the initial binding pass only identifies opportunities for optimization based on the original form of the code. If the optimization pass produces “lowered” code that is itself amenable to further optimization then it is never optimized because there’s no flag left in there by the initial binding pass! Deciding whether something could benefit from optimization was being done too soon.

To make a long story short — and yes, this seems to have gotten rather long, sorry — the practical upshot is that the original recipe compiler is very good at finding “shallow” optimization opportunities on lifted operations, but very bad at making optimizations compose nicely when lifted operations are deeply nested; those tend to generate lots of unnecessary temporaries and branches.

Like I said previously, the compiler is not required to make those optimizations, but it has always vexed me that it does not. Roslyn improves on this situation by deferring all lowering and optimization of lifted arithmetic to the lowering phase; only the bare minimum analysis is performed during the initial binding pass. Roslyn optimizes each lifted arithmetic expression as it lowers it to temporaries and conditionals, and then tries aggressively to “distribute” lifted unary operations and conversions into those generated conditionals in order to skip creation of unnecessary temporaries and branches.

Next time on FAIC: Is it ever possible to apply this optimization technique to a lifted binary operator? I’ll pose that question in the form of a puzzle.

Eric is crazy busy at Coverity’s head office; this posting was pre-recorded.

First day

coverity-logoToday is my first day at Coverity! I am super-excited!

I have an enormous amount to learn about their systems. Since I have not so much as logged in to a Unix-based development environment since 1996, I’m going to be completely heads-down for the next couple of weeks. I am spending that time in San Francisco at the head office, drinking from the fire hose of static analysis knowledge. I will therefore have almost no time to spend on the blog. Expect things to be on auto-pilot for a little while. I’ve spent part of my time off queuing up articles, so it should not go entirely dark.

I’ll then spend the last week of January working remotely; the new office in the Columbia Tower in Seattle should be set up by the first week of February if all goes as planned. I am so looking forward to working downtown! I’ll post some photos when I get access to the new space.

Next time on FAIC: we’ll get back to the subject of how the Roslyn compiler optimizes lifted arithmetic.