Real world async/await defects

Today I’m once again asking for your help.

The headliner feature for C# 5 was of course the await operator for asynchronous methods. The intention of the feature was to make it easier to write asynchronous programs where the program text is not a mass of inside-out code that emphasizes the mechanisms of asynchrony, but rather straightforward, easy-to-read code; the compiler can deal with the mess of turning the code into a form of continuation passing style.

However, asynchrony is still hard to understand, and making methods that allow other code to run before their postconditions are met makes for all manner of interesting possible bugs. There are a lot of great articles out there giving good advice on how to avoid some of the pitfalls, such as:

These are all great advice, but it is not always clear which of these potential defects are merely theoretical, and which you have seen and fixed in actual production code. That’s what I am very interested to learn from you all: what mistakes were made by real people, how were they discovered, and what was the fix?

Here’s an example of what I mean. This defect was found in real-world code; obviously the extraneous details have been removed:

Frob GetFrob()
{
  Frob result = null;
  var networkDevice = new NetworkDevice();
  networkDevice.OnDownload += 
    async (state) => { result = await Frob.DeserializeStateAsync(state); };
  networkDevice.GetSerializedState(); // Synchronous
  return result; 
}

The network device synchronously downloads the serialized state of a Frob. When it is done, the delegate stored in OnDownload runs synchronously, and is passed the state that was just downloaded. But since it is itself asynchronous, the event handler starts deserializing the state asynchronously, and returns immediately. We now have a race between GetFrob returning null, and the mutation of closed-over local result, a race almost certainly won by returning null.

If you’d rather not leave comments here — and frankly, the comment system isn’t much good for code snippets anyways — feel free to email me at eric@lippert.com. If I get some good examples I’ll follow up with a post describing the defects.

About these ads

ATBG: Reordering optimizations

Last time on the Coverity Development Testing Blog’s continuing series Ask The Bug Guys I discussed whether it was a good idea to remove a lock which protects an integer field. My conclusion was that it is not, because the lock prevents many potentially confusing optimizations. This week I follow up on that episode with an example where eliding locks on volatile reads and writes permits a surprising result.


As always, if you have questions about a bug you’ve found in a C, C++, C# or Java program that you think would make a good episode of ATBG, please send your question along with a small reproducer of the problem to TheBugGuys@Coverity.com. We cannot promise to answer every question or solve every problem, but we’ll take a selection of the best questions that we can answer and address them on the dev testing blog every couple of weeks.

ATBG: Can I skip the lock when reading an integer?

Today on the Coverity Development Testing Blog‘s continuing series Ask The Bug Guys, I answer a question that made its way to me from a Coverity customer: is it a good idea to remove a lock which only protects the read of an integer field? After all, that read is guaranteed to be atomic, so it seems safe. As is usually the case, the situation is unexpectedly complicated!  (Wait… is “usually unexpected” an oxymoron?)

Continue reading

ATBG: How do exceptions interact with the “using” statement?

Today on the Coverity Development Testing Blog‘s continuing series Ask The Bug Guys, I answer a question I get quite frequently: what guarantees do we have about object disposal when the body of a using block is interrupted by an exception? The situation is rather complicated, it turns out.

As always, if you have questions about a bug you’ve found in a C, C++, C# or Java program that you think would make a good episode of ATBG, please send your question along with a small reproducer of the problem to TheBugGuys@Coverity.com. We cannot promise to answer every question or solve every problem, but we’ll take a selection of the best questions that we can answer and address them on the dev testing blog every couple of weeks.

Construction destruction

Take a look at this little program skeleton:

class Foo 
{ 
    private int x;
    private int y;
    public Foo(int x, int y) 
    {
        this.x = x;
        this.y = y;
        SideEffects.Alpha(); // Notice: does not use "this"
    }
    ~Foo() 
    { 
        SideEffects.Charlie(); 
    }
}
static class SideEffects
{
    public static void Alpha() { ... }
    public static void Bravo() { ... }
    public static void Charlie() { ... }
    public static void M()
    {
        Foo foo = new Foo(1, 2); 
        Bravo();
    }
}

Let’s suppose we have three side effects: Alpha, Bravo and Charlie. What precisely they are is not important. The question is: what do we know about the order in which Alpha, Bravo and Charlie execute when a call to M() occurs? Continue reading

The no-lock deadlock

People sometimes ask me if there is a cheap-and-easy way to guarantee thread safety. For example, “if my method only reads and writes local variables and parameters, can I guarantee that my method is threadsafe?” Questions like that are dangerous because they are predicated on an incorrect assumption: that if every method of a program is “threadsafe”, whatever that means, then the entire program is “threadsafe”. I might not be entirely clear on what “threadsafe” means, but I do know one thing about it: thread safety is a property of entire programs, not of individual methods.

To illustrate why these sorts of questions are non-starters, today I present to you the world’s simplest deadlocking C# program:[1. Thanks to my erstwhile colleague Neal Gafter for this example.]

class C
{
  static C() 
  {
    // Let's run the initialization on another thread!
    var thread = new System.Threading.Thread(Initialize);
    thread.Start();
    thread.Join();
  }
  static void Initialize() { }
  static void Main() { }
}

At first glance clearly ever method of this incredibly simple program is “threadsafe”. There is only a single variable anywhere in the program; it is local, is written once, is written before it is read, is read from the same thread it was written on, and is guaranteed to be atomic. There are apparently no locks anywhere in the program, and so there are no lock ordering inversions. Two of the three methods are empty. And yet this program deadlocks with 100% certainty; the program “globally” is clearly not threadsafe, despite all those nice “local” properties. You can build a hollow house out of solid bricks; so too you can build a deadlocking program out of threadsafe methods.

The reason why this deadlocks is a consequence of the rules for static constructors in C#; the important rule is that a static constructor runs exactly zero or one times, and runs before a static method call or instance creation in its type. Therefore the static constructor of C must run to completion before Main starts. The CLR notes that C‘s static constructor is “in flight” on the main thread and calls it. The static constructor then starts up a new thread. When that thread starts, the CLR sees that a static method is about to be called on a type whose static constructor is “in flight” another thread. It immediately blocks the new thread so that the Initialize method will not start until the main thread finishes running the class constructor. The main thread blocks itself waiting for the new thread to complete, and now we have two threads each waiting for the other to complete.


Next time on FAIC: We’re opening up the new Coverity office in Seattle! After which, we’ll take a closer look at the uses and abuses of the static constructor.