The no-lock deadlock

Posted on January 31, 2013 by ericlippert

People sometimes ask me if there is a cheap-and-easy way to guarantee thread safety. For example, “if my method only reads and writes local variables and parameters, can I guarantee that my method is threadsafe?” Questions like that are dangerous because they are predicated on an incorrect assumption: that if every method of a program is “threadsafe”, whatever that means, then the entire program is “threadsafe”. I might not be entirely clear on what “threadsafe” means, but I do know one thing about it: thread safety is a property of entire programs, not of individual methods.

To illustrate why these sorts of questions are non-starters, today I present to you the world’s simplest deadlocking C# program:

class C
{
  static C() 
  {
    // Let's run the initialization on another thread!
    var thread = new System.Threading.Thread(Initialize);
    thread.Start();
    thread.Join();
  }
  static void Initialize() { }
  static void Main() { }
}

(Thanks to my Roslyn compiler team colleague Neal Gafter for this example, which was adapted from his book Java Puzzlers.)

At first glance, clearly every method of this incredibly simple program is “threadsafe”. There is only a single variable anywhere in the program; it is local, is written once, is written before it is read, is read from the same thread it was written on, and is guaranteed to be atomic. There are apparently no locks anywhere in the program, and so there are no lock ordering inversions. Two of the three methods are empty. And yet this program deadlocks with 100% certainty; the program “globally” is clearly not threadsafe, despite all those nice “local” properties. You can build a hollow house out of solid bricks; so too you can build a deadlocking program out of threadsafe methods.

The reason why this deadlocks is a consequence of the rules for static constructors in C#; the important rule is that a static constructor runs exactly zero or one times, and runs before the first static method call or instance creation of its type. Therefore the static constructor of C must run to completion before Main starts. The CLR marks C’s static constructor as “in flight” on the main thread and calls it. The static constructor then starts up a new thread. When that thread starts, the CLR sees that a static method is about to be called on a type whose static constructor is “in flight” on another thread. It immediately blocks the new thread so that the Initialize method will not start until the main thread finishes running the class constructor. The main thread blocks itself waiting for the new thread to complete, and now we have two threads each waiting for the other to complete.
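One way to watch the two waits described above without hanging the process is to bound the Join. The following is only a sketch under assumptions: the one-second timeout and the WorkerFinishedInTime field are additions for illustration and are not part of the original program.

```csharp
using System;
using System.Threading;

class C
{
    // Hypothetical field added for this sketch: records whether the worker
    // managed to finish while the static constructor was still running.
    public static readonly bool WorkerFinishedInTime;

    static C()
    {
        var thread = new Thread(Initialize);
        thread.Start();
        // Assumption: wait at most one second instead of forever.
        // Join(TimeSpan) returns false if the thread has not terminated in time.
        WorkerFinishedInTime = thread.Join(TimeSpan.FromSeconds(1));
    }

    // The worker cannot enter this method until the static constructor completes.
    static void Initialize() { }

    static void Main()
    {
        Console.WriteLine(WorkerFinishedInTime);
    }
}
```

On a runtime that behaves as described above, the bounded Join should time out, the static constructor should then complete, and only after that should the worker be allowed into Initialize. The program would therefore be expected to print False rather than hang.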

Next time on FAIC: We’re opening up the new Coverity office in Seattle! After which, we’ll take a closer look at the uses and abuses of the static constructor.

45 thoughts on “The no-lock deadlock”

Patrick Simpson on January 31, 2013 at 7:03 am said:

I think your example is not the greatest. There is clearly a lock somewhere in the program: in the runtime. So not all methods are threadsafe, just the ones in the visible part of the program. Though I guess that’s the point you’re making 😉

- teta on February 1, 2013 at 4:57 am said:
  
  The example is excellent. It means that though there is no lock anywhere in the code you can get your code out of sync (Initialize will finish before Join is called).
  
Mormegil on January 31, 2013 at 7:06 am said:

Mmmm… I don’t like the example very much. While the main point of the article is obviously true, the example does not explain that really well, IMHO. (As much as it is _interesting_ and surprising!) The deadlock here is a result of outside-of-the-program behavior (sure, defined behavior of CLR is obviously “a part of the program”, but… you know what I mean). It’s like saying something like
Main() { Thread.Sleep(Timeout.Infinite); }
“magically” deadlocks, even though the method is “obviously thread-safe” (_no_ variables at all). While (in some sense) true, it (IMHO) does not point to the essence of the problem. Or something.

- Daniel Miranda on January 31, 2013 at 7:11 am said:
  
  A deadlock involves two tasks waiting for each other, which isn’t the case in your program. It hangs, but it’s not deadlocked per se.
  
  - John Payson on January 31, 2013 at 7:38 am said:
    
    The program is deadlocked, since there is a newly created thread which is waiting for the static constructor to finish before it can proceed, while the thread with the static constructor is blocked waiting for the new thread to finish. That having been said, I would regard the “thread.Join()” as being a rather overt blocking statement; a more interesting example might have been to use two static classes, since such a program could deadlock without any visible blocking statements.
    
    Eric Lippert: Out of curiosity, since the locking behavior for static class initialization is not needed once the static constructor has finished, do you know if the vtable gets patched at that point to bypass any blocking primitives?
    
    - Eric Lippert on January 31, 2013 at 7:46 am said:
      
      I am not an expert on the inner workings of the CLR, but I don’t think you’d have to patch the vtable. If a vtable exists and can be used then an instance must exist, and if an instance exists then the static initializer has already been executed.
      
    - Timothy Fries on January 31, 2013 at 3:55 pm said:
      
      With relaxed type construction (with beforefieldinit), the JIT will ensure the static constructor is called before generating any native code using the type, so there’s never any checks pertaining to the static constructor in the generated code.
      
      With strict construction, the JIT checks to see if the static constructor has already been called, and will emit code to do the check immediately before the type is accessed. If the static constructor has already been called when a method is being JITted, it just emits code without the check. Unfortunately, because .NET doesn’t re-JIT methods, that means that the first method JITted that accesses a type with a static constructor will always have the overhead of performing the check. The overhead isn’t that bad though, since I believe the CLR uses a check-lock-recheck pattern, so once the constructor has been called and once the caches are coherent, there’s never a possibility of blocking again.
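      A check-lock-recheck pattern of the kind mentioned above can be sketched in user code as follows. This is only an illustration of the general pattern, not the CLR's actual implementation; OnceGuard and OnceGuardDemo are hypothetical names invented for the sketch.

      ```csharp
      using System;
      using System.Threading;

      // Hypothetical helper, not a CLR type: runs an initializer at most once.
      sealed class OnceGuard
      {
          private readonly object gate = new object();
          private readonly Action initializer;
          private volatile bool done;

          public OnceGuard(Action initializer)
          {
              this.initializer = initializer;
          }

          public void EnsureInitialized()
          {
              if (done) return;       // check: fast path, no lock taken
              lock (gate)             // lock: only reached until initialization is published
              {
                  if (done) return;   // recheck: another thread may have finished first
                  initializer();
                  done = true;        // publish only after the initializer has completed
              }
          }
      }

      static class OnceGuardDemo
      {
          // Starts several threads that all race to initialize; returns how
          // many times the initializer actually ran.
          public static int Run()
          {
              int runs = 0;
              var guard = new OnceGuard(() => Interlocked.Increment(ref runs));
              var threads = new Thread[4];
              for (int i = 0; i < threads.Length; i++)
              {
                  threads[i] = new Thread(guard.EnsureInitialized);
                  threads[i].Start();
              }
              foreach (var t in threads) t.Join();
              return runs;
          }
      }
      ```

      Calling OnceGuardDemo.Run() returns 1: whichever thread takes the lock first runs the initializer, and every later caller returns at either the first or the second check.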
      
    - Brian on February 1, 2013 at 7:50 am said:
      
      The two static class situation won’t normally deadlock. Jon Skeet discusses this at http://msmvps.com/blogs/jon_skeet/archive/2012/04/07/type-initializer-circular-dependencies.aspx :
      “If the CLI notices that type A needs to be initialized in order to make progress, but it’s already in the process of initializing type A in the same thread, it continues as if the type were already initialized.”
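      The same-thread rule quoted above can be seen in a small sketch. The classes A and B and the CircularDemo entry point are hypothetical names chosen for illustration; the explicit static constructors are there so that initialization is triggered at first access.

      ```csharp
      using System;

      class A
      {
          public static readonly int X;
          // Reading B.Y here triggers B's static constructor on this same thread.
          static A() { X = B.Y + 1; }
      }

      class B
      {
          public static readonly int Y;
          // A's static constructor is already in flight on this thread, so the
          // read of A.X proceeds and observes the default value 0.
          static B() { Y = A.X + 1; }
      }

      static class CircularDemo
      {
          static void Main()
          {
              Console.WriteLine(A.X); // A is touched first
              Console.WriteLine(B.Y);
          }
      }
      ```

      Because A is touched first, B's constructor sees A.X as 0 and sets Y to 1, after which A's constructor sets X to 2. Nothing blocks, but one of the two types observes the other in a half-initialized state.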
      
  - Eric Lippert on January 31, 2013 at 2:36 pm said:
    
    Of course there are two tasks. The first task is to run the static constructor, and the second task is to run Initialize. Those two tasks wait on each other.
    
Chris B on January 31, 2013 at 7:29 am said:

The MSDN documentation probably isn’t helping much with such confusion. The doc pages for a large number of classes state “Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe” in the “Thread Safety” section. Corollary to what you said, thread safety is not a property of types.

For the life of me, I can’t discern what is being communicated there. I actually took that from the documentation page for System.Random, and static instances of those are definitely not safe to share across threads.

- W on January 31, 2013 at 8:31 am said:
  
  There is no such thing as “static instance”. You can only have a static field pointing to an instance.
  
  `System.Random` has no static members(unless you count constructors or those inherited from `System.Object`), so there is nothing thread-safe in that class.
  
- Damien on January 31, 2013 at 10:47 pm said:
  
  Many people find that boilerplate text confusing. They *think* it’s a comment about members that *they* define that are “of that type”, whereas I believe that it’s a statement about the members *belonging* to the type. I.e. It’s saying that (if it has any) any static members belonging to System.Random may be safely accessed from multiple threads. It’s saying nothing about any field or property that you create inside your own classes.
  
  It all boils down to “of this type” being interpretable in two different manners.
  
  - Chris B on February 1, 2013 at 6:55 am said:
    
    W’s comment made me realize exactly what you said. It’s not particularly clear what “public static members of this type” means. That is further exacerbated by applying that text to types which define no static members outside of those inherited from base types. I’m also not sure why the text appears in the documentation for the type instead of the documentation for the member. It further seems that the statement can only apply to the member access itself, but say nothing of the returned result of such an access. I could see that being a further source of confusion.
    
    - Jesse C. Slicer on February 1, 2013 at 8:16 am said:
      
      Yes, instead of “public static members of this type”, they should have worded it as “public static members on this type”. One letter makes a world of difference!
      
Chris B on January 31, 2013 at 7:34 am said:

Metacomment, it looks like CSS received a minor hosing. The comments are bleeding over into the sidebar. A brief inspection in chrome shows that the commentlist class has a width of 120%. Reducing it to 100 seems to correct it.

- Eric Lippert on January 31, 2013 at 7:43 am said:
  
  I know; people were complaining that the comments were too narrow when set to 100%. I haven’t figured out all the CSS tricks yet to make them look good with this theme. I am a WordPress newb.
  
Adam on January 31, 2013 at 10:04 am said:

It is a bit disingenuous to call this a “no-lock” deadlock. The deadlock is due to the run-time’s type initializer lock. Which is a lock, albeit an implicit one. By definition, one cannot have a deadlock without a blocking operation involving multiple threads, after all.

- Eric Lippert on January 31, 2013 at 2:39 pm said:
  
  Sure, it *is* a lock, but why does it *need* to be a lock? That’s an implementation detail. One could design a runtime in which a thread that was blocked waiting for a static constructor was then scheduled to service other work items until the static constructor completed asynchronously.
  
  Moreover, the notion that you need to have two threads “by definition” in order to have a deadlock seems unwarranted as well. Suppose you were given two tasks by your manager, and each task depended on the successful completion of the other, and your remaining work depended upon the completion of both. You’re telling me that you’re not deadlocked, just because there isn’t a second employee whose work you’re depending on?
  
  Many people think of threads as units of work, but they are not. Threads are *workers*. Most of the problems you see in multithreaded systems have analogous problems in single-threaded systems. People are just not yet in the habit of mentally separating workers from work.
  
  - John Payson on January 31, 2013 at 4:38 pm said:
    
    I think the reason “deadlock” is often taken to imply the existence of two workers is that it’s usually restricted to cases where someone is waiting for something that *they reasonably expect will happen*, but which can’t happen while they’re waiting for it. One could have a deadlock with a single worker, but only if the worker in question either “believed” in the existence of other independent workers or entities that could cause the awaited condition to occur, or else kept bouncing from task to task without noticing that nothing was actually getting accomplished.
    
Neal Gafter on January 31, 2013 at 2:42 pm said:

Although you have not presented it as a puzzle, I approve of this adaptation of Puzzle 85 (“Lazy Initialization”) from the book Java Puzzlers.

- Eric Lippert on January 31, 2013 at 2:46 pm said:
  
  Indeed, I intended to credit you and it slipped my mind while I was writing this. Apologies!
  
Joel Coehoorn on January 31, 2013 at 7:12 pm said:

Haven’t been here in a few days, but the text is *black*! Why is the text black? Where is the purple? I don’t know who I am anymore!

- Eric Lippert on February 1, 2013 at 7:08 am said:
  
  It’s purple on my machine. What browser are you using?
  
  - Douglas on February 4, 2013 at 7:38 am said:
    
    Black for me on Google Chrome Version 24.0.1312.57 m
    
    - Chris B on February 4, 2013 at 8:43 am said:
      
      Odd…I am on the same version of Chrome, and I’ve got purple….
      
      Perhaps you have a stale copy of the CSS in your cache?
      
      - Douglas on February 4, 2013 at 11:40 am said:
        
        Curious. I took a screenshot of the page and pasted into Paint.NET. If I crop out just the text (text on white, no grays), I can clearly see that the text is purple. But within the context of the whole page, I cannot.
        
        Maybe I have some kind of color blindness?
      - Douglas on February 4, 2013 at 11:50 am said:
        
        …It’s not just context.
        
        If I take a screenshot, I can clearly see purple when that screenshot is displayed in Paint.NET, or I save it as an image and view it in Windows image viewer.
        
        But displayed in Chrome, I cannot see purple, only black.
        
        This is not possible. The pixels have the same values!
        
        Just weird.
Jeroen Frijters on February 1, 2013 at 2:36 am said:

Your description of the static constructor rules is not entirely complete. There is an additional rule that says that static constructors by themselves never dead lock, so it is in fact possible to run a static method before the static constructor has run to completion.
This is in contrast with Java, where they can dead lock.

- Eric Lippert on February 1, 2013 at 7:07 am said:
  
  Indeed, I’ll be exploring these subtleties in an upcoming series.
  
defaultex on February 1, 2013 at 11:50 pm said:

Shame I didn’t take many notes while developing my parallel game engine. Ran with a “lock-less” approach to maximize concurrency. Had so many deadlocks occur at various points of development that never really made any sense even after digging into the IL and reflected code from .Net.

- Luiz Monad on July 8, 2017 at 4:20 pm said:
  
  You should go state-less to be lock-less: you don’t need to lock constants, and immutable state is basically data-race free. There’s an entire academic discipline dedicated to creating programs that way, and there are even languages for that, like F#. Also, if everything is constant and can only be changed by returning new values, you can get more concurrency.
  
Joker_vD on February 2, 2013 at 7:42 am said:

Hmm, I (personally) prefer to use “thread-safe code” for “whatever threading shenanigans happen, this code produces consistent result”. So this program *is* threadsafe under this definition: it always deadlocks. Obviously, it’s not what it was *supposed* to do, but that’s another topic.

Tim on February 3, 2013 at 7:51 am said:

Like a lot of the other comments, I don’t like your example. But I’m going to go a step further, and say your whole idea is wrong.

First of all, a static constructor is a static constructor, not a method. It has special rules that don’t apply to methods.

Second, the thread safety of a method depends on the thread safety of the methods it calls. Creating, starting and joining a thread are all obvious places where threading errors could happen. “Join” in particular puts the current thread to sleep until some other thread finishes, and should be treated with almost as much caution as lock.

I think the rule that you attempt to disprove is actually true. If your method reads and writes only local variables, and only calls other methods that follow these rules, then it is thread safe.

But thread.join() reads global state, the state of another thread.

But you could also deadlock or livelock yourself in a single threaded application. Imagine a program that reads a byte from a file, and if that byte isn’t what it’s expecting, rereads that byte until it is.

But the contents of that file count as a global variable, and so the method I described would not fit into the “automatically thread safe” category.

Ivaylo Bratoev on February 6, 2013 at 7:40 am said:

Not much of a bragging point but I can do it with 1 line:
Thread.CurrentThread.Join();

Michael Hutchinson on February 6, 2013 at 11:20 am said:

Here’s another way to cause the same problem that’s less obvious, and quite possible to do accidentally:

class MainClass {
  public static void Main () {}
  static MainClass () {
    System.Threading.Tasks.Task.Factory.StartNew (() => 0).ContinueWith (t => t.Result).Wait ();
  }
}

RobertH on February 7, 2013 at 10:13 pm said:

Minor nit, if you want to bother:

“At first glance clearly *every* method of this incredibly simple program…”

Aia Patag on June 7, 2013 at 12:14 am said:

Aha! 🙂 Very informative. I like the example.
