Monitor madness, part one

Locks are tricky; I thought today I’d talk a bit about some of the pitfalls of locking that you might not have seen before.

As you probably know, the lock statement in C# is a syntactic sugar for the use of a monitor, so I’ll use the terms “lock” and “monitor” somewhat interchangeably. A monitor is an oddly-named data structure that gives you two basic operations: “enter” the monitor and “exit” the monitor. The fabulous thing about a monitor is that only one thread can enter a given monitor at a time; if threads alpha and beta both attempt to enter a monitor then only one wins the race; the other blocks at least until the winner exits. After the winner exits the monitor, the loser gets another chance to attempt to acquire the monitor; of course, there might be another race.

I give an example of a very very simple implementation of a monitor here, though of course you would never do this yourself; you’d just use the built-in support in the framework and the C# language.

The lock(x) { body } statement in C# is a syntactic sugar for

bool lockWasTaken = false;
var temp = x;
try 
{ 
  Monitor.Enter(temp, ref lockWasTaken); 
  { 
      body 
  }
}
finally 
{ 
  if (lockWasTaken) 
    Monitor.Exit(temp); 
}

The details of how the operating system implements (or fails to implement!) a system whereby every thread gets access eventually in some sort of fair manner is beyond the scope of what I want to talk about today. Rather, I want to talk about an advanced use of monitors. In addition to the straightforward “enter, do some work, exit” mechanism, monitors also provide a mechanism for temporarily exiting a monitor in the middle of a lock! That is, for the workflow:

  • block until I enter the monitor
  • do some work
  • temporarily exit the monitor until something happens on another thread that I care about
  • when the thing I care about happens, block until the monitor can be entered again
  • do some work
  • exit the monitor

Under what circumstances would we want to do this? The classic example is that we have a not-threadsafe finite-size queue of jobs to perform, and two threads called the producer and the consumer. The consumer thread sits there in a loop attempting to remove items from the queue so that the job can be performed. The producer thread runs around looking for work to do and puts it on the queue. Our code must have the following characteristics:

  • if the consumer is attempting to modify the queue then an attempt by the producer to modify the queue must block
  • and similarly vice versa
  • if the producer is attempting to put work onto a full queue, it must block and allow the consumer to clear out space
  • if the consumer is attempting to take work off of an empty queue, it must block and allow the producer to find more work

To achieve these goals, a monitor provides three operations in addition to enter and exit:

  • “wait” causes the monitor to exit and puts the current thread to sleep. More specifically, it causes a thread to enter the “wait state”.
  • “notify” — which, oddly enough is called Pulse in .NET — allows the thread which is currently in the monitor to place a single waiting thread (of the runtime’s choice) in the “ready” state. A ready thread is still blocked, but it is marked as ready to enter the monitor when the monitor becomes available. (There is no guarantee that it will do so; again, it might be racing against other threads.)
  • “notify all” pulses every thread in the wait state that is waiting for a particular monitor.

Of course there are many other uses for these operations than producer-consumer queues, but as a canonical example we’ll stick with that for the purposes of this series.

What does the code typically look like? On the producer we’d have something like:

while(true)
{
  var someWork = FindSomeWork();
  lock(myLock)
  {
    while (myQueue.IsFull)
    { 
      Monitor.Wait(myLock); 
      // We cannot do any work while the queue is full. 
      // Put the producer thread into the wait state.
      // When someone pulses us, try to acquire the lock again,
      // and then check again to see if the queue is full.
    }
    // If we got here then we have acquired the lock,
    // and the queue has room.
    myQueue.Enqueue(someWork);
    // The queue might have been empty, and the consumer might be
    // waiting. If so, put the consumer thread in the ready state.
    Monitor.Pulse(myLock); 
    // The consumer thread, if it was asleep, is now ready, but we
    // still own the lock. Let's leave the lock and give it a chance
    // to run. It might lose the race of course.
  }
}

And on the consumer side we’d see something unsurprisingly similar. I won’t labour the point with excessive comments here.

while(true) 
{
  Work someWork = null;
  lock(myLock)
  {
    while (myQueue.IsEmpty)
      Monitor.Wait(myLock);
    someWork = myQueue.Dequeue();
    Monitor.Pulse(myLock);
  }
  DoWork(someWork);
}

OK, I hope that is all very clear. This was a long introduction to what is a surprisingly not-straightforward question: why did I write a loop around each wait? Wouldn’t the code be just as correct if written

  lock(myLock)
  {
    if (myQueue.IsEmpty) // no loop!
      Monitor.Wait(myLock);
    ...

Next time on FAIC: Yeah, what’s up with that?

26 thoughts on “Monitor madness, part one

  1. My guess is: if you have two waiting consumers, the reason one of them might be woken is that the other has grabbed the only Work in the Queue and called Pulse on the monitor on its way out. So the Queue is now empty, which you will only find out by going through the loop.

    • The article says that there are two threads, so I don’t think that’s it.

      Also, if multiple producers or consumers were allowed, I don’t think this code would work: there is no guarantee that Pulse() wakes the right kind of thread (you would need PulseAll()).

        • (I’ll use ‘Q’ as syntactic sugar for Queue.)

          If there are multiple producers then

          while(myQ.IsFull) Monitor.Wait(myLock);
          myQ.Enqueue(someWork);

          may still fail as another thread may write to the queue between the IsFull check and the Enqueue call.

          I would say that for any multithreaded code, even example code showing another point, try-until-success would be a safer approach than check-try_assuming_success method. Would you agree?

  2. I’ve looked up the documentation on Monitor.Wait, and it seems to return a boolean. The boolean is only true when the lock is reacquired, and wait returns. Will the boolean ever be false? And if not, why return a boolean in the first place?

  3. My guess is something else outside our code is calling Monitor.Pulse. Since this is a static method, anything could be calling it and affecting our lock.

    Alternative theory: that the lock might have a timeout, in which case you want it to re-enter the wait state when the timeout elapses. A defense against deadlocks?

  4. Could you elaborate why this code (uiding wait and pulse) is more efficient than just one while loop that uses lock and and if statement?

    while {
    lock {
    if {
    // do something
    }
    }
    }

    // Ryan

    • Imagine each thread was a person. The technique that uses wait means “I see our shared queue is empty; wake me up when you put something in it.” Your suggestion means “Hey, is there something in the queue? No? How about now? How about now? How about now? How about now?” Also think about the difference in the amount of time the lock is contended. Suppose the queue is empty, the producer finds work, and the producer tries to take the lock to put something in the queue. How long does the producer have to wait in a world where the consumer is *constantly* taking out the lock? Potentially a long time; threads aren’t guaranteed to be fair. The mechanism that uses waits and pulses ensures that the lock is taken for only the amount of time that the queue methods are being accessed.

      • Thanks, you have confirmed my thoughts about this, but wasn’t sure. Usually I would use manualresetevents for these kinds of problems. Guess wait/pulse is even more efficient than using an manualresetevent?

        // Ryan

  5. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1970

  6. Pingback: Dew Drop – November 17, 2015 (#2134) | Morning Dew

  7. Monitor.Wait and Monitor.Pulse is difficult to use correctly, as I suspect we’ll find out exactly how in part 2. AutoResetEvent/ManualResetEvent is a lot easier to use and can be used with WaitHandle.WaitAll/WaitAny.

  8. About 15 years ago I implemented Monitors in C++ based on the Java API so that we could port some code from Java to C++. It’s been too long now, but I do recall some subtleties! Looking forward to part two.

  9. You use a loop because otherwise if the consumer starts up prior to the producer, it’ll wait for work once, reacquire the uncontended lock and then exit.

  10. Probably when you “pulse” the waiting thread is goes to “ready to run” state but not necessary immediately run, e.g. OS scheduler could decide to not do it right at this moment so the orginal thread that did “pulse” could possibly re-aquire lock again and modify the queue so another “waiting” one when continue execution would find the queue non-ready

  11. Pingback: Monitor madness, part two | Fabulous adventures in coding

  12. Pingback: Visual Studio – Developer Top Ten for Nov 27th, 2015 - Dmitry Lyalin

  13. Pingback: LIKELY.CLUB » .NET дайджст #7: ASP.NET 5 CLI, Slice, .NET для стартапов, TypeScript 1.7, NDC {London}

  14. Hello Eric!
    I’m working on a sophisticated multi threading batch processing algorithm. It has multiple threads, let’s say ten and if there are items to process each thread just grabs a next item and process it and continue doing this in a loop. But once some thread encounters there are no more items at the time, it notifies all other threads to not continuously ping the source, and starts exclusively pinging the source once in a few seconds until something new appears. And when it gets a new item, it tells other threads also to make a try.
    I researched many possibilities and considered Monitor.Pulse/Wait is something very similar to what I need, but I was not able to catch how they exactly behave. I was so pleased that google brought me in to your old good blog in the century of these social networks. I knew that is the place where I will find the complete explanation and I was right!

    But I always had another question about Monitor, which I wanted to ask to someone who was close to the .NET architecture design. Maybe you can answer it or redirect it to a right person. Why Monitor is a static class and lock() can be applied to any object?
    Why we write
    private object _syncRoot = new object();
    lock(_syncRoot) {…}
    Monitor.Enter(_syncRoot);
    but not
    private Monitor _myMonitor = new Monitor();
    lock(_myMonitor) {…}
    _myMonitor.Enter();
    ?
    However I can guess the answer. I suppose it is “because Java did it this way with it’s synchronized() block and we did the same thing after them”. But then my question splits into two parts: why .NET considered Java approach worth reproduction, and why Java did it this way? If Java did it after some another language, please continue the chain of questions. 🙂

    I hope you read the comments to your old posts and if you have an answer and it is not quite short, you can white a new article instead of replying here. I’ve been reading your blog via RSS even long before it was moved from MSDN.

    • That’s a great question and I wish I knew the answer to it. I have often wondered the same thing. Every single object of reference type created by the CLR has a sync block in its header, and 99.9999% of them are never used; it is very wasteful! I do not understand why this decision was made. If we instead made Monitor an instance class, and you can only lock on an instance of Monitor, that would force people into the good practice of locking only on private objects that are purpose-built for locking. It would make a lot of reasoning about threading easier in fact.

      As you note, the CLR was designed so that it could efficiently implement a language arbitrarily similar to Java, so that was certainly a factor. But that has just begged the question: why did Java decide that all objects could be used as synchronization objects? It seems equally bad in Java!

      There are a bunch of decisions that were made early on in the history of C# and the CLR that I find questionable now. Every object can describe its own type — that seems reasonable. Every object can be compared for equality to every other object … ok, I guess I can see that. But every object can be put into a hash table? Every object can be converted to string? These seem very questionable to me; were I designing this type system today I would probably make those interfaces.

Leave a Reply to CarlD Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s