Volatile and Lock in Multithreading

I've had a lot of trouble getting a clear grasp of what is considered to be thread-safe in .NET. There is the excellent guide titled Multi-threading in .NET by Jon Skeet, and of course there is also the friendly but vague documentation on MSDN, but actually understanding what these sources are trying to explain is still not an easy undertaking.

The big problem with not having an absolutely clear grasp of the thread safety concepts of your platform is that any errors you make likely won't become visible immediately. They will either only show up in random, almost impossible to track down situations or, worse yet, be something that the hardware of your development system happens to handle automatically but that makes other hardware fall flat on its face.

I'm amongst the people who learn better by example, and because I know I'm not alone, let me hand you some examples that highlight the little slips that can happen when you write multi-threaded applications. If you already know a good deal about multithreading, you can try reading only the title and the code sample and then think about why the sample is declared safe or not!

Bad

private int var;

void thread1() {
  while(var != 12345) { } // Loop
}
void thread2() {
  var = 12345;
}

Assume the thread1() method is called by one thread and, shortly after, the thread2() method is called by another thread.

This perfectly fine looking piece of code is not thread-safe. The JIT compiler might keep the value of var in a processor register for the loop in thread1() and never again look at what's in the actual memory cell of the variable. So thread1() will keep looping on and on while thread2() thinks it has stopped the loop.

Good

private volatile int var;

void thread1() {
  while(var != 12345) { } // Loop
}
void thread2() {
  var = 12345;
}

This piece of code is safe because the volatile modifier tells the JIT compiler that it mustn't cache the value of var in a processor register but must instead read or write the actual variable whenever it is accessed, so a change made by one thread becomes visible to the other.
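To try the volatile version out, here is a minimal, self-contained sketch. The class name VolatileFlagDemo, the method names, the field name flag and the timeouts are my own choices for illustration and not part of the samples above.

```csharp
using System;
using System.Threading;

class VolatileFlagDemo {
  // volatile: the loop in Waiter() has to re-read the field on every pass
  private volatile int flag;

  void Waiter() {
    while(flag != 12345) { } // spins until the write from Setter() is seen
  }

  void Setter() {
    flag = 12345;
  }

  // Returns true if the waiting thread noticed the change within 5 seconds
  public static bool Run() {
    var demo = new VolatileFlagDemo();
    var waiter = new Thread(demo.Waiter);
    waiter.IsBackground = true; // don't keep the process alive if it never stops
    waiter.Start();
    Thread.Sleep(50); // give the waiter time to enter its loop
    demo.Setter();
    return waiter.Join(5000);
  }

  static void Main() {
    Console.WriteLine(Run() ? "waiter stopped" : "waiter still spinning");
  }
}
```

If you remove the volatile modifier, the program may still appear to work on your machine, which is exactly the kind of hidden error described at the beginning.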

Now, because setting an integer value is an atomic operation, this code is safe. An atomic operation is an operation that is executed in a single step and no other thread, even on a multi-core or multi-cpu system, can intervene and access the variable in a half-updated state.

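This is possible because the CLI guarantees that reads and writes of properly aligned values no larger than the native word size (which is at least 32 bits) are atomic. Even between different CPUs with distinct first- and second-level caches, another thread will see either the old value or the new value, never a mix of both. Larger types such as long or double do not get this guarantee on 32-bit systems.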
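The atomicity guarantee only covers values up to the native word size. As a hedged sketch of what you could do for a 64-bit field on a 32-bit system, the Interlocked class offers atomic reads and writes for long values. The class name AtomicLongDemo and its members are made up for this illustration.

```csharp
using System;
using System.Threading;

class AtomicLongDemo {
  // 64 bits wide: a plain read or write is not guaranteed to be atomic
  // on a 32-bit system, so all access goes through Interlocked here
  private long total;

  public void Set(long value) {
    Interlocked.Exchange(ref total, value); // atomic 64-bit write
  }

  public long Get() {
    return Interlocked.Read(ref total); // atomic 64-bit read
  }

  static void Main() {
    var demo = new AtomicLongDemo();
    demo.Set(0x1234567890L); // does not fit into 32 bits
    Console.WriteLine(demo.Get() == 0x1234567890L); // prints "True"
  }
}
```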

Good

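private int var;
private object syncRoot = new object();

void thread1() {
  for(;;)
    lock(syncRoot) // lock needs a reference type, an int cannot be locked
      if(var == 12345)
        break;
}
void thread2() {
  lock(syncRoot)
    var = 12345;
}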

As you may know, lock is a simple synchronization measure that pauses any other threads trying to obtain a lock on the same object until the thread that is currently holding the lock releases it again.

So, with this understanding of the lock statement, the lock would appear quite useless. The field var is being changed in an atomic operation and thus doesn't require a lock. And we still have the processor register issues. Or have we?

We don't! Acquiring and releasing a lock also acts as a memory barrier. When a lock is acquired, all external variables inside the lock will be read from main memory the first time they are used. And on their final write before the lock is released again, they will be written back to main memory. That is why you can get away without the volatile modifier in such cases.
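A runnable sketch of the lock-based variant could look like this. The names LockedFlagDemo, syncRoot and flag as well as the timeouts are assumptions for the sake of the example. Note that the field is deliberately not volatile.

```csharp
using System;
using System.Threading;

class LockedFlagDemo {
  private readonly object syncRoot = new object(); // the object both threads lock on
  private int flag; // not volatile, only ever touched inside the lock

  void Waiter() {
    for(;;) {
      lock(syncRoot) {
        if(flag == 12345)
          break; // leaves the for loop, the lock is released on the way out
      }
    }
  }

  void Setter() {
    lock(syncRoot) {
      flag = 12345;
    }
  }

  // Returns true if the waiting thread noticed the change within 5 seconds
  public static bool Run() {
    var demo = new LockedFlagDemo();
    var waiter = new Thread(demo.Waiter);
    waiter.IsBackground = true;
    waiter.Start();
    Thread.Sleep(50); // give the waiter time to enter its loop
    demo.Setter();
    return waiter.Join(5000);
  }

  static void Main() {
    Console.WriteLine(Run() ? "waiter stopped" : "waiter still spinning");
  }
}
```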

Bad

private static MyClass instance;
private static object syncRoot = new object();

static MyClass getInstance() {
  if(instance == null)
    lock(syncRoot)
      if(instance == null)
        instance = new MyClass();
  return instance;
}

Now, what the heck is getInstance() doing there?

This is called the double-checked locking idiom. The idea is that if you only want to set a variable under specific circumstances (like when your singleton's instance variable is null), you can avoid the costly lock statement most of the time.

This code is not thread-safe because instance is being accessed outside of a lock and thus the processor register problems become an issue again. The JIT compiler might have cached the value of instance in a register, so the outer if of getInstance() bails out even though some other piece of code has set the instance variable back to null!

Good

private static volatile MyClass instance;
private static object syncRoot = new object();

static MyClass getInstance() {
  if(instance == null)
    lock(syncRoot)
      if(instance == null)
        instance = new MyClass();
  return instance;
}

Now all is right again: instance is volatile, so getInstance() will read the contents of the variable from main memory and notice when it is set back to null.
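To convince yourself that the double-checked version really constructs only one instance, you could run something like the following sketch. The Constructed counter, the SingletonDemo class, the Run() method and the number of threads are additions of mine for the demonstration. MyClass stands in for whatever your singleton is.

```csharp
using System;
using System.Threading;

class MyClass {
  public static int Constructed; // counts constructor calls for the demo
  public MyClass() { Interlocked.Increment(ref Constructed); }
}

class SingletonDemo {
  private static volatile MyClass instance;
  private static readonly object syncRoot = new object();

  public static MyClass getInstance() {
    if(instance == null) {        // cheap check without taking the lock
      lock(syncRoot) {
        if(instance == null)      // second check, now under the lock
          instance = new MyClass();
      }
    }
    return instance;
  }

  // Lets several threads ask for the singleton at once and
  // returns how often the constructor has run
  public static int Run() {
    var threads = new Thread[8];
    for(int i = 0; i < threads.Length; i++) {
      threads[i] = new Thread(() => getInstance());
      threads[i].Start();
    }
    foreach(var thread in threads)
      thread.Join();
    return MyClass.Constructed;
  }

  static void Main() {
    Console.WriteLine(Run()); // prints 1
  }
}
```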

But couldn't we remove the lock in its entirety, now that we have declared our instance variable volatile?

Let's see:

Bad

private static volatile MyClass instance;

static MyClass getInstance() {
  if(instance == null)
    instance = new MyClass();
  return instance;
}

This is not safe anymore. Being volatile, the actual contents of our instance will be read from memory each time, but volatile does absolutely nothing to prevent the race condition we've just constructed.

Race condition? Assume a thread on CPU #1 calls the getInstance() method and, reaching the if statement, sees that instance is null. At the same time, a thread running on CPU #2 also enters the getInstance() method and does the same check. Both see that instance is null and decide to assign a new instance to it. The threads on CPU #1 and CPU #2 have just constructed two instances of the singleton.

Such cases are termed race conditions because they put two threads in a race against each other. The outcome of the operation is determined by which thread gets done first. A more typical example would be the execution of some processing code in a background thread whilst the main thread just goes ahead and reads the results produced in the background thread. It is thereby hoping (but not knowing) that the background thread is already finished when it tries to access the results.
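The background thread example can be sketched as follows. Instead of hoping that the worker is done, the main thread waits for it with Thread.Join() before touching the result. The class BackgroundResultDemo, the value 42 and the Sleep() call that stands in for real processing are all invented for illustration.

```csharp
using System;
using System.Threading;

class BackgroundResultDemo {
  private int result; // written by the worker, read by the main thread

  void Worker() {
    Thread.Sleep(100); // stands in for some lengthy processing
    result = 42;
  }

  public static int Run() {
    var demo = new BackgroundResultDemo();
    var worker = new Thread(demo.Worker);
    worker.Start();

    // Reading demo.result at this point would be a race condition:
    // the worker may or may not have finished yet.

    worker.Join(); // wait until the worker has finished
    return demo.result;
  }

  static void Main() {
    Console.WriteLine(Run()); // prints 42
  }
}
```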