I've had a lot of trouble getting a clear grasp of what is considered thread-safe in .NET. There is the excellent guide titled Multi-threading in .NET by Jon Skeet, and of course there is also the friendly but vague documentation on MSDN, but actually understanding what these sources are trying to explain is still not an easy undertaking.
The big problem with not having an absolutely clear grasp of the thread-safety concepts of your platform is that any errors you make likely won't become visible immediately. They will either show up only in random, almost impossible to track down situations or, worse yet, be something that the hardware of your development system happens to handle automatically but that makes other hardware fall flat on its face.
I'm amongst the people who learn better by example, and because I know I'm not alone, let me hand you some examples that highlight the little slips that can happen when you write multi-threaded applications. If you already know a good deal about multithreading, you can try reading only the title and the code sample and then think about why the sample is declared safe or not!
Bad
int var; // plain field, not volatile
void thread1() {
  while(var != 12345) { } // Loop until thread2() has set the value
}
void thread2() {
  var = 12345;
}
Assume the thread1() method is called by one thread
and, shortly after, the thread2() method is called by
another thread.
This perfectly fine looking piece of code is not thread-safe. The .NET JIT compiler
might store the value of var in a processor register for
the loop in thread1() and never again look at what's in
the actual memory cell of the variable. So thread1() will keep
looping on and on while thread2() thinks it has stopped
the loop.
Good
volatile int var; // volatile: every access goes to memory
void thread1() {
  while(var != 12345) { } // Loop until thread2() has set the value
}
void thread2() {
  var = 12345;
}
This piece of code is safe because the volatile modifier on
var tells the .NET JIT compiler that it mustn't cache the value of
var in a processor register but instead read/write
the actual value from/to main memory whenever it is accessed.
Now, because setting an integer value is also an atomic operation, no further synchronization is needed here. An atomic operation is an operation that is executed in a single step, so no other thread, even on a multi-core or multi-CPU system, can intervene and access the variable in a half-updated state.
This is possible because the memory bus operates on at least 32 bits at a time, ensuring that even between different CPUs with distinct first- and second-level caches, a properly aligned 32-bit value is updated in one all-exclusive step. The guarantee does not extend to every type: the C# specification makes reads and writes of int, bool, char, float and references atomic, but not those of long, double or decimal.
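To try the volatile variant on your own machine, here is a minimal sketch wrapped into a runnable class. The class name VolatileFlagDemo, the method Run and the field name sharedValue (standing in for var) are made up for this illustration, and the timeout is only there so that the demo cannot hang forever.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; names are illustrative and not part of the article's samples.
public static class VolatileFlagDemo
{
    // volatile: the loop below must re-read the field on every iteration.
    private static volatile int sharedValue;

    // Starts a thread that spins until sharedValue == 12345, then sets the
    // value from the calling thread. Returns true if the spinning thread
    // noticed the change within the given timeout.
    public static bool Run(int timeoutMilliseconds)
    {
        sharedValue = 0;

        Thread waiter = new Thread(() =>
        {
            while (sharedValue != 12345) { } // Loop, same as thread1()
        });
        waiter.IsBackground = true; // don't keep the process alive if the loop never ends
        waiter.Start();

        sharedValue = 12345; // same as thread2(): one atomic 32-bit write

        return waiter.Join(timeoutMilliseconds);
    }
}
```

Calling VolatileFlagDemo.Run(5000) should return true. Removing the volatile modifier may or may not make the loop hang, depending on the JIT compiler, build configuration and hardware, which is exactly what makes this kind of bug so hard to track down.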
Good
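object syncRoot = new object(); // lock needs a reference type, an int won't compile
int var;
void thread1() {
  for(;;)
    lock(syncRoot)
      if(var == 12345)
        break;
}
void thread2() {
  lock(syncRoot)
    var = 12345;
}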
As you may know, lock is a simple synchronization
measure that pauses any other threads trying to obtain a lock on the
same object until the thread that is currently holding the lock
releases it again.
So, with this understanding of the lock statement,
the lock would appear quite useless. The field var is
being changed in an atomic operation and thus doesn't require a lock.
And we still have the processor register issues. Or have we?
We don't! Acquiring and releasing a lock also acts as a memory barrier. When a lock is acquired, all external variables used inside the lock will be read from main memory the first time they are used. And on their final write before the lock is released again, they will be written back to main memory. That is why you can get away without the volatile modifier in such cases, provided every access to the variable happens inside a lock on the same object.
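The lock-based variant can be sketched as a runnable class in the same way. LockDemo, Run and sharedValue are names invented for this illustration, and the lock is taken on a dedicated syncRoot object because the lock statement only accepts reference types.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; names are illustrative and not part of the article's samples.
public static class LockDemo
{
    private static readonly object syncRoot = new object();
    private static int sharedValue; // deliberately NOT volatile

    // Returns true if the polling thread saw the new value within the timeout.
    public static bool Run(int timeoutMilliseconds)
    {
        lock (syncRoot) { sharedValue = 0; }

        Thread waiter = new Thread(() =>
        {
            for (;;)
            {
                lock (syncRoot) // acquiring the lock forces a fresh read
                {
                    if (sharedValue == 12345)
                        break; // leaving the block releases the lock
                }
            }
        });
        waiter.IsBackground = true; // safety net for the demo
        waiter.Start();

        lock (syncRoot) // releasing the lock publishes the write
        {
            sharedValue = 12345;
        }

        return waiter.Join(timeoutMilliseconds);
    }
}
```

Note that both threads lock on the same object. Locking on two different objects would give neither mutual exclusion nor a reliable hand-over of the value.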
Bad
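private static object syncRoot = new object();
private static MyClass instance; // not volatile
static MyClass getInstance() {
  if(instance == null) // first check, outside of the lock
    lock(syncRoot)
      if(instance == null) // second check, inside of the lock
        instance = new MyClass();
  return instance;
}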
Now, what the heck is getInstance() doing there?
This is called the
double-checked locking idiom.
The idea is that if you only want to set a variable under specific
circumstances (like when your singleton's instance
variable is null), you can avoid the costly
lock statement most of the time.
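This code is not thread-safe because instance is
being accessed outside of a lock and thus the processor register
problems become an issue again. The .NET JIT compiler might have
cached the value of instance in a register, thus
bailing out on the outer if of getInstance() even
though some other piece of code has set the instance
variable back to null! Under the weaker ECMA memory model, another
thread could in principle even see a non-null instance before the
writes made by the MyClass constructor have become visible to it.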
Good
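private static object syncRoot = new object();
private static volatile MyClass instance; // volatile this time
static MyClass getInstance() {
  if(instance == null) // first check, outside of the lock
    lock(syncRoot)
      if(instance == null) // second check, inside of the lock
        instance = new MyClass();
  return instance;
}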
Now all is right again: instance is declared volatile,
so getInstance() will read the contents of the variable
from main memory and notice when it is set back to null.
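As a rough way to convince yourself, the following sketch hammers the volatile double-checked version from several threads and checks that only one instance was ever built. The constructor counter, the ConstructedCount property and the SingletonDemo helper are additions made up for this illustration.

```csharp
using System;
using System.Threading;

public sealed class MyClass
{
    private static readonly object syncRoot = new object();
    private static volatile MyClass instance;
    private static int constructed; // demo-only counter of constructor calls

    private MyClass() { Interlocked.Increment(ref constructed); }

    public static int ConstructedCount { get { return constructed; } }

    public static MyClass getInstance()
    {
        if (instance == null)           // cheap check without the lock
        {
            lock (syncRoot)
            {
                if (instance == null)   // re-check now that we own the lock
                    instance = new MyClass();
            }
        }
        return instance;
    }
}

// Hypothetical test helper, not part of the article's samples.
public static class SingletonDemo
{
    public static bool AllThreadsSeeSameInstance(int threadCount)
    {
        MyClass[] seen = new MyClass[threadCount];
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            int slot = i; // each thread writes only to its own slot
            threads[i] = new Thread(() => { seen[slot] = MyClass.getInstance(); });
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();

        foreach (MyClass m in seen)
            if (m == null || !ReferenceEquals(m, seen[0])) return false;

        return MyClass.ConstructedCount == 1;
    }
}
```

A passing run does not prove thread safety, of course. It only shows that the idiom behaves as described under this particular load on this particular machine.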
But couldn't we remove the lock in its entirety, now that we have
declared our instance variable volatile?
Let's see:
Bad
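private static volatile MyClass instance;
static MyClass getInstance() {
  if(instance == null) // check and assignment are two separate steps
    instance = new MyClass();
  return instance;
}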
This is not safe anymore. Because instance is volatile, its actual
contents will be read from memory each time, but
volatile does absolutely nothing to prevent the
race condition we've just constructed.
Race condition?
Assume a thread on CPU #1 calls the getInstance() method
and, reaching the if statement, sees that
instance is null. At the same time, a
thread running on CPU #2 also enters the getInstance()
method and does the same check. Both see that instance
is null and decide to assign a new instance to it. The threads
on CPU #1 and CPU #2 have just constructed two instances of the singleton.
Such cases are termed race conditions because they put two threads in a race against each other. The outcome of the operation is determined by which thread gets done first. A more typical example would be the execution of some processing code in a background thread whilst the main thread just goes ahead and reads the results produced in the background thread. It is thereby hoping (but not knowing) that the background thread is already finished when it tries to access the results.
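For the background-thread example, one simple way to take the hoping out of it is to wait for the worker before touching its results. The sketch below assumes a trivial computation (summing 1 to 100) as a stand-in for the real processing; BackgroundResultDemo and ComputeAndWait are names invented for this illustration.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; names are illustrative and not part of the article's samples.
public static class BackgroundResultDemo
{
    private static int result; // written by the background thread

    public static int ComputeAndWait()
    {
        result = 0;

        Thread worker = new Thread(() =>
        {
            int sum = 0;
            for (int i = 1; i <= 100; i++)
                sum += i;   // stand-in for some processing code
            result = sum;   // publish the result
        });
        worker.Start();

        // Without this Join, the main thread would race against the worker
        // and might read result while it is still 0.
        worker.Join();

        return result;
    }
}
```

With the Join in place, ComputeAndWait() returns 5050 every time. Without it, the return value would depend on which thread gets done first.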