[multithreading] Is this code correct

Pages: 12
closed account (EzwRko23)
I recently came accross such a fragment of code:

1
2
3
4
5
6
7
8
9
10
11
12
volatile UserManager* userManager = 0;

// Returns the only instance of UserManager. The first call initializes it.
// The subsequent calls return cached instance.
UserManager* getUserManager() {
  if (userManager == 0) {   // check to avoid locking; if it's not 0, then we already have an  userManager created
    Lock acquireLock;      // only one thread can enter here
    if (userManager == 0)  // now we can safely check
      userManager = new UserManager(); 
  }
  return userManager;   
}


Is it thread safe?
Last edited on
Provoided 'Lock' is an RAII style mutex, yes that particular bit of code is threadsafe.

However using the returned pointer would not be threadsafe unless it's behind the same mutex again.
How does the Lock object know which mutex to lock? Or is it like a single, all-encompassing mutex?

However using the returned pointer would not be threadsafe unless it's behind the same mutex again.
Not really. As long as the UserManager implements its own locking mechanisms, it will be thread-safe. It doesn't need to use the same mutex as the one from line 7.
closed account (EzwRko23)
The Lock is a RAII style mutex.
Anyway, are you sure? Can you trust volatile?


Whenever I wanted to have a thread safe singleton, I always wrote code like this:
1
2
3
4
5
6
UserManager* getUserManager() {
    Lock acquireLock;      
    if (userManager == 0)  
      userManager = new UserManager(); 
    return userManager;   
}


Simple and works.
Is the code I posted in the first post any better?

Last edited on
Anyway, are you sure?

Yes.

Can you trust volatile?

Yes. volatile means that the compiler will not optimize out memory accesses, so every time you access that pointer in code it will read from or write to the actual pointer. So changes will be visible across threads.

Is the code I posted in the first post any better?

Yes. It avoids an unnecessary lock.
closed account (EzwRko23)
Ok, I can assure you, it is **not** thread safe. It avoids lock, but it is broken, thus useless. :)
So now it is easier, because you know there is a bug. Can you see it?
Last edited on
userManager isn't necessarily aligned and accesses to it aren't necessarily atomic? Although that's a rather platform-specific behavior. Technically, memory accesses don't have to be atomic even if the memory is aligned.
closed account (EzwRko23)
Right, but this is only a part of the problem. Even if it were atomic, it is still broken.
Last edited on
Other than the fact that there's nothing preventing any other function from writing to the pointer, I don't see anything else. Certainly nothing to do with concurrency (other than what I already mentioned in my first post, about Lock not knowing what to lock).
Last edited on
closed account (EzwRko23)

Other than the fact that there's nothing preventing any other function from writing to the pointer


This is a general propery of C++, and nothing can be done against this. But assume the only code modifying this global is getUserManager.

-----

Jesus, 100% of advanced C++ programmers I asked this (except this forum I asked that to some of my friends coding at Google) can't spot the bug. Guys, how can you write safe concurrent code? (anyway, this code is **also** broken in Java <= 1.4; but fine in Java >=5, so this probably goes not only to C++ programmers).

Hint: call to getUserManager can return a **valid** pointer to allocated, but **uninitialised** or partially initialised memory. Well, after this hint, the answer should be obvious...

And of course the next question is: how to fix this? (without locking for every read, as in the second code sample)


Last edited on
Oh, that. Assign to a temporary before assigning to the global.

EDIT:
This is a general propery of C++, and nothing can be done against this.
You could make the pointer static local global.
Last edited on
Whaaaaat?

I don't see that. new can't return until after the constructor is called, right? How could the memory be uninitialized?
Sigh... Today isn't my day. Maybe I need more caffeine.
Yeah, Disch is right.
From Andrei Alexandrescu's Modern C++ Design:

1
2
3
4
5
6
7
8
9
10
11
12
Singleton& Singleton::Instance()
{
    if (!pInstance_)
    {
        Guard myGuard(lock_);
        if (!pInstance_)
        {
            pInstance_ = new Singleton;
        }
    }
    return *pInstance_;
}


Very experienced multithreaded programmers know that even the Double-Checked Locking pattern, although correct on paper, is not always correct in practice. In certain symmetric multiprocessor environments (the ones featuring the so-called relaxed memory model), the writes are committed to the main memory in bursts, rather than one by one. The bursts occur in increasing order of addresses, not in chronological order. Due to this rearranging of writes, the memory as seen by one processor at a time might look as if the operations are not performed in the correct order by another processor. Concretely, the assignment to pInstance_ performed by a processor might occur before the Singleton object has been fully initialized! Thus, sadly, the Double-Checked Locking pattern is known to be defective for such systems.

In conclusion, you should check your compiler documentation before implementing the Double-Checked Locking pattern. (This makes it the Triple-Checked Locking pattern.) Usually the platform offers alternative, nonportable concurrency-solving primitives, such as memory barriers, which ensure ordered access to memory. At least, put a volatile qualifier next to pInstance_. A reasonable compiler should generate correct, nonspeculative code around volatile objects.
closed account (EzwRko23)

At least, put a volatile qualifier next to pInstance_. A reasonable compiler should generate correct, nonspeculative code around volatile objects.


He is right, but unfortunately this is up to the compiler's implementers. The C++ standard does not enforce it. C++0x doesn't, neither. Volatile is useless for interthread communication and does not solve the problem.



Oh, that. Assign to a temporary before assigning to the global.


Let Scott Meyers answer this one for me:

As a rule, programmers don’t like to be pushed around by their compilers.
Perhaps you are such a programmer. If so, you may be tempted to try to outsmart
your compiler by adjusting your source code so that pInstance remains
unchanged until after Singleton’s construction is complete. For example, you
might try inserting use of a temporary variable
[...].

In essence, you’ve just fired the opening salvo in a war of optimization. Your
compiler wants to optimize. You don’t want it to, at least not here. But this is
not a battle you want to get into. Your foe is wiley and sophisticated, imbued
with strategems dreamed up over decades by people who do nothing but think
about this kind of thing all day long, day after day, year after year. Unless you
write optimizing compilers yourself, they are way ahead of you. In this case,
for example, it would be a simple matter for the compiler to apply dependence
analysis to determine that temp is an unnecessary variable, hence to eliminate it,
thus treating your carefully crafted “unoptimizable” code if it had been written
in the traditional DCLP manner. Game over. You lose.

If you reach for bigger ammo and try moving temp to a larger scope (say
by making it file static), the compiler can still perform the same analysis and
come to the same conclusion. Scope, schmope. Game over. You lose.

So you call for backup. You declare temp extern and define it in a separate
translation unit, thus preventing your compiler from seeing what you are doing.
Alas for you, some compilers have the optimizing equivalent of night-vision
goggles: they perform interprocedural analysis, discover your ruse with temp,
and again they optimize it out of existence. Remember, these are optimizing
compilers. They’re supposed to track down unnecessary code and eliminate it.
Game over. You lose.

So you try to disable inlining by defining a helper function in a different file,
thus forcing the compiler to assume that the constructor might throw an exception
and therefore delay the assignment to pInstance. Nice try, but some build
environments perform link-time inlining followed by more code optimizations
[5, 11, 4]. GAME OVER. YOU LOSE.


Nothing you do can alter the fundamental problem: you need to be able
to specify a constraint on instruction ordering, and your language gives you no
way to do it.


Last edited on
That's somewhat interesting, and also completely retarded.

If the problem is the writes not being commited in time, I don't see how you could have any simple, reliable code that would get around it. Even the "always lock" approach that xorsldjfsld suggested would be suseptible to the same issue, wouldn't it?

That raises issues with all sorts of other multithreading code, too. I mean if you can't rely on writes being commited in order, how can you write anything that's threadsafe? Sounds like this memory model just sucks balls.
Let Scott Meyers answer this one for me: [...]
I should point out that you're replying to a bad post. I had no idea about that behavior, I was just thinking poorly.

I can win the war, though:
1
2
3
4
5
6
T *temp=new T;
char s[sizeof(T *)*2+1];
sprintf(s,"%x",(int)temp);
int temp2;
sscanf(s,"%x",&temp2)
global=(T*)temp2;
BAM! The compiler has to earthly idea what in the hell I'm doing.
closed account (EzwRko23)

If the problem is the writes not being commited in time, I don't see how you could have any simple, reliable code that would get around it. Even the "always lock" approach that xorsldjfsld suggested would be suseptible to the same issue, wouldn't it?


No. The problem of writes committed out of order does not occur, when special instructions are emitted (memory berrier). This is what locking does, so the second code snippet is perfectly thread-safe. But locking goes down deep into OS and is hardware dependent. Note that out of order writes are not only done by the hardware, but can also be done by the compiler. C++ compilers do not know about threads, so they can accidenrtaly break multithreaded code, if it relies on the order of statements instead of the locking primitives supported by OS and hardware.

Full article on this here: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

@helios: Nice, but now you have created a potential problem for running your program under e.g. Boehm's GC.
Last edited on
Using garbage collection with C/++ is like trying to have your cake and eat it too.

Anyway, I don't see what the point of any of this is. That check on line isn't thread-safe even if that memory model wasn't in place. The compiler generates thread-unsafe code for code that's thread-unsafe. What a surprise.
the end of that paper wrote:
Java 1.5’s volatile [10] has the more restrictive, but simpler, acquire/release
semantics: any read of a volatile is guaranteed to occur prior to any memory
reference (volatile or not) in the statements that follow, and any write to a
volatile is guaranteed to occur after all memory references in the statements
preceding it.


That really is what should be done in C++ as well.

They really didn't include that in C++0x?
Pages: 12