posix threads with optimize flag -O2

Hi,

I have made a program that uses POSIX threads.
My program works fine when I compile with no optimization (with -O0).
When I change the flag to, for example, -O1 or -O2, not all threads start up.
Before I throw myself into debugging hell, it would be interesting to hear whether you have had the same problem and what the solution could be.

So, before I show some snippets of my code: is it possible to use pthreads and an optimization flag at the same time?

Thanks in advance.

Jesper
Yes, it is possible.
closed account (EzwRko23)
The compiler does not know anything about threads. It may apply optimizations that completely break your threading code. It happens rarely, but you must be careful. Ouch, debugging a bug that appears only in the optimized version will probably be very painful, or even impossible.

I would resort to an old method: printf debugging and bug isolation. Add some logging, and see what is called and what is not. Try to keep just the part that starts the threads, leave out all the rest, and see if it starts all threads as needed.
Thank you very much. I will try to do some 'std::cout' debugging!

I figured out what the problem was.

I have a variable that I use to give each thread a unique number.
When I assign the number in the thread function, I use pthread_mutex_lock to ensure that only one thread at a time can write to the variable.
Example:
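// Thread start routine: pthread_create() expects the signature void* (*)(void*).
void* worker(void*)
{
   pthread_mutex_lock(&m_mutex);
   unsigned int id = running_threads++;   // take the next unique id
   pthread_mutex_unlock(&m_mutex);
   return NULL;
}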


running_threads is a global variable and is set to zero at the start. After I call pthread_create(), I can check it to make sure that all threads have got their ids before I move on in the program.

In my init() function I only read running_threads, so in theory I do not need to lock/unlock m_mutex before I read it.
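For comparison, here is a rough sketch of what a locked read could look like. POSIX lists pthread_mutex_lock() among the functions that synchronize memory, so a read done under the same mutex the workers use should see their increments even in an optimized build. The names read_running_threads() and start_and_wait() are hypothetical helpers, not part of the original program, and the polling loop is only there to make the sketch complete.

```cpp
#include <pthread.h>
#include <sched.h>
#include <vector>

static pthread_mutex_t m_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int running_threads = 0;   // plain variable, not volatile

static void* worker(void*)
{
    pthread_mutex_lock(&m_mutex);
    unsigned int id = running_threads++;   // unique id for this thread
    pthread_mutex_unlock(&m_mutex);
    (void)id;                              // a real worker would use the id here
    return nullptr;
}

// Hypothetical helper: read the counter under the same mutex the writers use.
static unsigned int read_running_threads()
{
    pthread_mutex_lock(&m_mutex);
    unsigned int n = running_threads;
    pthread_mutex_unlock(&m_mutex);
    return n;
}

// Hypothetical helper: start n_threads workers, poll until every worker
// has taken an id, join them, and return the final counter value.
unsigned int start_and_wait(unsigned int n_threads)
{
    running_threads = 0;                   // no worker is running yet
    std::vector<pthread_t> threads(n_threads);
    for (pthread_t& t : threads)
        pthread_create(&t, nullptr, worker, nullptr);

    while (read_running_threads() < n_threads)
        sched_yield();                     // give the workers a chance to run

    for (pthread_t& t : threads)
        pthread_join(t, nullptr);
    return read_running_threads();
}
```

Calling start_and_wait(8) corresponds to the 8-thread case. The loop still polls, so it burns CPU time while waiting, but it no longer depends on how the optimizer treats an unsynchronized read.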
For example, I start 8 threads, so n_threads = 8, where n_threads is a global variable, just like running_threads.

I put these two variables into a while loop as a kind of barrier, to check whether all threads have got their ids:
// wait until all threads have got their ids
while(running_threads < n_threads) continue;


This works fine when you use no optimization flags (-O0).
When you use -O1 or -O2, it seems the while loop gets its own scope, where it keeps its own copy of the values of running_threads and n_threads.
So even though I print the values to the screen, the while loop turns into an infinite loop:
while(running_threads < n_threads)
{
   std::cout << "running_threads=" << running_threads << ", n_threads=" << n_threads << std::endl;
}


The output is:

running_threads=0, n_threads=8
running_threads=2, n_threads=8
running_threads=5, n_threads=8
running_threads=8, n_threads=8
running_threads=8, n_threads=8
running_threads=8, n_threads=8
running_threads=8, n_threads=8
running_threads=8, n_threads=8
running_threads=8, n_threads=8
....


When I declare running_threads with the volatile keyword, it works with -O1 and -O2:
 
volatile unsigned int running_threads = 0;


If you think another solution would be better than this one, I would be glad to hear about it. Otherwise I will use this solution.
Maybe I could have used pthread_barrier instead...

/Jesper
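On the pthread_barrier idea mentioned above, a minimal sketch might look like the following. It assumes a platform that provides the optional POSIX barrier functions (Linux does), and the names start_barrier, barrier_worker and run_with_barrier are made up for illustration. The barrier is initialized for n_threads + 1 participants, so the starting thread is released only after every worker has taken its id.

```cpp
#include <pthread.h>
#include <vector>

static pthread_barrier_t start_barrier;
static pthread_mutex_t id_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int next_id = 0;

static void* barrier_worker(void*)
{
    pthread_mutex_lock(&id_mutex);
    unsigned int id = next_id++;            // unique id for this thread
    pthread_mutex_unlock(&id_mutex);
    (void)id;                               // a real worker would use the id here

    pthread_barrier_wait(&start_barrier);   // report "I have my id"
    return nullptr;
}

// Hypothetical helper: returns the number of ids handed out once the
// barrier has released the starting thread.
unsigned int run_with_barrier(unsigned int n_threads)
{
    next_id = 0;
    // n_threads workers plus the thread that starts them.
    pthread_barrier_init(&start_barrier, nullptr, n_threads + 1);

    std::vector<pthread_t> threads(n_threads);
    for (pthread_t& t : threads)
        pthread_create(&t, nullptr, barrier_worker, nullptr);

    pthread_barrier_wait(&start_barrier);   // blocks until all workers arrive
    unsigned int assigned = next_id;        // safe: all increments happened before the barrier

    for (pthread_t& t : threads)
        pthread_join(t, nullptr);
    pthread_barrier_destroy(&start_barrier);
    return assigned;
}
```

With this layout there is no polling loop at all, so neither volatile nor a separate counter check is needed for the start-up handshake.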
You might want to look into condition variables rather than spinning in an empty loop while waiting for your threads to be created. Also, I don't think you need a mutex around reading or writing basic types like int, as long as you mark them volatile. Reading and writing basic types should be atomic.
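A word of caution on the volatile suggestion: in C++, volatile stops the compiler from caching the value in a register, but it does not make running_threads++ an atomic read-modify-write, and it gives no ordering guarantees between threads. With only volatile, the increment in the workers would still need the mutex. If a C++11 (or later) compiler is available, which is an assumption about the build setup, std::atomic covers both concerns. The sketch below uses made-up names (atomic_running_threads, atomic_worker, run_with_atomic).

```cpp
#include <atomic>
#include <pthread.h>
#include <sched.h>
#include <vector>

static std::atomic<unsigned int> atomic_running_threads{0};

static void* atomic_worker(void*)
{
    // fetch_add is one atomic read-modify-write, so no mutex is needed here.
    unsigned int id = atomic_running_threads.fetch_add(1);
    (void)id;                               // a real worker would use the id here
    return nullptr;
}

// Hypothetical helper: start the workers, spin until all have an id,
// join them, and return the final counter value.
unsigned int run_with_atomic(unsigned int n_threads)
{
    atomic_running_threads.store(0);
    std::vector<pthread_t> threads(n_threads);
    for (pthread_t& t : threads)
        pthread_create(&t, nullptr, atomic_worker, nullptr);

    // Each load() really reads the shared value; the compiler may not hoist it.
    while (atomic_running_threads.load() < n_threads)
        sched_yield();

    for (pthread_t& t : threads)
        pthread_join(t, nullptr);
    return atomic_running_threads.load();
}
```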

By Galik:
You might want to look into condition variables rather than perform your empty loop while waiting for your threads to be created.


Which condition variable are you talking about here?
It's been a while since I used pthreads, but this reference has some examples:
http://computing.llnl.gov/tutorials/pthreads/#ConditionVariables

Instead of just looping as you have done, you create a special variable called a 'condition' variable. Your code then 'waits' on that variable. The other thread that is incrementing the thread counter can then test the counter to see whether it has reached the critical value. If it has, it signals the condition variable, and the thread that is waiting on it is released.
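The description above could be sketched roughly as follows. All names (count_mutex, all_started, started, expected, cond_worker, run_with_condition) are invented for this example and do not come from the original program. The worker that takes the last id signals the condition variable, and the starting thread sleeps in pthread_cond_wait() until then.

```cpp
#include <pthread.h>
#include <vector>

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_started = PTHREAD_COND_INITIALIZER;
static unsigned int started = 0;    // how many workers have taken an id
static unsigned int expected = 0;   // how many workers were launched

static void* cond_worker(void*)
{
    pthread_mutex_lock(&count_mutex);
    unsigned int id = started++;            // unique id for this thread
    if (started == expected)
        pthread_cond_signal(&all_started);  // last worker wakes the waiter
    pthread_mutex_unlock(&count_mutex);
    (void)id;                               // a real worker would use the id here
    return nullptr;
}

// Hypothetical helper: returns the counter value seen when the wait ends.
unsigned int run_with_condition(unsigned int n_threads)
{
    pthread_mutex_lock(&count_mutex);
    started = 0;
    expected = n_threads;
    pthread_mutex_unlock(&count_mutex);

    std::vector<pthread_t> threads(n_threads);
    for (pthread_t& t : threads)
        pthread_create(&t, nullptr, cond_worker, nullptr);

    pthread_mutex_lock(&count_mutex);
    // Re-check the predicate in a loop: pthread_cond_wait may wake spuriously.
    while (started < expected)
        pthread_cond_wait(&all_started, &count_mutex);
    unsigned int result = started;
    pthread_mutex_unlock(&count_mutex);

    for (pthread_t& t : threads)
        pthread_join(t, nullptr);
    return result;
}
```

Because the counter is only ever touched with count_mutex held, it can stay a plain unsigned int, and the waiting thread uses no CPU time while it sleeps.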
Thanks Galik, I see your point... something to think about :-)
