I would be interested to hear about approaches for this. I have successfully used good ol' logging in the past, but it is a bit cumbersome. I'll explain:
First of all, you must synchronize access to the log file. I created an abstract class (all methods pure virtual) called ILock with three methods: Lock(), Unlock() and Name(). Then I created a class that inherits from ILock and implements Lock() and Unlock() via a mutex, because my program is multi-threaded and multi-process. If you don't cross process boundaries, you could write a class that uses a semaphore or a critical section instead, the latter being the fastest.
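A rough sketch of what such an interface could look like. I am using std::mutex here so the example is portable, which means it only works inside one process; a cross-process version would wrap a named OS mutex instead. The class name CStdMutexLock and the exact signatures are my assumptions for illustration, not the original code:

```cpp
#include <mutex>
#include <string>
#include <utility>

// Abstract lock interface with the three methods described above.
class ILock {
public:
    virtual ~ILock() = default;
    virtual void Lock() = 0;
    virtual void Unlock() = 0;
    virtual const std::string& Name() const = 0;
};

// Hypothetical in-process implementation backed by std::mutex.
// It does NOT synchronize across processes.
class CStdMutexLock : public ILock {
public:
    explicit CStdMutexLock(std::string name) : name_(std::move(name)) {}

    void Lock() override { mutex_.lock(); }
    void Unlock() override { mutex_.unlock(); }
    const std::string& Name() const override { return name_; }

private:
    std::mutex mutex_;
    std::string name_;
};
```

The Name() method is what lets the log lines identify which lock a thread is waiting on.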
Once the log is synchronized, you log every attempt to acquire a synchronization object: mutexes, critical sections, events, semaphores, waits on threads or processes, etc. You also log whether the acquisition succeeded, and you log where each lock is released. Finally, every line in the log file needs to include the thread ID (and the process ID if multi-process).
With the above, a well-behaved program will produce log pairs. The first item in the pair is two lines (the attempt and the acquisition), while the second item is one line (the release):
-Thread ID 0xXYZ: Trying to acquire lock "MyMutexLock".
-Thread ID 0xXYZ: Acquired lock "MyMutexLock".
- .....
-Thread ID 0xXYZ: Releasing lock "MyMutexLock".
Any thread that does not release a lock is a potential problem that may show up in another thread. The hanging thread would have logged the first half of the first item of the expected pair (the "Trying to acquire" line), but nothing else.
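One way to make sure the three lines are always emitted together is to put them in a small RAII wrapper. The following is only a sketch under my own assumptions: CThreadLog and CLoggedGuard are hypothetical names, the lock is a plain std::mutex, and the thread ID is printed in whatever format the standard library uses (not necessarily hex):

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <utility>

// Hypothetical logger: serializes writes and prefixes each line with the thread ID.
class CThreadLog {
public:
    explicit CThreadLog(std::ostream& out) : out_(out) {}

    void Log(const std::string& message) {
        // Format outside the lock so the critical section stays short.
        std::ostringstream line;
        line << "-Thread ID " << std::this_thread::get_id() << ": " << message << '\n';
        std::lock_guard<std::mutex> guard(mutex_);
        out_ << line.str();
        out_.flush();  // flush so the line is not lost in a buffer if the program hangs
    }

private:
    std::ostream& out_;
    std::mutex mutex_;
};

// Hypothetical RAII wrapper: logs the attempt, the acquisition and the release.
class CLoggedGuard {
public:
    CLoggedGuard(CThreadLog& log, std::mutex& mutex, std::string name)
        : log_(log), mutex_(mutex), name_(std::move(name)) {
        log_.Log("Trying to acquire lock \"" + name_ + "\".");
        mutex_.lock();
        log_.Log("Acquired lock \"" + name_ + "\".");
    }

    ~CLoggedGuard() {
        log_.Log("Releasing lock \"" + name_ + "\".");
        mutex_.unlock();
    }

    CLoggedGuard(const CLoggedGuard&) = delete;
    CLoggedGuard& operator=(const CLoggedGuard&) = delete;

private:
    CThreadLog& log_;
    std::mutex& mutex_;
    std::string name_;
};
```

With this, a thread that blocks inside the constructor leaves only its "Trying to acquire" line in the log, which is exactly the incomplete pair described above.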
...
#ifdef _UNICODE
#define __TFILE__ TEXT(__FILE__)
#else
#define __TFILE__ __FILE__
#endif
...
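CLog gLog = "C:\\MyLog.txt";
// ILock is abstract, so hold the concrete lock through a pointer.
ILock* theLock = new CMutexLock("MutexName");
...
// Log the attempt before blocking, so a hang leaves a trace.
gLog.Log(__TFILE__, __LINE__, TEXT("Trying to acquire lock \"%s\"."), theLock->Name());
theLock->Lock();
gLog.Log(__TFILE__, __LINE__, TEXT("Acquired lock \"%s\"."), theLock->Name());
....
// Log the release just before giving up the lock.
gLog.Log(__TFILE__, __LINE__, TEXT("Releasing lock \"%s\"."), theLock->Name());
theLock->Unlock();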
If the code above produces something like:
-Thread ID 0xXYZ: Trying to acquire lock "MyMutexLock".
-Thread ID 0xXYZ: Acquired lock "MyMutexLock".
- ...
-Thread ID 0xABC: Trying to acquire lock "MyMutexLock".
//The following line may or may not show up. If it doesn't, thread ID 0xABC will deadlock.
-Thread ID 0xXYZ: Releasing lock "MyMutexLock".
Then you may have a deadlock: thread 0xABC is waiting for a lock that thread 0xXYZ never released. Your code may legitimately hold locks for a long time, so you need to decide how long a thread may wait before you treat it as a deadlock.
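Reading a long log by eye gets tedious, so a small scanner can help find the incomplete pairs. This is a minimal sketch that assumes the log lines follow exactly the format shown above; FindOpenLocks, LockState and LockEvent are names I made up for the example. It does not look at time at all, so judging "how long is too long" would need timestamps added to each line:

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Last state seen in the log for one (thread, lock) pair.
enum class LockState { Waiting, Held, Released };

struct LockEvent {
    std::string thread;
    std::string lock;
    LockState state;
};

// Returns the text between the first pair of double quotes, or "" if absent.
static std::string QuotedName(const std::string& line) {
    const std::string::size_type first = line.find('"');
    if (first == std::string::npos) return "";
    const std::string::size_type second = line.find('"', first + 1);
    if (second == std::string::npos) return "";
    return line.substr(first + 1, second - first - 1);
}

// Scans the log text and returns every (thread, lock) pair whose last
// recorded state is not Released, sorted by thread ID and lock name.
std::vector<LockEvent> FindOpenLocks(const std::string& logText) {
    const std::string prefix = "-Thread ID ";
    std::map<std::pair<std::string, std::string>, LockState> states;

    std::istringstream in(logText);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) != 0) continue;  // not a lock line
        const std::string::size_type colon = line.find(':', prefix.size());
        if (colon == std::string::npos) continue;
        const std::string thread = line.substr(prefix.size(), colon - prefix.size());
        const std::string lock = QuotedName(line);
        if (lock.empty()) continue;

        LockState state;
        if (line.find("Trying to acquire lock", colon) != std::string::npos) {
            state = LockState::Waiting;
        } else if (line.find("Acquired lock", colon) != std::string::npos) {
            state = LockState::Held;
        } else if (line.find("Releasing lock", colon) != std::string::npos) {
            state = LockState::Released;
        } else {
            continue;  // unrelated message
        }
        states[std::make_pair(thread, lock)] = state;
    }

    std::vector<LockEvent> open;
    for (const auto& entry : states) {
        if (entry.second != LockState::Released) {
            open.push_back({entry.first.first, entry.first.second, entry.second});
        }
    }
    return open;
}
```

A thread reported as Waiting on a lock that another thread is reported as Held on is the first place I would look.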