CMPS 3600 Lab 13 - Mutexes & Condition Variables

Goals

resources:

This lab introduces the condition variable, a sychronization primitive that adds additional capabilities to a mutex. A limitation of the mutex is that only the owner of the mutex lock can release the lock. The producer/consumer synchronization model where the producer releases the consumer and the consumer releases the producer is not possible with a mutex only.

You could use a semaphore to synchronize your threads. But when writing low-level multithreaded programs mutexes and condition variables are much more efficient. Typically each mutex is coupled with one condition variable. The use of a condition variable plus a mutex allows thread A to release thread B when thread B is locked on a mutex. In linux a mutex is implemented in Linux by a futex (fast userspace mutex) call. When you run strace 'futex' is the call you need to trace.

Copy some files into your 3600/d/ folder.

    $ cd
    $ cd 3600/d
    $ cp /home/fac/gordon/public_html/3600b/examples/d/cond-wait.c .
    $ cp /home/fac/gordon/public_html/3600b/examples/d/timed-wait.c .
    $ cp /home/fac/gordon/public_html/3600b/examples/d/Makefile .
Carefully read the code and comments in cond-wait.c and timed-wait.c before starting. We will not make a timed_wait call in this lab but timed_wait may be a little easier to understand. Compile and execute. Code only when you are comfortable with the concepts in both programs. You are going to write rvlab13.c from scratch. If you want to start with some code use cond-wait.c (the header files and calls will be the same).

There are two big hurdles with using a mutex and an associated condition variable (each cond var must use an associated mutex or things will not work). The first problem is that thread A could signal thread B before thread B is ready to receive the signal (i.e., has not yet made the condition wait call). If this occurs the signal is lost and you have deadlock. Given the vagaries of kernel thread scheduling, this problem can only be solved by surrounding the wait call with a predicate flag. In this way the wait will not be entered that could end in deadlock. Note that the mutex must be associated with the condition variable. Assume condition variable cond with associated mutex mtx:

    Thread 1                    Thread 2
    --------                    --------
    lock mutex                  lock mutex
    *critical code*             while (!flag) 
    flag++                         wait on signal by cond-var
    signal with cond-var        *critical code*
    unlock mutex                unlock mutex
If Thread 1 finishes all its code before Thread 2 then Thread 2 does not wait so no deadlock. If Thread 2 grabs the lock first it will release it during wait. At which point Thread 1 will grab it. No deadlock.

The second problem is a spurious wakeup (see pthread_cond_wait manpage). This problem is solved by checking the predicate flag in a loop rather than an if statement.

The obvious question is why not just use the predicate flag to control synchronization? Since the write and read of the flag is inside a mutex this is safe. However, the busy-wait inherent in the while loop is a very bad idea. Better to go into suspend and wait on a signal. If the signal is erroneous you just check once and go back to sleep.

You are going to write code rvlab13.c that accomplishes the synchronization described below. Add an entry for rvlab13.c to your Makefile.

Spawn 5 threads, tnum 0,1,2,3,4. All threads use the same thread function. Counting the parent (root thread) this gives you a total of 6 threads. The neighbor of root is thread 0. The neighbor of thread 1 is thread 2. Thread 4 has no neighbor. The critical code in each thread is to write time to the screen and release its neighbor. The parent releases thread 0, thread 0 releases thread 1 and so on. This is just done once. There is no loop.

To perform this synchronization you need
  5 mutexes
  5 condition variables
  5 predicate flags

Each thread (except the last) blocks on its own mutex and waits on its condition variable. To prevent spurious wakeups the wait call could be in a loop checking on the value of the predicate flag. After waking up the thread writes time to the screen and unlocks its mutex.

Each thread (except the last) then grabs the mutex of its neighbor, sets its neighbor's predicate flag, signals its neighbor and releases its neighbors mutex.

The parent signals thread 0 to get things started. Here is the thread function algorithm:

Thread Function ---------------- lock my-mutex while (!my-flag) wait on my-cond write time to screen unlock my-mutex if !last thread lock neighbor's mutex increment neighbor's flag signal neighbor unlock neighbor's mutex
If all goes well you should see synchronization on output where the parent writes first, followed by thread 0, thread 1, and so on. When everything is working test your synchronization by
1) creating the threads in reverse order
2) passing  an argument from the command line to be passed to fib. Call fib
   inside the critical code section.
If you run valgrind you do not want to call fib since it will take forever to finish.

Sample runs:

$ ./lab13 1 Thread 4 created. Thread 3 created. Thread 2 created. Thread 1 created. Thread 0 created. parent writes time: Sun May 31 12:10:43 2024 thread 0 writes time: Sun May 31 12:10:43 2024 thread 1 writes time: Sun May 31 12:10:43 2024 thread 2 writes time: Sun May 31 12:10:43 2024 thread 3 writes time: Sun May 31 12:10:43 2024 thread 4 writes time: Sun May 31 12:10:43 2024
$ strace -q -f -e trace=futex ./lab13 0 2> out 1>/dev/null $ cat out | grep -E -v "(resumed|EAGAIN)" > out2 $ cat out2 [pid 16755] futex(0x7f74bc003df0, FUTEX_WAIT_PRIVATE, 2, NULL [pid 16754] futex(0x7f74bc003df0, FUTEX_WAKE_PRIVATE, 1 [pid 16755] futex(0x7f74bc003df0, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 16755] futex(0x601784, FUTEX_WAIT_PRIVATE, 1, NULL [pid 16756] futex(0x601754, FUTEX_WAIT_PRIVATE, 1, NULL [pid 16757] futex(0x601724, FUTEX_WAIT_PRIVATE, 1, NULL [pid 16758] futex(0x6016f4, FUTEX_WAIT_PRIVATE, 1, NULL [pid 16759] futex(0x601560, FUTEX_WAIT_PRIVATE, 2, NULL [pid 16754] futex(0x601560, FUTEX_WAKE_PRIVATE, 1 [pid 16754] futex(0x7f74b9c789d0, FUTEX_WAIT, 16759, NULL [pid 16759] futex(0x601560, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 16759] futex(0x6016f4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x6016f0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 [pid 16758] futex(0x601588, FUTEX_WAKE_PRIVATE, 1 [pid 16754] futex(0x7f74ba4799d0, FUTEX_WAIT, 16758, NULL [pid 16758] futex(0x601724, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x601720, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 [pid 16754] futex(0x7f74bac7a9d0, FUTEX_WAIT, 16757, NULL [pid 16757] futex(0x6015b0, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 16757] futex(0x601754, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x601750, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 [pid 16754] futex(0x7f74bb47b9d0, FUTEX_WAIT, 16756, NULL [pid 16756] futex(0x6015d8, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 16756] futex(0x601784, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x601780, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 [pid 16754] futex(0x7f74bbc7c9d0, FUTEX_WAIT, 16755, NULL [pid 16755] futex(0x601600, FUTEX_WAKE_PRIVATE, 1) = 0
$ valgrind --tool=helgrind ./lab13 0 2>out 1>/dev/null $ cat out ==15655== Helgrind, a thread error detector ==15655== Copyright (C) 2007-2011, and GNU GPL'd, by OpenWorks LLP et al. ==15655== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info ==15655== Command: ./lab13 0 ==15655== ==15655== ==15655== For counts of detected and suppressed errors, rerun with: -v ==15655== Use --history-level=approx or =none to gain increased speed, at ==15655== the cost of reduced accuracy of conflicting-access information ==15655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 226 from 85)

Files due at 12:30am...

3600/d/rvlab13.c
3600/d/Makefile