Suppose there are two threads accessing a global variable var that is protected by a lock:

    #include <pthread.h>
    #include <stdio.h>

    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static int var;

    void *thread1_func(void *ptr) {
        (void)ptr;  /* unused */
        pthread_mutex_lock(&lock);
        var = var + 1;
        printf("Thread 1: %d\n", var);
        pthread_mutex_unlock(&lock);
        return NULL;  /* a function returning void * must return a value */
    }

    int main(void) {
        pthread_t thread1;
        var = 0;
        pthread_create(&thread1, NULL, thread1_func, NULL);
        pthread_mutex_lock(&lock);
        printf("Thread 0: %d\n", var);
        pthread_mutex_unlock(&lock);
        pthread_join(thread1, NULL);
        return 0;
    }

Assume the instructions are interleaved as follows:

    Thread 0:                                                 Thread 1:
    1. var = 0;
    2. pthread_create(&thread1, NULL, thread1_func, NULL);
                                                              3. pthread_mutex_lock(&lock);
                                                              4. var = var + 1;
                                                              5. printf("Thread 1: %d\n", var);
                                                              6. pthread_mutex_unlock(&lock);
    7. pthread_mutex_lock(&lock);
    8. printf("Thread 0: %d\n", var); // Should print 1!
    9. pthread_mutex_unlock(&lock);

The C standard avoids formally defining a concurrency model (source). It therefore appears that a standards-compliant C compiler is allowed to cache the value of var in a register across the call to pthread_mutex_lock(), causing "0" to be printed at step 8. In practice, however, I would assume that implementations of the pthreads API do something to prevent this.
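To make the concern concrete, here is a minimal sketch of the transformation I have in mind. It is not real compiler output: the function names (thread1_steps, thread0_as_written, thread0_with_cached_load) are hypothetical, and thread 1's critical section is run inline on the same thread so that the interleaving above is reproduced deterministically. The second function models by hand what would happen if the load of var were hoisted above the call to pthread_mutex_lock().

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int var;

/* Stand-in for steps 3-6: thread 1's critical section, called inline
   (an assumption of this sketch) to make the interleaving deterministic. */
static void thread1_steps(void) {
    pthread_mutex_lock(&lock);
    var = var + 1;
    pthread_mutex_unlock(&lock);
}

/* Thread 0 as written in the source: var is loaded after the lock
   has been acquired, so the update from thread 1 is observed. */
static int thread0_as_written(void) {
    var = 0;                      /* step 1 */
    thread1_steps();              /* steps 3-6 */
    pthread_mutex_lock(&lock);    /* step 7 */
    int seen = var;               /* step 8: load inside the critical section */
    pthread_mutex_unlock(&lock);  /* step 9 */
    return seen;
}

/* Hand-written model of the feared transformation: the value of var
   is kept in a local (the "register") across the lock call. */
static int thread0_with_cached_load(void) {
    var = 0;                      /* step 1 */
    int cached = var;             /* load hoisted above the lock call */
    thread1_steps();              /* steps 3-6 */
    pthread_mutex_lock(&lock);    /* step 7 */
    int seen = cached;            /* step 8: stale value is reused */
    pthread_mutex_unlock(&lock);  /* step 9 */
    return seen;
}
```

In this sketch thread0_as_written() returns 1, while thread0_with_cached_load() returns 0, which is the outcome I am worried about at step 8.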
What do pthreads implementations do to prevent the compiler from caching the value of var in a register across a call to pthread_mutex_lock()? I'm interested in the specific annotation in the pthreads source code that conveys the necessary information to the compiler. Since there are many implementations of pthreads, feel free to restrict yourself to one in your answer, for example glibc compiled for aarch64.
For this question, assume pthread_mutex_lock() is defined in the same file and the compiler is able to prove that it does not access var.