While writing an answer to another question, some interesting things came up, and now I can't understand how Interlocked.Increment(ref long value) works on 32-bit systems. Let me explain.
The native InterlockedIncrement64 is no longer available when compiling for a 32-bit environment. OK, that makes sense: in .NET you can't align memory as required, and since it may be called from managed code, they dropped it.
In .NET we can call Interlocked.Increment() with a reference to a 64-bit variable. We still have no constraint on its alignment (for example, it may live in a structure, where we may also use FieldOffset and StructLayout), yet the documentation doesn't mention any limitation (AFAIK). It's magic, it works!
Hans Passant noted that Interlocked.Increment() is a special method recognized by the JIT compiler, which emits a call to COMInterlocked::ExchangeAdd64(). That in turn calls FastInterlockExchangeAddLong, which is a macro for InterlockedExchangeAdd64, which shares the same limitations as InterlockedIncrement64.
Now I'm perplexed.
Forget the managed environment for a second and go back to native. Why can't InterlockedIncrement64 work when InterlockedExchangeAdd64 does? InterlockedIncrement64 is a macro; if intrinsics aren't available and InterlockedExchangeAdd64 works, then it could simply be implemented as a call to InterlockedExchangeAdd64...
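To make that last point concrete, here is a minimal sketch in portable C11 of an increment built on top of an exchange-add. The names exchange_add64 and increment64 are hypothetical stand-ins, not Windows APIs, and atomic_fetch_add is swapped in for the real primitive, so this only illustrates the relationship between the two operations, not how WinBase.h defines them.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for InterlockedExchangeAdd64 (assumption: C11
   atomic_fetch_add is used instead of the Windows primitive).
   Atomically adds value to *addend and returns the PREVIOUS value. */
static long long exchange_add64(_Atomic long long *addend, long long value)
{
    return atomic_fetch_add(addend, value);
}

/* Hypothetical stand-in for InterlockedIncrement64.
   The increment is just an exchange-add of 1; adding 1 to the returned
   previous value yields the NEW value, which is what an increment returns. */
static long long increment64(_Atomic long long *addend)
{
    return exchange_add64(addend, 1) + 1;
}
```

For example, with a counter initialized to 0xFFFFFFFF, increment64 returns 0x100000000: the carry into the upper 32 bits happens inside the same atomic operation.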
Let's go back to managed: how is an atomic 64-bit increment implemented on 32-bit systems? I suppose the sentence "This function is atomic with respect to calls to other interlocked functions" is important, but I still haven't seen any code that does it (thanks to Hans for pointing out the deeper implementation). Let's pick the InterlockedExchangeAdd64 implementation from WinBase.h that is used when intrinsics aren't available:
    FORCEINLINE
    LONGLONG
    InterlockedExchangeAdd64(
        _Inout_ LONGLONG volatile *Addend,
        _In_ LONGLONG Value
        )
    {
        LONGLONG Old;
        do {
            Old = *Addend;
        } while (InterlockedCompareExchange64(Addend,
                                              Old + Value,
                                              Old) != Old);
        return Old;
    }
How can it be atomic for reading/writing?
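One possible reading, sketched here under stated assumptions: the plain read of *Addend does not itself need to be atomic, because the compare-exchange checks all 64 bits against the current contents in a single step. If the read was torn, the comparison would normally fail and the loop would retry; if it happens to succeed, then Old really was the current value at that moment. The sketch below uses C11 atomics; compare_exchange64 and exchange_add64_cas are hypothetical names standing in for InterlockedCompareExchange64 and the loop above, and the relaxed atomic load replaces the plain (possibly torn) read of the original.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for InterlockedCompareExchange64 (assumption: built
   on C11 atomic_compare_exchange_strong). Stores exchange into *dest only
   if *dest equals comparand; always returns the value that was found. */
static long long compare_exchange64(_Atomic long long *dest,
                                    long long exchange, long long comparand)
{
    long long expected = comparand;
    /* On failure, expected is overwritten with the actual current value;
       on success it still equals comparand, which was the current value. */
    atomic_compare_exchange_strong(dest, &expected, exchange);
    return expected;
}

/* Same shape as the WinBase.h loop: read, then retry until the
   compare-exchange confirms that nothing changed in between. */
static long long exchange_add64_cas(_Atomic long long *addend, long long value)
{
    long long old;
    do {
        /* Stands in for the plain "Old = *Addend" read of the original. */
        old = atomic_load_explicit(addend, memory_order_relaxed);
    } while (compare_exchange64(addend, old + value, old) != old);
    return old;
}
```

Passing a deliberately wrong comparand to compare_exchange64 (simulating a torn read) leaves the target untouched and returns the real value, which is what sends the loop around again.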