In 1999 it was shown that the semantics of multithreaded programs in Java had serious problems, so an effort was started in 2000 to design a new memory model for Java.
There was consensus at the time that a programming language should provide sequential consistency (SC) for data race free programs. Since Java is expected to provide safety and security for any program, the memory model should also provide some guarantees for programs with data races; the language should be able to limit the damage done by untrusted code. The main objectives of the memory model were:
1) Programmability: provide a simple, intuitive memory model that programmers can use to reason about the correctness of their programs.
2) Performance: allow most compiler and hardware optimizations.
3) Safety/Security: prevent arbitrary behavior even in the presence of data races (buggy code).
All three seemed hard to achieve at the same time, and after five years of proposals and debate a new memory model was approved (JSR-133, adopted with Java 5). The memory model provides SC for data race free programs; synchronized methods/blocks and volatile variables are the mechanisms programmers should use for synchronization.
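As a minimal sketch (the class and method names here are illustrative, not from the specification), this is how a volatile variable yields a data race free program: the volatile write/read pair creates a happens-before edge, so the plain write to data is visible to the reader and every execution is sequentially consistent.

    // Minimal sketch of a data race free publication idiom.
    // `ready` is volatile, so the write `data = 42` happens-before
    // any read that sees `ready == true`; the program has no data
    // race, and the JMM therefore guarantees SC behavior for it.
    class Publisher {
        private int data;                        // plain field
        private volatile boolean ready;          // synchronization variable

        void publish() {                         // runs on one thread
            data = 42;                           // ordinary write
            ready = true;                        // volatile write (release)
        }

        Integer tryConsume() {                   // runs on another thread
            if (ready) {                         // volatile read (acquire)
                return data;                     // guaranteed to see 42
            }
            return null;                         // not published yet
        }
    }

Removing the volatile modifier reintroduces a data race on both fields, and the reader could then see ready == true but a stale data == 0.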
The new memory model is too complex for an average programmer to understand. Some of this complexity comes from the requirement that Java must not allow values to appear "out of thin air". For example, in the following program, r1 == r2 == 42 should not be a possible outcome:
    Initially X = Y = 0

    Core 1              Core 2
    r1 = X;             r2 = Y;
    Y = r1;             X = r2;
This example was created assuming a hypothetical future microprocessor that performs "value prediction", guessing the value a load instruction will return. If it predicts 42 as the value of X and speculatively writes 42 into Y, the other core can read that 42 from Y and write it back into X, making the speculation self-justifying; this is exactly the outcome that must be prevented.
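As an illustration (the harness below is my own sketch, not from the papers), the litmus test can be written as actual Java. Under the JMM the error can never be raised, because 42 appears nowhere in the program and would have to come out of thin air:

    // Sketch of the out-of-thin-air litmus test as a Java harness.
    class OutOfThinAir {
        static int X, Y;                         // intentionally racy fields

        public static void main(String[] args) throws InterruptedException {
            for (int i = 0; i < 1_000_000; i++) {
                X = 0; Y = 0;
                final int[] r = new int[2];
                Thread t1 = new Thread(() -> { r[0] = X; Y = r[0]; });
                Thread t2 = new Thread(() -> { r[1] = Y; X = r[1]; });
                t1.start(); t2.start();
                t1.join(); t2.join();            // joins make r[] visible here
                if (r[0] == 42 && r[1] == 42)    // forbidden by the JMM
                    throw new AssertionError("value appeared out of thin air");
            }
        }
    }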
Many such examples, along with the outcomes that must be allowed or forbidden, were considered during the design of the memory model.
The memory model uses the notion of causality (whether a speculative write could occur in some well behaved execution) to reason about the correctness of a program. This is one of the primary reasons for its complexity.
In a sequential program we can use the operational semantics of the programming language to reason about the correctness and safety of the program. This is not the case with the Java memory model: to decide whether an outcome is legal, we may have to consider accesses that have not yet executed.
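A standard illustration is the following program (it appears among the published JMM causality test cases; I reproduce it from memory):

    Initially X = Y = 0

    Core 1              Core 2
    r1 = X;             r2 = Y;
    if (r1 >= 0)        X = r2;
        Y = 1;

The outcome r1 == r2 == 1 is allowed: since X and Y only ever hold 0 or 1, a compiler may determine that r1 >= 0 always holds and hoist the write Y = 1 above the read of X. The write is justified because it occurs in every well behaved execution, but to see that, the programmer has to reason about an access that has not yet executed.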
The memory model also lists some compiler/hardware optimizations that it claims are valid.
After the memory model was revised, several flaws were exposed in it. For example, a commonly used implementation of double-checked locking was found to be invalid under the new memory model. The paper "On Validity of Program Transformations in the Java Memory Model" (Ševčík and Aspinall, ECOOP 2008) showed that some of the optimizations claimed to be valid by the memory model are not actually valid.
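For concreteness, here is the double-checked locking idiom in question (a sketch from memory, not from the cited papers). The commonly used version omits the volatile modifier, which is precisely what breaks it:

    // Double-checked locking. The commonly used version omits `volatile`
    // on `instance`; the unsynchronized first read may then observe a
    // non-null reference to a partially constructed object. With the
    // field declared volatile, the idiom is correct under the revised model.
    class Singleton {
        private static volatile Singleton instance;

        static Singleton getInstance() {
            if (instance == null) {                  // first check, no lock
                synchronized (Singleton.class) {
                    if (instance == null) {          // second check, locked
                        instance = new Singleton();
                    }
                }
            }
            return instance;
        }
    }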
The Java memory model will have to be revised at some point. A future memory model needs to be simpler than the current one: a programmer should be able to reason about a program using its operational semantics of legal interleavings rather than through causality arguments. Is SC for data race free programs the right direction? It makes sense for C/C++, where performance is the priority. What about languages that require safety guarantees? Perhaps safe languages should trade some performance for stronger guarantees. Any thoughts or opinions?
References
J. Manson, W. Pugh, and S. V. Adve. The Java Memory Model. In Proceedings of the Symposium on Principles of Programming Languages (POPL), January 2005.
S. V. Adve and H.-J. Boehm. Memory Models: A Case for Rethinking Parallel Languages and Hardware. Communications of the ACM, Vol. 53, No. 8, pp. 90-101, August 2010. doi:10.1145/1787234.1787255.
J. Ševčík and D. Aspinall. On Validity of Program Transformations in the Java Memory Model. In Proceedings of the European Conference on Object-Oriented Programming (ECOOP), 2008, pp. 27-51.
