Certain classes of garbage collection require write barriers on primitives as well as references. This patch adds the required support to JikesRVM and MMTk for primitive write barriers on both Intel and PowerPC using either the Baseline or Opt compiler.
Specifically, this patch adds support for write barriers on primitive putfields and primitive array stores. Object cloning and reflection code has been updated to use primitive write barriers as required. If desired, a separate patch can be made available that provides a collector demonstrating the use of primitive write barriers.
As would be expected, applying this patch to trunk (r15745) results in no measurable slowdown to configurations that do not require primitive write barriers - the optimising compiler removes all primitive write barrier specific code (detailed measurements below).
In adding primitive write barriers to the compilers, a number of enhancements were made to Magic operations that should be of benefit to all users:
- Address.store can now store a boolean with optional offset
- Added Magic.setFloatAtOffset and setBooleanAtOffset (with optional locationMetadata)
- Optional locationMetadata argument for Magic.setByteAtOffset, setCharAtOffset, setShortAtOffset, setLongAtOffset, setDoubleAtOffset and setIntAtOffset
It was decided to use the Java type system to provide a separate MMTk write barrier for each Java type (char, short etc.) rather than abuse the type system by having a different barrier for each field size (byte, short, word and double word). Whilst this approach leads to a larger patch, we believe that:
a) preserving type safety is important
b) it improves readability of the code
c) this technique allows for accounting by type
d) as the barriers are inlined, the extra barrier methods should add only a small cost to the compiler at runtime and no additional mutator overhead (although this has not been measured)
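To illustrate points (a) and (c), here is a minimal, self-contained Java sketch of per-type barriers. It is not code from the patch: the names TypedBarrierSketch, Kind, CountingBarrier, charWrite, shortWrite and writesOf are hypothetical and do not correspond to MMTk or JikesRVM APIs. It only shows that char and short stores, which have the same field size, stay distinguishable when each Java type has its own barrier method.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names come from MMTk or the patch. */
public class TypedBarrierSketch {

  /** The two 16-bit Java primitive types used in this illustration. */
  enum Kind { CHAR, SHORT }

  /** One barrier method per Java type, rather than one per field size. */
  static final class CountingBarrier {
    private final Map<Kind, Integer> writes = new EnumMap<>(Kind.class);

    /** Barrier for char array stores: record the write, then perform it. */
    void charWrite(char[] target, int index, char value) {
      count(Kind.CHAR);
      target[index] = value;
    }

    /** Barrier for short array stores: same width as char, but a distinct type. */
    void shortWrite(short[] target, int index, short value) {
      count(Kind.SHORT);
      target[index] = value;
    }

    /** Number of barriered writes seen so far for the given type. */
    int writesOf(Kind kind) {
      return writes.getOrDefault(kind, 0);
    }

    private void count(Kind kind) {
      writes.merge(kind, 1, Integer::sum);
    }
  }

  public static void main(String[] args) {
    CountingBarrier barrier = new CountingBarrier();
    char[] chars = new char[2];
    short[] shorts = new short[2];

    barrier.charWrite(chars, 0, 'a');
    barrier.charWrite(chars, 1, 'b');
    barrier.shortWrite(shorts, 0, (short) 7);

    System.out.println("char writes:  " + barrier.writesOf(Kind.CHAR));
    System.out.println("short writes: " + barrier.writesOf(Kind.SHORT));
  }
}
```

A barrier keyed only on field size would have to take both of these stores through a single 16-bit entry point, losing the static type of the value and the ability to count the two kinds of write separately.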
The code styles for the IA32 and PowerPC compilers are very different, and this patch attempts to implement the primitive write barriers in a native style for each compiler. Where possible, helper methods have been used to reduce code size and boilerplate. For a change of this size I fully support a review of the code and it being signed off by the compiler maintainers before it enters trunk.
Quick performance numbers:
The performance of a clean checkout of trunk was compared to the performance of trunk with the patch applied on a number of IA32 machines. Each benchmark was run with 3x minimum heap for 6 iterations within a single RVM invocation, and this was repeated for a total of 5 invocations per build/benchmark. A compiler advice file was used to keep the compiler workload constant, and the machines had networking disabled. The geomean of total execution time for each build/benchmark was calculated and used to calculate the overhead between builds:
Benchmark: Relative overhead with patch applied: