Details
Description
In most JVMs there is no confusion between memory allocated for the application and memory allocated by the VM itself in the course of running (for example, a call to malloc() within the JIT). However, in Jikes RVM, the VM and the application are both written in Java. Moreover, they currently share the same heap. It would be very desirable to improve this situation and allocate VM and application objects separately. Aside from cleaner accounting and behaving more like a production JVM, there may be opportunities for performance optimizations since, for example, the lifetimes of objects created by the JIT will typically be bounded by the invocation of a single compilation.
This project would start by identifying all transitions from the application into the VM proper and channeling all such transitions through a zero-cost "trap", which simply serves as a marker. The trap can be viewed as analogous to a kernel trap in the OS setting. The project would also involve writing a simple checking routine which would walk the stack and determine whether execution was currently within the VM or application context. The combination of these mechanisms could then be used to identify and verify all application<->VM transitions.
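The marker-plus-stack-walk idea can be sketched in plain Java on a stock JVM. This is only an illustration, not Jikes RVM code: the class VmTrap and its methods enterVm, exitToApp and inVmContext are hypothetical names, the reverse (VM to application) marker is an added assumption, and a stack with no marker frame is assumed to be in application context. Thread.getStackTrace() is far from zero-cost, so the walk stands in for whatever cheap frame inspection the VM itself would use.

```java
import java.util.function.Supplier;

/** Hypothetical marker class: every mode transition passes through one of its two methods. */
final class VmTrap {
    private VmTrap() {}

    /** Application -> VM transition. The frame of this method is the marker. */
    static <T> T enterVm(Supplier<T> vmWork) {
        return vmWork.get();
    }

    /** VM -> application transition (for example, the VM invoking application code). */
    static <T> T exitToApp(Supplier<T> appWork) {
        return appWork.get();
    }

    /**
     * Walks the current thread's stack from the innermost frame outward and
     * reports the context implied by the first marker frame found.
     * Assumption: a stack with no marker frame is in application context.
     */
    static boolean inVmContext() {
        String trapClass = VmTrap.class.getName();
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            if (!frame.getClassName().equals(trapClass)) {
                continue; // not a marker frame
            }
            if (frame.getMethodName().equals("enterVm")) {
                return true;
            }
            if (frame.getMethodName().equals("exitToApp")) {
                return false;
            }
        }
        return false;
    }
}
```

Because the innermost marker wins, nested transitions (application into VM and back out into application code) resolve to the most recent one, much as a return from a kernel trap restores user mode.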
More thoughts:
Abstractly, we need to maintain a per-thread binary state: user/system. So at any time we should know unambiguously whether the thread is in user state or system state. Note that some code (libraries, for example) may be executed in either user or system state, whereas other code can only ever be system or only ever user. So in general it is a dynamic property, though in some cases it may be statically determinable.
If we had such a thing working (and we were not concerned about performance), then allocation could easily be made conditional on that state, so allocation would dynamically go to the appropriate heap.
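As a toy model of state-conditional allocation, the sketch below keeps the user/system bit in a ThreadLocal and routes each allocation to one of two lists standing in for the two heaps. All names here (ModalAllocator, ExecMode, runInSystemMode) are hypothetical, threads are assumed to start in user mode, and a real implementation would live inside the VM's allocator rather than wrap ordinary Java objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: two lists stand in for the user heap and the system heap. */
final class ModalAllocator {
    enum ExecMode { USER, SYSTEM }

    // Per-thread binary state; assumption: every thread starts in USER mode.
    private static final ThreadLocal<ExecMode> MODE =
            ThreadLocal.withInitial(() -> ExecMode.USER);

    private final List<Object> userHeap = new ArrayList<>();
    private final List<Object> systemHeap = new ArrayList<>();

    static ExecMode currentMode() {
        return MODE.get();
    }

    /** Runs work in SYSTEM mode and restores the previous mode afterwards. */
    static void runInSystemMode(Runnable work) {
        ExecMode saved = MODE.get();
        MODE.set(ExecMode.SYSTEM);
        try {
            work.run();
        } finally {
            MODE.set(saved);
        }
    }

    /** Records obj in the heap that matches the calling thread's current mode. */
    synchronized <T> T allocate(T obj) {
        if (MODE.get() == ExecMode.SYSTEM) {
            systemHeap.add(obj);
        } else {
            userHeap.add(obj);
        }
        return obj;
    }

    synchronized int userHeapSize() {
        return userHeap.size();
    }

    synchronized int systemHeapSize() {
        return systemHeap.size();
    }
}
```

The mode check on every allocation is exactly the cost the parenthetical above sets aside; the point of the sketch is only that the dispatch itself is simple once the state is reliable.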
When thinking about this, I like to draw an analogy with system mode in an OS, as above.
A number of questions then need to be answered:
1. What is the definition of user or system mode? Without this it is hard to draw the boundaries between them.
Some thoughts:
a) The execution of any org.jikesrvm or org.mmtk code probably should be in system mode. One could then construct some rules which covered transitions into and out of libraries etc. Such a rule would require the creation of new packages, such as org.mmtk.helper (for example), to contain helper code which is executed by the user, in user mode, on behalf of the system (e.g., allocation and write barrier fast paths). That code would "trap out" to system mode on the occasions when it needed to take the slow path. (See RVM-415).
b) The user heap should never contain pointers into the system heap: this is a property that any definition of the modes should presumably uphold.
2. How do we get this to work correctly? In another setting, Martin Hirzel suggested that one could write a simple function which inspected a call chain and determined whether it was currently in system or user mode. If one had such a function (which could be trivial if one used 1a) above), then one could call this function very regularly (every method entry?) and assert that the thread was in fact in the correct mode. Of course this would be expensive, but it could be a good way to ensure correctness.
3. How do we get this to work efficiently? Not necessarily hard, and plenty of opportunities for some fairly straightforward optimizations. However the focus should be on correctness first.
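The call-chain function from question 2 can be sketched using the package rule from 1a). This is one possible reading, not a settled definition: the class ModeChecker is hypothetical, library frames and the suggested org.mmtk.helper package are treated as mode-neutral (they inherit the mode of whatever called them), and a chain with no decisive frame is assumed to be in system mode.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical checker: derives the mode from the packages on the call chain. */
final class ModeChecker {
    enum Mode { USER, SYSTEM }

    private ModeChecker() {}

    /** Neutral frames run in either mode, so they inherit it from their caller. */
    private static boolean isNeutral(String className) {
        return className.equals(ModeChecker.class.getName()) // the checker's own frames
                || className.startsWith("java.")
                || className.startsWith("javax.")
                || className.startsWith("jdk.")
                || className.startsWith("sun.")
                || className.startsWith("org.mmtk.helper."); // assumed helper package from 1a)
    }

    private static boolean isSystem(String className) {
        return className.startsWith("org.jikesrvm.") || className.startsWith("org.mmtk.");
    }

    /** Classifies a call chain given as class names, innermost frame first. */
    static Mode classify(List<String> classNamesInnermostFirst) {
        for (String className : classNamesInnermostFirst) {
            if (isNeutral(className)) {
                continue; // look further out for a decisive frame
            }
            return isSystem(className) ? Mode.SYSTEM : Mode.USER;
        }
        return Mode.SYSTEM; // assumption: no decisive frame means system mode
    }

    /** Classifies the calling thread's current stack. */
    static Mode currentMode() {
        List<String> chain = new ArrayList<>();
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            chain.add(frame.getClassName());
        }
        return classify(chain);
    }

    /** The expensive correctness check: fail loudly if the tracked mode is wrong. */
    static void assertMode(Mode expected) {
        Mode actual = currentMode();
        if (actual != expected) {
            throw new IllegalStateException("expected " + expected + " but stack implies " + actual);
        }
    }
}
```

Calling assertMode at method entry, with the expected value taken from the per-thread state, gives the regular cross-check described in question 2. It is slow by design and would be enabled only while validating the transitions.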