RVM-399

Separate Heap For VM objects

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1000
    • Component/s: MMTk, Runtime
    • Labels: None
    • Number of attachments: 0

      Description

      In most JVMs there is no confusion between memory allocated for the application and memory allocated by the running of the VM itself (for example, a call to malloc() within the JIT). However, in Jikes RVM, the VM and the application are both written in Java. Moreover, they currently share the same heap. It would be very desirable to allocate VM and application objects separately. Aside from cleaner accounting and behavior closer to that of a production JVM, there may be opportunities for performance optimization: for example, the lifetimes of objects created by the JIT will typically be bounded by a single compilation.

      This project would start by identifying all transitions from the application into the VM proper and channeling all such transitions through a zero-cost "trap", which simply serves as a marker. The trap can be viewed as analogous to a kernel trap in the OS setting. The project would also involve writing a simple checking routine which would walk the stack and determine whether execution was currently within the VM or application context. The combination of these mechanisms could then be used to identify and verify all application<->VM transitions.
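To make the trap-plus-checker idea concrete, here is a minimal sketch in plain Java. It is not Jikes RVM code: the class `VmTrap` and its methods `enter`, `leave`, and `inVmContext` are hypothetical names, and the stack walk uses the standard `Thread.getStackTrace()` rather than the VM's own stack-walking machinery. In this sketch the trap is an ordinary method call, not the zero-cost marker the description asks for.

```java
import java.util.function.Supplier;

/** Hypothetical marker class: all application<->VM transitions pass through it. */
final class VmTrap {
    private VmTrap() {}

    /** Application -> VM transition. This frame on the stack is the marker. */
    static <T> T enter(Supplier<T> vmWork) {
        return vmWork.get();
    }

    /** VM -> application transition (for example, the VM calling back into user code). */
    static <T> T leave(Supplier<T> appWork) {
        return appWork.get();
    }

    /** Walks the current stack and reports whether execution is in VM context. */
    static boolean inVmContext() {
        String trapClass = VmTrap.class.getName();
        // getStackTrace() lists the innermost frame first, so the first
        // marker found corresponds to the most recent transition.
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            if (!frame.getClassName().equals(trapClass)) {
                continue;
            }
            if (frame.getMethodName().equals("enter")) {
                return true;
            }
            if (frame.getMethodName().equals("leave")) {
                return false;
            }
        }
        return false; // no marker on the stack: still in application code
    }
}
```

A checker of this kind could be called from assertions at suspected transition points; any VM code reached without an `enter` frame on the stack would then indicate a transition that has not yet been channeled through the trap.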

          Activity

          Steve Blackburn added a comment -

          More thoughts:

          Abstractly, we need to maintain a per-thread binary state: user/system. So at any time we should know unambiguously whether the thread is in user state or system state. Note that some code (libraries, for example) may be executed in either user or system state (other code can only ever be system or only ever user). So in general it is a dynamic property, though in some cases it may be statically determinable.

          If we had such a thing working (and we were not concerned about performance), then allocation could easily be made conditional on that state, so allocation would dynamically go to the appropriate heap.
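As a rough illustration of allocation that is conditional on per-thread state, the following sketch keeps the mode in a `ThreadLocal` and uses two lists as stand-ins for the two heaps. Everything here (`ModalAllocator`, `Mode`, `allocate`) is a hypothetical name for illustration, not an MMTk or Jikes RVM interface, and the lists are not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical allocator that picks a heap from the calling thread's mode. */
final class ModalAllocator {
    enum Mode { USER, SYSTEM }

    // Per-thread binary state; in this sketch every thread starts in USER mode.
    private static final ThreadLocal<Mode> MODE = ThreadLocal.withInitial(() -> Mode.USER);

    // Stand-ins for the two heaps (a real heap would be a memory region, not a list).
    final List<Object> userHeap = new ArrayList<>();
    final List<Object> systemHeap = new ArrayList<>();

    static Mode currentMode() {
        return MODE.get();
    }

    static void setMode(Mode mode) {
        MODE.set(mode);
    }

    /** Records obj in the heap that matches the current thread's mode. */
    <T> T allocate(T obj) {
        List<Object> heap = (currentMode() == Mode.SYSTEM) ? systemHeap : userHeap;
        heap.add(obj);
        return obj;
    }
}
```

The mode check on every allocation is the performance cost set aside above; the point of the sketch is only that the heap choice follows the dynamic state rather than the allocation site.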

          When thinking about this, I like to draw an analogy with system mode in an OS, as above.

          A number of questions then need to be answered:

          1. What is the definition of user or system mode? Without this it is hard to draw the boundaries between them. Some thoughts:
          a) The execution of any org.jikesrvm or org.mmtk code probably should be in system mode. One could then construct some rules which covered transitions into and out of libraries etc. Such a rule would require the creation of new packages, such as org.mmtk.helper (for example), to contain helper code which is executed by the user, in user mode, on behalf of the system (e.g. allocation and write barrier fast paths). That code would "trap out" to system mode on the occasions when it needed to take the slow path. (See RVM-415).
          b) The user heap should never contain pointers into the system heap: this is a property that any definition of the modes should presumably uphold.

          2. How do we get this to work correctly? In another setting, Martin Hirzel suggested that one could write a simple function which inspected a call chain and determined whether it was currently in system or user mode. If one had such a function (which could be trivial if one used 1a above), then one could call this function very regularly (every method entry?) and assert that the thread was in fact in the correct mode. Of course this would be expensive, but it could be a good way to ensure correctness.

          3. How do we get this to work efficiently? This is not necessarily hard, and there are plenty of opportunities for fairly straightforward optimizations. However, the focus should be on correctness first.
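One possible reading of rule 1a, combined with the call-chain check from question 2, is sketched below. The class `CallChainMode` and its methods are hypothetical, the call chain is passed in as a list of class names (innermost first) so the rule can be exercised in isolation, and treating `java.` and `javax.` classes as "library code that inherits the caller's mode" is an assumption of this sketch, not something the rule above specifies.

```java
import java.util.List;

/** Hypothetical classifier: derives user/system mode from a call chain. */
final class CallChainMode {
    enum Mode { USER, SYSTEM }

    private CallChainMode() {}

    /** Classifies a call chain given as class names, innermost frame first. */
    static Mode classify(List<String> classNamesInnermostFirst) {
        for (String name : classNamesInnermostFirst) {
            // Helper code runs in user mode on behalf of the system (rule 1a).
            if (name.startsWith("org.mmtk.helper.")) {
                return Mode.USER;
            }
            if (name.startsWith("org.jikesrvm.") || name.startsWith("org.mmtk.")) {
                return Mode.SYSTEM;
            }
            // Assumption: library frames are neutral and inherit the caller's mode.
            if (name.startsWith("java.") || name.startsWith("javax.")) {
                continue;
            }
            return Mode.USER; // anything else is treated as application code
        }
        return Mode.USER;
    }

    /** Asserts that the recorded per-thread state matches the call chain. */
    static void assertMode(Mode recorded, List<String> classNamesInnermostFirst) {
        Mode actual = classify(classNamesInnermostFirst);
        if (actual != recorded) {
            throw new IllegalStateException(
                "thread state is " + recorded + " but call chain implies " + actual);
        }
    }
}
```

In a real VM the list of class names would come from a stack walk, and `assertMode` is the expensive check that could run at every method entry in a debugging build.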

          Ian Rogers added a comment -

          I think I understand your thoughts:

          • have a call in thread to determine whether the thread is in a system or user mode
            • this could be done through setting/clearing a flag on entry/exit to a VM routine
            • this may also be done by inspection of the call stack
          • have an invariant that checks that when modifying VM objects the thread is in VM mode
          • have an invariant that checks VM objects are only reachable from the VM heap (and likewise for the user heap)

          Any thoughts on the JTOC?
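The flag-based option in the list above (set on entry to a VM routine, cleared on exit) might look roughly like this. `ThreadModeFlag` and `runInSystemMode` are hypothetical names; the sketch restores the previous value instead of unconditionally clearing the flag, so that nested VM routines behave sensibly.

```java
import java.util.function.Supplier;

/** Hypothetical per-thread flag set on entry to, and restored on exit from, a VM routine. */
final class ThreadModeFlag {
    private static final ThreadLocal<Boolean> IN_SYSTEM = ThreadLocal.withInitial(() -> false);

    private ThreadModeFlag() {}

    static boolean inSystemMode() {
        return IN_SYSTEM.get();
    }

    /** Runs vmRoutine with the flag set, then restores the previous value. */
    static <T> T runInSystemMode(Supplier<T> vmRoutine) {
        boolean previous = IN_SYSTEM.get();
        IN_SYSTEM.set(true);
        try {
            return vmRoutine.get();
        } finally {
            // Restore rather than clear: an outer VM routine may still be active.
            IN_SYSTEM.set(previous);
        }
    }
}
```

Using `finally` means the flag is also restored when the VM routine throws, which matters if exceptions can propagate from VM code back into the application.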

          Steve Blackburn added a comment -

          That's not quite what I meant. Interesting, anyway. Let me clarify.

          1. Maintain per-thread state (as a bit or whatever), indicating whether in user or system mode.

          2. Primary prerequisite is forming a clean definition of user and system mode.

          3. Given these, then ensure correctness:
          a) by implementing a method that dynamically determines the actual mode, and then implementing an invariant that its value matches the per-thread state,
          b) by implementing a write barrier which maintains the invariant that no user object points to system heap (system objects must be able to point to user heap in some particular cases, such as class objects).

          4. Use 1 & 2 to selectively allocate objects into separate user and system heaps.
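The write barrier from step 3b could be sketched as follows. This is a toy object model, not MMTk's barrier interface: `HeapBarrier`, `Obj`, `isLegal`, and `writeRef` are hypothetical names, and each object carries an explicit heap tag where a real implementation would presumably derive the heap from the object's address.

```java
/** Hypothetical checking write barrier: no user object may point into the system heap. */
final class HeapBarrier {
    enum Heap { USER, SYSTEM }

    /** Toy object: knows which heap it lives in and has a single reference field. */
    static final class Obj {
        final Heap heap;
        Obj ref;

        Obj(Heap heap) {
            this.heap = heap;
        }
    }

    private HeapBarrier() {}

    /** A store is illegal only when a user object would point at a system object. */
    static boolean isLegal(Obj source, Obj target) {
        return target == null
            || !(source.heap == Heap.USER && target.heap == Heap.SYSTEM);
    }

    /** Performs source.ref = target, checking the invariant first. */
    static void writeRef(Obj source, Obj target) {
        if (!isLegal(source, target)) {
            throw new IllegalStateException("user object may not point into the system heap");
        }
        source.ref = target;
    }
}
```

System-to-user stores pass the check, which matches the case noted above where system objects such as class objects need to refer to the user heap.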

          What is your question about the JTOC?

          Ian Rogers added a comment -

          Thanks Steve. For the JTOC, do we need to distinguish VM statics/literals from user statics/literals in order to maintain the reachability invariant? Would this be more cleanly implemented with RVM-324?


            People

            • Assignee: Unassigned
            • Reporter: Ian Rogers
            • Votes: 1
            • Watchers: 1
