Details

    • Type: Bug Bug
    • Status: Closed Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0
    • Component/s: None
    • Labels:
      None
    • Number of attachments :
      0

      Activity

      Hide
      David Grove added a comment -

      I looked at half a dozen; they all have this as the stacktrace:

      JikesRVM: internal error: recursive use of hardware exception registers (exiting)

      – Stack –
      at [0x684f39f8] Lorg/jikesrvm/runtime/RuntimeEntrypoints; deliverHardwareException(II)V at line 694
      at [0x684f3a0c] <hardware trap>
      at [0x684fb8c4] Lorg/jikesrvm/scheduler/RVMThread; <init>([BLjava/lang/Thread;Ljava/lang/String;ZZI)V
      at [0x684fb914] Lorg/jikesrvm/scheduler/greenthreads/GreenThread; <init>([BLjava/lang/Thread;Ljava/lang/String;ZZI)V at line 134
      at [0x684fb914] Lorg/jikesrvm/scheduler/greenthreads/GreenThread; <init>(Ljava/lang/String;)V at line 99
      at [0x684fb914] Lorg/jikesrvm/scheduler/Scheduler$ThreadModel; <init>(Ljava/lang/String;)V at line 55
      at [0x684fb914] Lorg/jikesrvm/adaptive/controller/ControllerThread; <init>(Ljava/lang/Object;)V at line 52
      at [0x684fb914] Lorg/jikesrvm/adaptive/controller/Controller; createControllerThread()V at line 252
      at [0x684fb950] Lorg/jikesrvm/adaptive/controller/Controller; boot()V at line 177
      at [0x684fb990] Lorg/jikesrvm/VM; finishBooting()V at line 424
      at [0x684fb9c0] Lorg/jikesrvm/VM; boot()V at line 157

      Show
      David Grove added a comment - I looked at half a dozen; they all have this as the stacktrace: JikesRVM: internal error: recursive use of hardware exception registers (exiting) – Stack – at [0x684f39f8] Lorg/jikesrvm/runtime/RuntimeEntrypoints; deliverHardwareException(II)V at line 694 at [0x684f3a0c] <hardware trap> at [0x684fb8c4] Lorg/jikesrvm/scheduler/RVMThread; <init>([BLjava/lang/Thread;Ljava/lang/String;ZZI)V at [0x684fb914] Lorg/jikesrvm/scheduler/greenthreads/GreenThread; <init>([BLjava/lang/Thread;Ljava/lang/String;ZZI)V at line 134 at [0x684fb914] Lorg/jikesrvm/scheduler/greenthreads/GreenThread; <init>(Ljava/lang/String;)V at line 99 at [0x684fb914] Lorg/jikesrvm/scheduler/Scheduler$ThreadModel; <init>(Ljava/lang/String;)V at line 55 at [0x684fb914] Lorg/jikesrvm/adaptive/controller/ControllerThread; <init>(Ljava/lang/Object;)V at line 52 at [0x684fb914] Lorg/jikesrvm/adaptive/controller/Controller; createControllerThread()V at line 252 at [0x684fb950] Lorg/jikesrvm/adaptive/controller/Controller; boot()V at line 177 at [0x684fb990] Lorg/jikesrvm/VM; finishBooting()V at line 424 at [0x684fb9c0] Lorg/jikesrvm/VM; boot()V at line 157
      David Grove made changes -
      Field Original Value New Value
      Priority Critical [ 2 ] Blocker [ 1 ]
      Hide
      David Grove added a comment -

      This seems to have gone away in the last two regression runs, but I want to leave it open until I can have a chance to force GC to happen in exactly this place again by injecting an explicit System.gc() at the right program point. I'm suspicious that there is a GC map bug here that we just weren't hitting before, we hit it for a few sanity runs, then we stopped hitting it again.

      Show
      David Grove added a comment - This seems to have gone away in the last two regression runs, but I want to leave it open until I can have a chance to force GC to happen in exactly this place again by injecting an explicit System.gc() at the right program point. I'm suspicious that there is a GC map bug here that we just weren't hitting before, we hit it for a few sanity runs, then we stopped hitting it again.
      Hide
      David Grove added a comment -

      By applying this patch to a development image, I get a 100% reproducible crash in GC during VM bootup

      Index: /home/dgrove/rvm-trunk/rvm/src/org/jikesrvm/scheduler/RVMThread.java
      ===================================================================
      — /home/dgrove/rvm-trunk/rvm/src/org/jikesrvm/scheduler/RVMThread.java (revision 14600)
      +++ /home/dgrove/rvm-trunk/rvm/src/org/jikesrvm/scheduler/RVMThread.java (working copy)
      @@ -410,6 +410,8 @@
      contextRegisters = new Registers();
      exceptionRegisters = new Registers();

      + if (VM.fullyBooted) System.gc();
      +
      if(VM.VerifyAssertions) VM._assert(stack != null);
      // put self in list of threads known to scheduler and garbage collector
      if (!VM.runningVM)

      { @@ -466,6 +468,7 @@ // create wrapper Thread if doesn't exist this.thread = java.lang.JikesRVMSupport.createThread(this, name); }

      + if (VM.fullyBooted) System.gc();
      }
      }

      Show
      David Grove added a comment - By applying this patch to a development image, I get a 100% reproducible crash in GC during VM bootup Index: /home/dgrove/rvm-trunk/rvm/src/org/jikesrvm/scheduler/RVMThread.java =================================================================== — /home/dgrove/rvm-trunk/rvm/src/org/jikesrvm/scheduler/RVMThread.java (revision 14600) +++ /home/dgrove/rvm-trunk/rvm/src/org/jikesrvm/scheduler/RVMThread.java (working copy) @@ -410,6 +410,8 @@ contextRegisters = new Registers(); exceptionRegisters = new Registers(); + if (VM.fullyBooted) System.gc(); + if(VM.VerifyAssertions) VM._assert(stack != null); // put self in list of threads known to scheduler and garbage collector if (!VM.runningVM) { @@ -466,6 +468,7 @@ // create wrapper Thread if doesn't exist this.thread = java.lang.JikesRVMSupport.createThread(this, name); } + if (VM.fullyBooted) System.gc(); } }
      Hide
      David Grove added a comment -

      Have to stop for the day. I've traced it down to what looks like fairly convincing evidence that the call to AddressArray.create in JNIEnvironment.initializeState causes the TIB of the contextRegisters object of another thread (ie, not the executing one) to get smashed, even though the objects don't appear to overlap. This happens on a prototype-opt image.even if recompilation is disabled (as long as profiling is enabled, so we create the organizer thread).

      ./dist/prototype-opt_x86_64-linux/rvm -X:aos:enable_recompilation=false -X:aos:gather_profile_data=true -X:verbose -verbose:gc

      With this instrumentation (plus some more higher in the call stack).

      protected void initializeState(RVMThread target )

      { if (VM.runningVM) VM.sysWriteln("initializeState 1 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); if (VM.runningVM) VM.sysWriteln("this is ", Magic.objectAsAddress(this)); JNIRefs = AddressArray.create(JNIREFS_ARRAY_LENGTH + JNIREFS_FUDGE_LENGTH); VM.sysWriteln("Allocated JNIRefs array is ", Magic.objectAsAddress(JNIRefs)); VM.sysWriteln("sized ", JNIREFS_ARRAY_LENGTH + JNIREFS_FUDGE_LENGTH); VM.sysWriteln("check size ", JNIRefs.length()); if (VM.runningVM) VM.sysWriteln("initializeState 10 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); JNIRefsTop = 0; JNIRefsSavedFP = 0; if (VM.runningVM) VM.sysWriteln("initializeState 20 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); JNIRefsMax = (JNIREFS_ARRAY_LENGTH - 1) << LOG_BYTES_IN_ADDRESS; alwaysHasNativeFrame = false; if (VM.runningVM) VM.sysWriteln("initializeState 100 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); }

      I see the appended output.

      allocateEnvironment 1 0x9f80014c TIB 0x600c3df4
      allocateEnvironment 10 0x9f80014c TIB 0x600c3df4
      the allocated environment is 0x70016318
      initializeState 1 0x9f80014c TIB 0x600c3df4
      this is 0x70016318
      Allocated JNIRefs array is 0x9f800018
      sized 150
      check size 150
      initializeState 10 0x9f80014c TIB 0x00000000
      initializeState 20 0x9f80014c TIB 0x00000000
      initializeState 100 0x9f80014c TIB 0x00000000
      allocateEnvironment 100 0x9f80014c TIB 0x00000000
      NEW THREAD cREG P30 0x9f80014c TIB 0x00000000

      It's 100% deterministic at least, and seems to be not impacted by adding more and more sysWrites....

      Show
      David Grove added a comment - Have to stop for the day. I've traced it down to what looks like fairly convincing evidence that the call to AddressArray.create in JNIEnvironment.initializeState causes the TIB of the contextRegisters object of another thread (ie, not the executing one) to get smashed, even though the objects don't appear to overlap. This happens on a prototype-opt image.even if recompilation is disabled (as long as profiling is enabled, so we create the organizer thread). ./dist/prototype-opt_x86_64-linux/rvm -X:aos:enable_recompilation=false -X:aos:gather_profile_data=true -X:verbose -verbose:gc With this instrumentation (plus some more higher in the call stack). protected void initializeState(RVMThread target ) { if (VM.runningVM) VM.sysWriteln("initializeState 1 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); if (VM.runningVM) VM.sysWriteln("this is ", Magic.objectAsAddress(this)); JNIRefs = AddressArray.create(JNIREFS_ARRAY_LENGTH + JNIREFS_FUDGE_LENGTH); VM.sysWriteln("Allocated JNIRefs array is ", Magic.objectAsAddress(JNIRefs)); VM.sysWriteln("sized ", JNIREFS_ARRAY_LENGTH + JNIREFS_FUDGE_LENGTH); VM.sysWriteln("check size ", JNIRefs.length()); if (VM.runningVM) VM.sysWriteln("initializeState 10 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); JNIRefsTop = 0; JNIRefsSavedFP = 0; if (VM.runningVM) VM.sysWriteln("initializeState 20 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); JNIRefsMax = (JNIREFS_ARRAY_LENGTH - 1) << LOG_BYTES_IN_ADDRESS; alwaysHasNativeFrame = false; if (VM.runningVM) VM.sysWriteln("initializeState 100 ", Magic.objectAsAddress(target.contextRegisters), " TIB ", Magic.objectAsAddress(ObjectModel.getTIB(target.contextRegisters))); } I see the appended output. allocateEnvironment 1 0x9f80014c TIB 0x600c3df4 allocateEnvironment 10 0x9f80014c TIB 0x600c3df4 the allocated environment is 0x70016318 initializeState 1 0x9f80014c TIB 0x600c3df4 this is 0x70016318 Allocated JNIRefs array is 0x9f800018 sized 150 check size 150 initializeState 10 0x9f80014c TIB 0x00000000 initializeState 20 0x9f80014c TIB 0x00000000 initializeState 100 0x9f80014c TIB 0x00000000 allocateEnvironment 100 0x9f80014c TIB 0x00000000 NEW THREAD cREG P30 0x9f80014c TIB 0x00000000 It's 100% deterministic at least, and seems to be not impacted by adding more and more sysWrites....
      Hide
      Daniel Frampton added a comment -

      This error was introduced with the read barrier branch – sorry for the time it took tracking it down.

      The register fields are explicitly traced by GC through the thread table, so we must keep the new register objects alive until the thread slot has been assigned.

      Fixed in 14644.

      Show
      Daniel Frampton added a comment - This error was introduced with the read barrier branch – sorry for the time it took tracking it down. The register fields are explicitly traced by GC through the thread table, so we must keep the new register objects alive until the thread slot has been assigned. Fixed in 14644.
      Daniel Frampton made changes -
      Resolution Fixed [ 1 ]
      Status Open [ 1 ] Resolved [ 5 ]
      Hide
      David Grove added a comment -

      reopening so I can modify fix target to 3.0

      Show
      David Grove added a comment - reopening so I can modify fix target to 3.0
      David Grove made changes -
      Resolution Fixed [ 1 ]
      Status Resolved [ 5 ] Reopened [ 4 ]
      David Grove made changes -
      Fix Version/s 2.9.4 [ 14162 ]
      Fix Version/s 3.0 [ 13530 ]
      David Grove made changes -
      Resolution Fixed [ 1 ]
      Status Reopened [ 4 ] Closed [ 6 ]

        People

        • Assignee:
          Unassigned
          Reporter:
          David Grove
        • Votes:
          0 Vote for this issue
          Watchers:
          0 Start watching this issue

          Dates

          • Created:
            Updated:
            Resolved: