Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.9.3
Fix Version/s: 3.0
Component/s: Instruction Architecture: 64bit
Labels: None
Description
The stack appended below is a fairly widespread crash symptom on the latest ppc64-aix runs (http://jikesrvm.anu.edu.au/cattrack/results/excalibur.watson.ibm.com/core-ppc64/3750). This particular one is in _213_javac, but very similar stacks are showing up as the cause of most of our failures.
I'm a little surprised that we think we're using addresses as high as 30cc7c57fffffff0, so my initial wild guess is that there is some 32-bit assumption in and around the MMTk code in this stack trace. I'll do a little poking around to see if I can determine what is happening.
mem=30cc7c57fffffff0
fp=000000004000ea68
pr=00000000310a6d60
trap/exception: type=Segmentation fault
ip=00000000343342c8
instr=0x7c64182a
exn_handler=0000000034023af8
lr=0000000034299654
pthread_self=0000000000000001
JikesRVM: internal error trap
Fatal error: Unknown hardware trap within uninterruptible region.
Died in GC:
Exiting virtual machine due to uninterruptibility violation.
-- Stack --
at [0x000000004000e960] Lorg/jikesrvm/VM; sysFail(Ljava/lang/String;)V at line 2044
at [0x000000004000e990] Lorg/jikesrvm/runtime/VM_Runtime; deliverHardwareException(II)V at line 773
at [0x000000004000ea50] <hardware trap>
at [0x000000004000ea68] Lorg/jikesrvm/objectmodel/VM_JavaHeader; readAvailableBitsWord(Ljava/lang/Object;)Lorg/vmmagic/unboxed/Word; at line 617
at [0x000000004000eaa0] Lorg/jikesrvm/objectmodel/VM_ObjectModel; readAvailableBitsWord(Ljava/lang/Object;)Lorg/vmmagic/unboxed/Word; at line 504
at [0x000000004000ead0] Lorg/jikesrvm/mm/mmtk/ObjectModel; readAvailableBitsWord(Lorg/vmmagic/unboxed/ObjectReference;)Lorg/vmmagic/unboxed/Word; at line 358
at [0x000000004000eb10] Lorg/mmtk/policy/LargeObjectSpace; isInNursery(Lorg/vmmagic/unboxed/ObjectReference;)Z at line 258
at [0x000000004000eb58] Lorg/mmtk/policy/LargeObjectSpace; traceObject(Lorg/mmtk/plan/TransitiveClosure;Lorg/vmmagic/unboxed/ObjectReference;)Lorg/vmmagic/unboxed/ObjectReference; at line 166
at [0x000000004000ebc8] Lorg/mmtk/plan/generational/GenMatureTraceLocal; traceObject(Lorg/vmmagic/unboxed/ObjectReference;)Lorg/vmmagic/unboxed/ObjectReference; at line 116
at [0x000000004000ec20] Lorg/mmtk/plan/generational/marksweep/GenMSMatureTraceLocal; traceObject(Lorg/vmmagic/unboxed/ObjectReference;)Lorg/vmmagic/unboxed/ObjectReference; at line 55
at [0x000000004000ec70] Lorg/mmtk/plan/TraceLocal; traceObject(Lorg/vmmagic/unboxed/ObjectReference;Z)Lorg/vmmagic/unboxed/ObjectReference; at line 301
at [0x000000004000ecc8] Lorg/mmtk/plan/TraceLocal; processRootEdge(Lorg/vmmagic/unboxed/Address;Z)V at line 123
at [0x000000004000ed48] Lorg/jikesrvm/mm/mmtk/ScanBootImage; processChunk(Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/mmtk/plan/TraceLocal;)V at line 140
at [0x000000004000ee28] Lorg/jikesrvm/mm/mmtk/ScanBootImage; scanBootImage(Lorg/mmtk/plan/TraceLocal;)V at line 79
at [0x000000004000eee8] Lorg/jikesrvm/mm/mmtk/Scanning; computeBootImageRoots(Lorg/mmtk/plan/TraceLocal;)V at line 333
at [0x000000004000ef28] Lorg/mmtk/plan/generational/GenCollector; collectionPhase(SZ)V at line 99
at [0x000000004000ef88] Lorg/mmtk/plan/generational/marksweep/GenMSCollector; collectionPhase(SZ)V at line 143
at [0x000000004000efe8] Lorg/mmtk/plan/Phase; processPhaseStack(Z)Z at line 477
at [0x000000004000f0f8] Lorg/mmtk/plan/Phase; beginNewPhaseStack(I)Z at line 390
at [0x000000004000f140] Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 39
at [0x000000004000f170] Lorg/jikesrvm/memorymanagers/mminterface/VM_CollectorThread; run()V at line 385
at [0x000000004000f300] Lorg/jikesrvm/scheduler/VM_Thread; startoff()V at line 617
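As a quick sanity check on the faulting address from the trap report above (mem=30cc7c57fffffff0), the value can be split into its two 32-bit halves. This is only a minimal sketch for inspection; the class and method names are made up for illustration and nothing here comes from the Jikes RVM sources. A low half that reads as a small negative int is compatible with a 32-bit quantity having leaked into a 64-bit address, but it does not prove it.

```java
public class FaultAddressHalves {
    // Upper 32 bits of a 64-bit value, returned as an int.
    static int highHalf(long value) {
        return (int) (value >>> 32);
    }

    // Lower 32 bits of a 64-bit value, returned as an int.
    static int lowHalf(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        // mem= value copied from the trap report.
        long mem = Long.parseUnsignedLong("30cc7c57fffffff0", 16);
        int high = highHalf(mem);
        int low = lowHalf(mem);
        System.out.println("high = 0x" + Integer.toHexString(high));
        System.out.println("low  = 0x" + Integer.toHexString(low)
                + " (" + low + " as a signed int)");
    }
}
```

The low half, fffffff0, is -16 when read as a signed 32-bit integer, while the high half, 30cc7c57, looks like an unrelated 32-bit value.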
I've looked at 10 different failures. They all have the appended portion of the stack in common:
at [0x000000004001a168] Lorg/jikesrvm/mm/mmtk/ScanBootImage; processChunk(Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/mmtk/plan/TraceLocal;)V at line 140
at [0x000000004001a248] Lorg/jikesrvm/mm/mmtk/ScanBootImage; scanBootImage(Lorg/mmtk/plan/TraceLocal;)V at line 79
at [0x000000004001a308] Lorg/jikesrvm/mm/mmtk/Scanning; computeBootImageRoots(Lorg/mmtk/plan/TraceLocal;)V at line 333
So I think there's a very good chance that the code that builds up references from the encoded boot image map is not correct on 64-bit platforms. There are a couple of suspicious hard-coded 4s and int/word conversions in org.jikesrvm.mm.mmtk.ScanBootImage.
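To make the suspicion concrete, here is a minimal, self-contained sketch of how a hard-coded 4 and an unmasked int-to-word conversion can go wrong when words are 8 bytes. It is not the ScanBootImage code: the class, the three-slot "image", and all names and values are hypothetical, and java.nio.ByteBuffer merely stands in for raw memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class WordStrideSketch {
    static final int BYTES_IN_WORD_64 = 8; // word size on a 64-bit target
    static final int HARD_CODED_4 = 4;     // literal that is only right on 32-bit

    // Hypothetical image of three 64-bit reference slots (big-endian, as on PowerPC).
    static ByteBuffer sampleImage() {
        ByteBuffer image = ByteBuffer.allocate(3 * BYTES_IN_WORD_64)
                .order(ByteOrder.BIG_ENDIAN);
        image.putLong(0, 0x0000000031000010L);
        image.putLong(8, 0x0000000031000020L);
        image.putLong(16, 0x0000000031000030L);
        return image;
    }

    // Read the 64-bit slot named by a word-granular offset,
    // scaling the offset by the given number of bytes per word.
    static long readSlot(ByteBuffer image, int wordOffset, int bytesPerWord) {
        return image.getLong(wordOffset * bytesPerWord);
    }

    // Widening an int to a 64-bit word sign-extends it...
    static long signExtended(int v) {
        return v;
    }

    // ...unless the upper bits are masked off explicitly.
    static long zeroExtended(int v) {
        return v & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        ByteBuffer image = sampleImage();
        long good = readSlot(image, 1, BYTES_IN_WORD_64);
        long bad = readSlot(image, 1, HARD_CODED_4);
        System.out.println("8-byte stride: 0x" + Long.toHexString(good));
        System.out.println("4-byte stride: 0x" + Long.toHexString(bad));

        int raw = 0xfffffff0;
        System.out.println("sign-extended: 0x" + Long.toHexString(signExtended(raw)));
        System.out.println("zero-extended: 0x" + Long.toHexString(zeroExtended(raw)));
    }
}
```

With the 4-byte stride, the read straddles two slots: the low half of slot 0 lands in the high half of the result, giving 0x3100001000000000 instead of 0x31000020. A bogus, very high "address" of this kind is what a 32-bit assumption in reference decoding could plausibly produce, though the actual defect may differ.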