RVM

Crash when running Production_Opt0_perf portion of compiler-dna test run

Details

  • Type: Bug
  • Status: Closed
  • Priority: Blocker
  • Resolution: Fixed
  • Affects Version/s: None
  • Fix Version/s: 3.0
  • Component/s: None
  • Labels: None
  • Number of attachments: 0

Description

Running this portion of the compiler-dna test run on linux-ia32 fairly reliably results in a crash a few iterations into _213_javac.

This is a non-adaptive configuration that compiles everything at O0 and runs 20 iterations of every SPECjvm98 benchmark in a single JVM instance. Since the crash is not in the first iteration, it's probably not an optimization/codegen bug. It smells more like a GC map problem.

This is not a configuration/test we normally run, and my recollection is that I saw the same crash in July 2007 (the last time I gathered the DNA). In 2007, I worked around it by manually running the O0 perf runs one benchmark at a time.

Activity

David Grove added a comment -

The crash occurs here:

– Stack –
at [0x7000fa00] Lorg/jikesrvm/VM; sysFail(Ljava/lang/String;)V at line 2116
at [0x7000fa38] Lorg/jikesrvm/runtime/RuntimeEntrypoints; deliverHardwareException(II)V at line 682
at [0x7000fa4c] <hardware trap>
at [0x70015e04] Lorg/jikesrvm/mm/mmtk/ObjectModel; copy(Lorg/vmmagic/unboxed/ObjectReference;I)Lorg/vmmagic/unboxed/ObjectReference; at line 50
at [0x70015e38] Lorg/mmtk/policy/CopySpace; traceObject(Lorg/mmtk/plan/TransitiveClosure;Lorg/vmmagic/unboxed/ObjectReference;I)Lorg/vmmagic/unboxed/ObjectReference; at line 187
at [0x70015e60] Lorg/mmtk/plan/generational/GenNurseryTraceLocal; traceObject(Lorg/vmmagic/unboxed/ObjectReference;)Lorg/vmmagic/unboxed/ObjectReference; at line 87
at [0x70015e80] Lorg/mmtk/plan/TraceLocal; retainForFinalize(Lorg/vmmagic/unboxed/ObjectReference;)Lorg/vmmagic/unboxed/ObjectReference; at line 416
at [0x70015ec0] Lorg/mmtk/utility/Finalizer; moveToFinalizable(Lorg/mmtk/plan/TraceLocal;)I at line 236
at [0x70015ef0] Lorg/mmtk/plan/SimpleCollector; collectionPhase(SZ)V at line 110
at [0x70015f1c] Lorg/mmtk/plan/generational/GenCollector; collectionPhase(SZ)V at line 120
at [0x70015f50] Lorg/mmtk/plan/generational/marksweep/GenMSCollector; collectionPhase(SZ)V at line 144
at [0x70015fa8] Lorg/mmtk/plan/Phase; processPhaseStack(Z)Z at line 477
at [0x70015fc8] Lorg/mmtk/plan/Phase; beginNewPhaseStack(I)Z at line 390
at [0x70015fdc] Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 39
at [0x70016018] Lorg/jikesrvm/memorymanagers/mminterface/CollectorThread; run()V at line 385
at [0x70016040] Lorg/jikesrvm/scheduler/RVMThread; startoff()V at line 620

and happens even if -X:aos:enable_recompilation=false is given on the command line. Therefore, if this is an optimizing compiler problem, it is due to Opt0 compilation of the bootimage (presumably code related to finalization), not to O0 compilation at runtime.

The crash tends to occur 15-25 iterations into _213_javac.

../rvm-trunk/dist/production_Opt_0_x86_64-linux/rvm -X:aos:enable_recompilation=false SpecApplication -s100 -m100 -M100 -a _213_javac

David Grove added a comment -

I can run hundreds of iterations of _213_javac using baseline, Opt1, or Opt2 compiled bootimages, so it looks very likely that the problem is in O0 compilation of the bootimage.

David Grove added a comment -

Using OptTestHarness to force org.mmtk.utility.Finalizer.addCandidate to be recompiled at O1 instead of O0 before we start running _213_javac appears to be sufficient to avoid the crash.

David Grove added a comment - - edited

Specifically, this command line results in a successful execution:

../rvm-trunk/dist/production_Opt_0_x86_64-linux/rvm -X:aos:enable_recompilation=false org.jikesrvm.tools.oth.OptTestHarness -oc:O1 -oc:verbose=true -oc:O1 -method org.mmtk.utility.Finalizer addCandidate - -er SpecApplication main - -s100 -m100 -M100 -a _213_javac

While this command line results in the crash:

../rvm-trunk/dist/production_Opt_0_x86_64-linux/rvm -X:aos:enable_recompilation=false org.jikesrvm.tools.oth.OptTestHarness -oc:O0 -oc:verbose=true -oc:O0 -method org.mmtk.utility.Finalizer addCandidate - -er SpecApplication main - -s100 -m100 -M100 -a _213_javac

Both behaviors are 100% reproducible.

David Grove added a comment -

The crash still occurs even with -oc:inline=false.

David Grove added a comment -

Disabling local_copy_prop makes the crash go away. So either the bug is in this optimization pass or it is tickling some other downstream bug in the opt compiler.

That is, this command line results in a successful execution:

../rvm-trunk/dist/production_Opt_0_x86_64-linux/rvm -X:aos:enable_recompilation=false org.jikesrvm.tools.oth.OptTestHarness -oc:O1 -oc:verbose=true -oc:O0 -oc:inline=false -oc:local_copy_prop=false -oc:phases=true -method org.mmtk.utility.Finalizer addCandidate - -er SpecApplication main - -s100 -m100 -M100 -a _213_java

while this one results in a crash:

../rvm-trunk/dist/production_Opt_0_x86_64-linux/rvm -X:aos:enable_recompilation=false org.jikesrvm.tools.oth.OptTestHarness -oc:O1 -oc:verbose=true -oc:O0 -oc:inline=false -oc:phases=true -method org.mmtk.utility.Finalizer addCandidate - -er SpecApplication main - -s100 -m100 -M100 -a _213_java

Ian Rogers added a comment -

Looking at the inline reports, -oc:inline=false doesn't appear to work.

David Grove added a comment -

Hmm, -oc:inline=false works for me. Be sure to give it after the -oc:O0 argument (the command-line arguments are applied in order, and O0 enables inlining).
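
To illustrate why the order matters, here is a minimal sketch of in-order option processing. It is not the real Jikes RVM option parser: the class name, the map-based option store, and the "level" key are all hypothetical, and the only behavior taken from the comment above is that O0 turns inlining on and that arguments are applied left to right.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OrderedOptionsSketch {
    /** Applies -oc: arguments strictly left to right; later settings overwrite earlier ones. */
    static Map<String, String> apply(String... args) {
        Map<String, String> opts = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-oc:")) {
                continue; // not a compiler option in this toy model
            }
            String body = arg.substring("-oc:".length());
            if (body.equals("O0")) {
                // Hypothetical preset: per the comment above, selecting O0 enables inlining.
                opts.put("level", "0");
                opts.put("inline", "true");
            } else if (body.contains("=")) {
                int eq = body.indexOf('=');
                opts.put(body.substring(0, eq), body.substring(eq + 1));
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        // inline=false given after O0 survives; given before O0 it is overwritten.
        System.out.println(apply("-oc:O0", "-oc:inline=false").get("inline"));
        System.out.println(apply("-oc:inline=false", "-oc:O0").get("inline"));
    }
}
```

In this model, putting -oc:inline=false before -oc:O0 has no lasting effect, because the preset overwrites it.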

David Grove added a comment -

It will take a few more hours of testing to be positive, but it looks almost certain that the problem was that local copy propagation was propagating through the move instruction introduced by a toAddress() call on an ObjectReference.

I'm testing a fix to local copy prop to prevent it from GENing based on move instructions where one operand is a reference type and the other is not.
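
The shape of that guard can be sketched on a toy IR. This is an illustration only, not the Jikes RVM code or the actual fix: Reg, Instr, and propagate are hypothetical names, and the IR has just MOVE and USE. The assumption behind it is that the GC tracks reference-typed registers differently from raw addresses, so replacing one with the other is not a safe substitution even though the move copies the same bits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LocalCopyPropSketch {
    /** Toy virtual register; isRef says whether the GC would treat it as an object reference. */
    record Reg(String name, boolean isRef) {}

    /** Toy instruction: "dst = MOVE src", or "USE src" when dst is null. */
    record Instr(String op, Reg dst, Reg src) {
        @Override public String toString() {
            return dst == null ? op + " " + src.name()
                               : dst.name() + " = " + op + " " + src.name();
        }
    }

    /** Local copy propagation over one basic block; guard=true skips GEN on ref/non-ref moves. */
    static List<Instr> propagate(List<Instr> block, boolean guard) {
        Map<Reg, Reg> copies = new HashMap<>(); // available copies: dst -> src
        List<Instr> out = new ArrayList<>();
        for (Instr in : block) {
            // Rewrite the source operand if a copy of it is currently available.
            Reg src = copies.getOrDefault(in.src(), in.src());
            out.add(new Instr(in.op(), in.dst(), src));
            Reg dst = in.dst();
            if (dst == null) {
                continue; // a pure use defines nothing
            }
            // KILL: copies that mention the redefined register are no longer valid.
            copies.remove(dst);
            copies.values().removeIf(v -> v.equals(dst));
            // GEN: record the copy, unless the guard rejects a move that mixes
            // a reference-typed operand with a non-reference one.
            boolean mixesRefAndNonRef = dst.isRef() != src.isRef();
            boolean isCopy = in.op().equals("MOVE") && !dst.equals(src);
            if (isCopy && !(guard && mixesRefAndNonRef)) {
                copies.put(dst, src);
            }
        }
        return out;
    }

    /** Models "addr = ref.toAddress(); use(addr)" and returns the block after propagation. */
    static String demo(boolean guard) {
        Reg ref = new Reg("ref", true);
        Reg addr = new Reg("addr", false);
        List<Instr> block = List.of(
            new Instr("MOVE", addr, ref),
            new Instr("USE", null, addr));
        StringBuilder sb = new StringBuilder();
        for (Instr in : propagate(block, guard)) {
            if (sb.length() > 0) sb.append("; ");
            sb.append(in);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + demo(false));
        System.out.println("guarded:   " + demo(true));
    }
}
```

Without the guard, the use of addr is rewritten to use ref; with the guard, the use keeps reading addr, so the reference-typed and address-typed values stay distinct.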

David Grove added a comment -

This was in fact resolved by the local copy prop bug fix in r14688.

