RVM-328

Magic array stores are inefficient on IA32

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1000
    • Component/s: Compiler: Optimizing
    • Labels:
      None
    • Number of attachments :
      0

      Description

      The magic call to store a primitive at an offset is sub-optimal on IA32. For example, from MMTk's Log:

      VM.barriers.setArrayNoBarrier(buffer, bufferIndex++, c);

      becomes:

      9 int_shl t127i(I) = l124i(I), 1
      10 int_2addrze.ext t128a(Lorg/vmmagic/unboxed/Offset;) = t127i(I)
      14 short_store 10, l123a([C), t128a(Lorg/vmmagic/unboxed/Offset;), <unused>, <unused>

      which becomes:

      14 ia32_mov EAX([C) = <[EAX(Lorg/mmtk/utility/Log;)]+-4>DW (<mem loc: Lorg/mmtk/utility/Log;.buffer>, t450sv(GUARD))
      10 ia32_lea EDX(Lorg/vmmagic/unboxed/Offset;) = <0+[EDX(I)*2]>DW
      14 ia32_mov <[EAX([C)]+[EDX(Lorg/vmmagic/unboxed/Offset;)]>W = 10

      which could more optimally be:

      14 ia32_mov EAX([C) = <[EAX(Lorg/mmtk/utility/Log;)]+-4>DW (<mem loc: Lorg/mmtk/utility/Log;.buffer>, t450sv(GUARD))
      14 ia32_mov <[EAX([C)]+[EDX(Lorg/vmmagic/unboxed/Offset;)*2]>W = 10
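To make the difference concrete, here is a minimal sketch in plain Java that models the two address computations. It is only an arithmetic model, not compiler code: the class and method names are hypothetical, and the zero extension stands in for the int-to-address conversion in the LIR above. The first method mirrors the current code shape (a separate shift, emitted as the ia32_lea, followed by a base-plus-offset store); the second mirrors the proposed shape, where the scale is folded into the store's addressing mode.

```java
// Illustrative model only: plain Java arithmetic standing in for IA32 addressing modes.
// All names here are hypothetical and do not exist in Jikes RVM.
public class ScaledStoreSketch {
    // A Java char occupies 2 bytes, so the index is shifted left by 1.
    public static final int LOG_BYTES_IN_CHAR = 1;

    // Current shape: compute the byte offset first (the extra ia32_lea),
    // then address the element as [base + offset].
    public static long addressWithSeparateShift(long base, int index) {
        long offset = Integer.toUnsignedLong(index) << LOG_BYTES_IN_CHAR;
        return base + offset;
    }

    // Proposed shape: a single store using the scaled-index mode [base + index*2].
    public static long addressWithScaledIndex(long base, int index) {
        return base + Integer.toUnsignedLong(index) * 2;
    }

    public static void main(String[] args) {
        long base = 0x1000L;
        for (int i = 0; i < 8; i++) {
            if (addressWithSeparateShift(base, i) != addressWithScaledIndex(base, i)) {
                throw new AssertionError("address mismatch at index " + i);
            }
        }
        System.out.println("both forms address the same element");
    }
}
```

Both forms reach the same element, so folding the scale into the store only removes the separate instruction and the write to EDX.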

      I believe the easiest way to implement this would be to add magic array stores that can directly generate ALOADs in OPT_GenerateMagic.

        Activity

        David Grove added a comment -

        Makes sense.

        We intentionally decided to keep array loads as ALOADs in IA32 LIR for Java-level array loads. This avoids having to recognize the address calculation expression trees in instruction selection, and prevents the expression trees from being "optimized" into forms we couldn't put back together. Having a magic API that allows the same thing to happen for runtime code seems like the right way to handle this. Recognizing these as array loads would add baggage to the instruction selection and LIR optimizations, which we decided to avoid in the non-magic case.

        Need to wait to see what MMTk people think before proceeding with an implementation though.

        Ian Rogers added a comment -

        Hi, it'd be great to get some feedback from MMTk folks on this.

        Robin Garner added a comment -

        The only use of this call in MMTk is in the Log class, i.e., it's purely in debugging/diagnostic code, so performance is a non-issue.

        To me the cleanest approach would be a class- and method-level annotation that prevents insertion of barriers.
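As a rough sketch of what such an annotation could look like: the `@NoBarriers` name, the `LogLike` class, and the `barriersSuppressed` helper below are all hypothetical and are not part of MMTk or Jikes RVM. The helper only shows the class- and method-level lookup a compiler could perform before deciding whether to insert a barrier. With this in place, a plain array store in the annotated class would need no magic call at all.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical sketch: none of these names exist in MMTk or Jikes RVM.
public class NoBarrierSketch {
    // Assumed annotation, usable on a whole class or on a single method.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface NoBarriers {}

    // Stand-in for a logging class whose array stores should not get barriers.
    @NoBarriers
    public static class LogLike {
        private final char[] buffer = new char[16];
        private int bufferIndex = 0;

        // An ordinary array store; no magic call is needed here.
        public void add(char c) {
            buffer[bufferIndex++] = c;
        }

        public String contents() {
            return new String(buffer, 0, bufferIndex);
        }
    }

    // Returns true if the named method, or its declaring class, carries @NoBarriers.
    public static boolean barriersSuppressed(Class<?> cls, String methodName) {
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return m.isAnnotationPresent(NoBarriers.class)
                    || cls.isAnnotationPresent(NoBarriers.class);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LogLike log = new LogLike();
        log.add('o');
        log.add('k');
        System.out.println(log.contents() + " " + barriersSuppressed(LogLike.class, "add"));
    }
}
```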


          People

          • Assignee:
            Unassigned
            Reporter:
            Ian Rogers
          • Votes:
            0
            Watchers:
            0