RVM-884

MMTk stats can be hard for humans to read

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.4
    • Component/s: MMTk
    • Labels: None
    • Patch Submitted: Yes
    • Number of attachments: 2

      Description

      MMTk attempts to print its statistics in a single row like so:

      {{
      ============================ MMTk Statistics Totals ============================
      GC time.mu time.gc perf.mu perf.gc refType scan finalize prepare precopy stacks root forward release init finish L1I_MISSES.mu L1I_MISSES.gc
      7 6751.76 5881.21 0 0 99.35 5038.20 22.58 0.28 0.92 4.99 713.75 0.00 0.65 0.12 0.14 2330019756995 1443109010784
      Total time: 12632.98 ms
      ------------------------------ End MMTk Statistics -----------------------------
      }}

      Hopefully the problems with this approach are clear from a human reader's perspective: i) headers do not always line up with their values, and ii) with many counters the output becomes wider than the terminal, which makes it even harder to read.
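For illustration only: a script that consumes the single-row format has to pair the header row with the value row by position. The sketch below assumes the columns are whitespace-separated and that no statistic name contains a space; `parse_row_stats` is a hypothetical helper, not part of MMTk or of the attached patch.

```python
def parse_row_stats(header_line, value_line):
    """Pair each header name with the value in the same column position.

    Assumes whitespace-separated columns and names without embedded spaces.
    Values are kept as strings so no precision is lost.
    """
    names = header_line.split()
    values = value_line.split()
    if len(names) != len(values):
        raise ValueError(
            "header has %d columns but value row has %d" % (len(names), len(values))
        )
    return dict(zip(names, values))
```

The column-count check is the only guard available in this format: if a name or value is ever dropped, every later column silently shifts.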

      Attached is a trivial patch that instead prints one statistic per line, like so:

      {{
      ============================ MMTk Statistics Totals ============================
      GC: 7
      time.mu: 6872.53
      time.gc: 5854.95
      perf.mu: 0
      perf.gc: 0
      refType: 101.65
      scan: 5008.27
      finalize: 22.64
      prepare: 0.30
      precopy: 0.89
      stacks: 4.79
      root: 715.20
      forward: 0.00
      release: 0.66
      init: 0.16
      finish: 0.18
      L1I_MISSES.mu: 1569810545957(SCALED)
      L1I_MISSES.gc: 951335255621(SCALED)
      Total time: 12727.48 ms
      ------------------------------ End MMTk Statistics -----------------------------
      }}

      As a human, I certainly prefer the second output; however, I have no idea how many scripts this change would break. If consensus can be reached that more human readability is desired, then perhaps this patch can be applied.

      This work is motivated by another patch I am about to submit that increases the number of statistics MMTk reports (and thus worsens the problem of wide output).
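As a rough idea of what an existing script would need in order to read the one-statistic-per-line output, here is a minimal sketch. It assumes each statistic appears as `name: value`, that a trailing `(SCALED)` marks a scaled counter as in the sample above, and that banner lines contain no colon; `parse_line_stats` is a hypothetical name, not something the patch provides.

```python
def parse_line_stats(text):
    """Parse 'name: value' lines into {name: (value, scaled)}.

    Banner lines (no colon) and blank lines are skipped. Values stay as
    strings, so 'Total time' keeps its ' ms' unit suffix.
    """
    marker = "(SCALED)"
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        name, _, value = line.partition(":")
        value = value.strip()
        scaled = value.endswith(marker)
        if scaled:
            value = value[: -len(marker)].strip()
        stats[name.strip()] = (value, scaled)
    return stats
```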

      Kind regards
      Laurence

      1. makeMMTkStatsHumanReadable.patch
        3 kB
        Laurence Hellyer
      2. statementOfContribution.txt
        0.5 kB
        Laurence Hellyer

        Activity

        Steve Blackburn added a comment -

        Hi Laurence,

        Yeah, I understand how the perf counters work.

        I'm just talking about the formatting. We currently have a descriptive field. I was suggesting that you use that descriptive field to convey this information rather than introduce a new field to the format.

        --Steve

        Richard Jones added a comment -

        I agree that machine-readability is the most important factor here. But I'd hope that we can come up with some format that is also easy for people to read. I think Laurence has a good point about the current stats being hard to read sometimes. Maybe they are also slightly tricky for a program to read (e.g. it must read the heading row and then use that to interpret subsequent rows). But as Steve says, cut works as well.

        The problem with Andreas's solution is that it overloads tabs, using them to separate both records and fields. I'd argue for a format that distinguishes these, e.g. by using =s, commas and tabs.

        Steve asked why we should introduce a new field (e.g. "SCALED") rather than place it in the name of the value. I'd argue against folding it into the name, as SCALED is just another attribute, like the value.

        For example (putting spaces around \t here just to make it a little easier to read),

        GC=7 \t time.mu=6872.53 \t time.gc=5854.95 \t L1I_MISSES.gc=951335255621,SCALED \t Total time=12727.48

        This format is easily parsable, e.g. with 3 lines of perl. It's also amenable to cut (assuming that the order is fixed). It's pretty human readable (although output may flow over several lines on the screen), and names and other attributes are tightly tied together.

        Richard
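To make the proposal in the comment above concrete, the following is a minimal sketch of a reader for it. It assumes records are separated by tabs, each record is `name=value` optionally followed by comma-separated attributes such as `SCALED`, and names contain no `=`; `parse_record_line` is a hypothetical helper and not code from this issue.

```python
def parse_record_line(line):
    """Parse tab-separated 'name=value[,attr...]' records.

    Returns {name: (value, [attrs])}, preserving record order.
    Whitespace around each record is tolerated and stripped.
    """
    stats = {}
    for record in line.split("\t"):
        record = record.strip()
        if not record:
            continue
        name, sep, rest = record.partition("=")
        if not sep:
            raise ValueError("record without '=': %r" % record)
        value, *attrs = rest.split(",")
        stats[name] = (value, attrs)
    return stats
```

Because each value travels with its own name, a reader like this does not depend on column order, unlike a header-row format.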

        David Grove added a comment -

        bulk defer open issues to 3.1.2

        David Grove added a comment -

        Bulk defer to 3.1.3; not essential to address for 3.1.2.

        David Grove added a comment -

        bulk defer issues to 3.1.4


          People

          • Assignee: Unassigned
          • Reporter: Laurence Hellyer
          • Votes: 0
          • Watchers: 1
