Details

    • Type: Improvement Improvement
    • Status: Closed Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3.3
    • Fix Version/s: 2.0.0
    • Labels:
      None
    • Environment:
      Mac OS X 10.6.2, Java 6
    • Patch Submitted:
      Yes
    • Number of attachments :
      3

      Description

      During some recent profiling of our application, the profiler (jprofiler) put several bitronix functions at the top of it's "hotspot" list. Not that these functions were particularly inefficient, but because they are called millions of times they end up consuming more resources than you might imagine.

      A couple of tweaks to these (two) classes resulted in them disappearing completely from the "hotspot" list. The two classes in question are Uid.java and Encoder.java. I have attached them here.

      Just a few comments on the (overall minor) changes:

      Encoding.java
      Basically unrolled some (most) of the loops. Old trick, very effective. Hotspot compiler doesn't seem smart enough to unroll these itself. Probably 5-10x faster, no loop comparator or incrementer involved. Either way, they disappeared from the jprofiler radar as these tweaks.

      Uid.java
      This class is used A LOT, therefore I gave it a workover. Notes:

      equals() - removed the use of "instanceof", it's slow. In 99% (or 100%) of the cases you are comparing apples-to-apples, better to just catch the ClassCastException if it happens and return false.

      arrayHashCode() - the hash algorithm was not good. It used a loop that basically did: total += uid[i]. This means that for an 8-byte UID (theoretically a 64-bit number with a range of trillions) the maximum achievable hash was 8*256 = 2048. Granted ultimately the hash is an int, which is 32-bit, but still the range is +/-2 billion. The new code implements a "classic" oldie but goodie hash. Simple, reasonably dispersed over the range of int (few hash collisions), and fast.

      arrayToString() - since arrayToString() and arrayHashCode() are both called in the Uid constructor, their performance is important. The new arrayToString() avoids StringBuilder.append() entirely and employs a lookup table of only 16 values. By using a char array and avoiding method invocation overhead, this method is many MANY times faster than the previous.

      1. Encoder.java
        2 kB
        Brett Wooldridge
      2. EqualityTest.java
        3 kB
        Brett Wooldridge
      3. Uid.java
        3 kB
        Brett Wooldridge

        Activity

        Hide
        Ludovic Orban added a comment -

        I've carefully reviewed your changes and committed them on the trunk, thanks!

        I did not expect those two classes to cause any kind of performance issue. As you noticed the Uid class already was reasonably optimized, you 'just' pushed the optimizations to an extreme level.

        I'm quite happy with this change as it helps you and cannot harm anyone. I'm just (positively) surprised that you got performance issues on this code and not anywhere else, I guess this proves I did a fair job at keeping the overhead low which sounds reassuring. I also wonder what kind of load you have to run into that kind of performance problem.

        One final thing: I wonder if starting the equals() method with:

        if (!Uid.class.equals(obj.getClass()))
        return false;

        instead of the try-catch ClassCastException wouldn't be even faster. Any opinion on this?

        Thanks a lot for the report and the patch!

        Show
        Ludovic Orban added a comment - I've carefully reviewed your changes and committed them on the trunk, thanks! I did not expect those two classes to cause any kind of performance issue. As you noticed the Uid class already was reasonably optimized, you 'just' pushed the optimizations to an extreme level. I'm quite happy with this change as it helps you and cannot harm anyone. I'm just (positively) surprised that you got performance issues on this code and not anywhere else, I guess this proves I did a fair job at keeping the overhead low which sounds reassuring. I also wonder what kind of load you have to run into that kind of performance problem. One final thing: I wonder if starting the equals() method with: if (!Uid.class.equals(obj.getClass())) return false; instead of the try-catch ClassCastException wouldn't be even faster. Any opinion on this? Thanks a lot for the report and the patch!
        Hide
        Brett Wooldridge added a comment -

        Ludovic, your question prompted me to write a JUnit test (attached). It certainly used to be true that instanceof was slow, but it appears that Sun has remedied that situation. So I stand corrected, I can't dispute the test results. Here is the output of the test:

        Same type using instanceof: 2449000 nanos
        Different type using instanceof: 2785000 nanos
        Same type using class comparison: 2877000 nanos
        Different type using class comparison: 7894000 nanos
        Same type using cast check: 2934000 nanos
        Different type using cast check: 77363000 nanos

        Note, you get dramatically different results between the client and server VM. I ran these test with the server VM as that is the most likely environment in which BTM would be used. The most important numbers are the "Same type" comparisons. It appears that instanceof is indeed fastest.

        So, please revert my change to the equals() method in the Uid class.

        Show
        Brett Wooldridge added a comment - Ludovic, your question prompted me to write a JUnit test (attached). It certainly used to be true that instanceof was slow, but it appears that Sun has remedied that situation. So I stand corrected, I can't dispute the test results. Here is the output of the test: Same type using instanceof: 2449000 nanos Different type using instanceof: 2785000 nanos Same type using class comparison: 2877000 nanos Different type using class comparison: 7894000 nanos Same type using cast check: 2934000 nanos Different type using cast check: 77363000 nanos Note, you get dramatically different results between the client and server VM. I ran these test with the server VM as that is the most likely environment in which BTM would be used. The most important numbers are the "Same type" comparisons. It appears that instanceof is indeed fastest. So, please revert my change to the equals() method in the Uid class.
        Hide
        Brett Wooldridge added a comment -

        Performance test of varying equals() method implementations.

        Show
        Brett Wooldridge added a comment - Performance test of varying equals() method implementations.
        Hide
        Ludovic Orban added a comment -

        Ok, I reverted the equals() implementation in the trunk.

        The Uid and Encoder are now much more efficient than they were, thanks again!

        Show
        Ludovic Orban added a comment - Ok, I reverted the equals() implementation in the trunk. The Uid and Encoder are now much more efficient than they were, thanks again!

          People

          • Assignee:
            Ludovic Orban
            Reporter:
            Brett Wooldridge
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: