BTM-105

SchedulerNaturalOrderIterator causes infinite loop

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.1
    • Labels: None
    • Number of attachments: 0

      Description

      Applications using BTM get stuck quite quickly when run on Sun JVM 1.6.0_24. This does not happen on JVM 1.6.0_18.

      Eventually, all threads get stuck in the method XAResourceManager.collectUniqueNames(). They appear to be running in an infinite loop, so I suspect SchedulerNaturalOrderIterator is the culprit. Here is a stack dump of a stuck thread:

      "Thread-12" prio=10 tid=0x9f049c00 nid=0xdf6 runnable [0x9dd01000]
      java.lang.Thread.State: RUNNABLE
      at java.util.HashMap.put(HashMap.java:372)
      at java.util.HashSet.add(HashSet.java:200)
      at bitronix.tm.internal.XAResourceManager.collectUniqueNames(XAResourceManager.java:272)
      at bitronix.tm.BitronixTransaction.setStatus(BitronixTransaction.java:323)
      at bitronix.tm.twopc.Preparer.prepare(Preparer.java:64)
      at bitronix.tm.BitronixTransaction.commit(BitronixTransaction.java:225)
      at bitronix.tm.BitronixTransactionManager.commit(BitronixTransactionManager.java:120)

      Some extra observations:

      1) It happens only in the thread running the most complicated transaction. In my case that means two datasources (JMS [SwiftMQ], DB [Oracle]). The JMS datasource is enlisted 4 times because 4 different JMS queues participate in the transaction (i.e. TMJOIN is used).
      2) It can be reproduced with a single thread, so concurrency is not the problem.
      3) It happens when XAResourceManager.enlist() or XAResourceManager.delist() is called.
      4) It fails quite fast: 3000 transactions should be enough to trigger it.

        Activity

        Ludovic Orban added a comment -

        This is now fixed, and two impacted users have confirmed the fix.

        Apparently this bug is not, strictly speaking, a race condition, but it has the same symptoms. The root cause was a lack of synchronization in the iterator of a collection shared by different threads under very special conditions (i.e. SchedulerNaturalOrderIterator and SchedulerReverseOrderIterator need synchronized blocks on Scheduler.this).

        Only some aggressive memory optimizations in the JVM could make this bug surface, which is probably what happened here because:

        • the problem only shows up with the very latest JVM version (1.6.0_24); older ones are unaffected
        • experimentation showed that the problem occurs only when, and as soon as, XAResourceManager.collectUniqueNames gets JIT'ed

        See: http://old.nabble.com/SchedulerNaturalOrderIterator-causes-infinite-loop-to31136251.html
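
        To illustrate the kind of fix described above, here is a minimal sketch of an inner iterator whose methods synchronize on the enclosing instance. SketchScheduler, its add() and remove() methods and its TreeMap-based storage are hypothetical stand-ins invented for this example; they are not BTM's actual Scheduler code, and the real fix may differ in detail.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for a position-ordered scheduler.
// This is NOT BTM's real Scheduler class.
class SketchScheduler<T> {
    // position -> objects registered at that position, in natural key order
    private final TreeMap<Integer, List<T>> objects = new TreeMap<>();

    // Mutations take the same lock the iterator uses.
    public synchronized void add(T obj, int position) {
        objects.computeIfAbsent(position, k -> new ArrayList<>()).add(obj);
    }

    public synchronized void remove(T obj) {
        Iterator<List<T>> it = objects.values().iterator();
        while (it.hasNext()) {
            List<T> list = it.next();
            list.remove(obj);
            if (list.isEmpty()) {
                it.remove();
            }
        }
    }

    public Iterator<T> iterator() {
        return new NaturalOrderIterator();
    }

    private final class NaturalOrderIterator implements Iterator<T> {
        private Integer key; // position being walked, null before the first call
        private int index;   // index of the next element at that position

        @Override
        public boolean hasNext() {
            // Lock the outer instance so reads see a consistent map state.
            synchronized (SketchScheduler.this) {
                return locate() != null;
            }
        }

        @Override
        public T next() {
            synchronized (SketchScheduler.this) {
                List<T> list = locate();
                if (list == null) {
                    throw new NoSuchElementException();
                }
                return list.get(index++);
            }
        }

        // Caller must hold the lock on SketchScheduler.this.
        // Moves key/index to the next available element, or returns null.
        private List<T> locate() {
            if (key == null) {
                if (objects.isEmpty()) {
                    return null;
                }
                key = objects.firstKey();
                index = 0;
            }
            while (true) {
                List<T> list = objects.get(key);
                if (list != null && index < list.size()) {
                    return list;
                }
                Integer higher = objects.higherKey(key);
                if (higher == null) {
                    return null;
                }
                key = higher;
                index = 0;
            }
        }
    }
}
```

        Because every read in hasNext() and next() happens under the same monitor as add() and remove(), each thread sees a consistent view of the shared map. This sketch locks per call rather than for the whole iteration, so elements added or removed mid-iteration may or may not be visited.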


          People

          • Assignee:
            Ludovic Orban
            Reporter:
            Ludovic Orban
          • Votes:
            0
            Watchers:
            0
