BTM-67

Transaction interleaving support broken

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.3
    • Fix Version/s: 2.0.0
    • Labels:
      None
    • Environment:
      Only Informix can be affected AFAIK
    • Number of attachments:
      1

      Description

      When properly configured, Informix does support TX interleaving, meaning 'deferConnectionRelease' can be disabled. Unfortunately, the XA pool's handling of TX interleaving was broken between releases 1.3.2 and 1.3.3.

      This needs to be fixed for the next release.

          Activity

          Ludovic Orban added a comment -

          The core of AbstractXAResourceHolder needs a fairly serious change. A design mistake made when this class was implemented caused BTM-33. The fix committed for it works but is awkward (it just patches around the design mistake), and it caused this bug.

          The attached patch is a design cleanup of XAResourceHolder, its abstract implementation (AbstractXAResourceHolder) and all impacted code.

          The basic idea is that since an XAResourceHolder can be involved in multiple transactions in parallel (either through TX interleaving or through TX suspension), it needs to keep track of each transaction as soon as it gets enlisted in it. The current code keeps a list of all running transactions, with the most recently reported one given special 'privileges' as the current one. This works fine in a sequential TX handling mode (when only a single TX can be active at any time) but fails when multiple TXs have to run in parallel. The new code will simply use a map of transaction states keyed by the transactions' GTRIDs: simple, elegant, robust, and it helps prune some of the TM's most complex code.
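
          A minimal sketch of the "map of transaction states keyed by GTRID" idea follows. It is not the attached patch: the class names TxState and ResourceHolderSketch, their fields and their methods are all hypothetical, and the GTRID is assumed to be available as a String (for example a hex encoding of the Xid's global transaction id), since a raw byte[] compares by identity and would not work as a map key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-transaction state; not BTM's actual class.
final class TxState {
    boolean started;
    boolean suspended;
    boolean ended;
}

// Hypothetical holder that tracks every transaction it is enlisted in.
// No transaction is "the current one": each GTRID has its own state.
final class ResourceHolderSketch {
    private final Map<String, TxState> statesByGtrid = new LinkedHashMap<>();

    // Called when the resource is enlisted in (or resumed for) a transaction.
    TxState enlist(String gtrid) {
        TxState state = statesByGtrid.computeIfAbsent(gtrid, k -> new TxState());
        state.started = true;
        state.suspended = false;
        return state;
    }

    // Suspending one transaction leaves the state of the others untouched.
    void suspend(String gtrid) {
        require(gtrid).suspended = true;
    }

    void delist(String gtrid) {
        require(gtrid).ended = true;
    }

    // Drop the state once the transaction has completed.
    void forget(String gtrid) {
        statesByGtrid.remove(gtrid);
    }

    TxState stateOf(String gtrid) {
        return statesByGtrid.get(gtrid);
    }

    int trackedCount() {
        return statesByGtrid.size();
    }

    private TxState require(String gtrid) {
        TxState state = statesByGtrid.get(gtrid);
        if (state == null) {
            throw new IllegalStateException("not enlisted in " + gtrid);
        }
        return state;
    }
}
```

          With this layout, two interleaved transactions on the same holder (say "tx-a" and "tx-b") can be started, suspended and ended independently, which is the case the list-plus-current-transaction design could not express.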

          Ludovic Orban added a comment -

          Part of the fix has been committed to the trunk. It is mostly a refactoring that enables the proper fix to be written. Unfortunately, there are two pieces of bad news:

          1) the suspend/resume support is now broken in the trunk

          2) while comparing the 1.3.3 logs with the trunk logs, I just uncovered yet another bug in 1.3.3: when a TX is suspended and contains an enlisted resource which does not support TMJOIN (this can be emulated by setting useTmJoin to false), the TM starts a new branch on the resource, ending up with two branches for the same global TX on the same resource. When this happens, the 2PC engine should send two prepare/commit messages to the resource (one for each branch), but only the last branch gets included in the 2PC cycle, which leaves pending work on the resource.
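
          The second problem can be illustrated with a small sketch, assuming a hypothetical BranchRegistrySketch that records every branch started on one resource per global TX. The class, its methods and the String identifiers are illustrative only and are not BTM's API. The point is that the 2PC cycle has to cover the whole list of branches, not only the most recently started one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry of the branches started on a single resource.
final class BranchRegistrySketch {
    // GTRID -> branch qualifiers, in the order the branches were started.
    private final Map<String, List<String>> branchesByGtrid = new LinkedHashMap<>();

    // Called each time the TM starts a new branch instead of joining an
    // existing one (the "resource does not support TMJOIN" case).
    void startBranch(String gtrid, String branchQualifier) {
        branchesByGtrid
                .computeIfAbsent(gtrid, k -> new ArrayList<>())
                .add(branchQualifier);
    }

    // Every branch of the global TX must receive prepare/commit.
    // Returning only the last element would reproduce the reported bug.
    List<String> branchesToPrepare(String gtrid) {
        return new ArrayList<>(
                branchesByGtrid.getOrDefault(gtrid, Collections.emptyList()));
    }
}
```

          After two calls to startBranch for the same GTRID, branchesToPrepare returns both branch qualifiers, so neither branch is left with pending work on the resource.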

          Ludovic Orban added a comment -

          Fixed in trunk.


            People

            • Assignee:
              Ludovic Orban
              Reporter:
              Ludovic Orban
            • Votes:
              0
              Watchers:
              0
