BTM-24

Recovery engine does not fully support clustering

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.3.1
    • Labels: None
    • Patch Submitted: Yes
    • Number of attachments: 1

      Description

      The BTM datasources are completely cluster-safe. Unfortunately, the recovery engine is not: after a crash, the cluster node that crashed must be the one that performs recovery.

      This is not guaranteed for now, so you have to be very careful: if a node crashes, it must be restarted before any other node is restarted. Otherwise, you might get heuristic errors during recovery!

        Activity

        Ludovic Orban added a comment - edited

        See: http://www.nabble.com/BTM-and-clustering-to18752764.html

        This can be done elegantly and simply by having the recovery engine filter out recovered transactions whose XID ServerId part does not match the current TM's ServerId.

        Each recovery engine would then recover only its own transactions. This is mandatory because the global TX outcome is stored in the TX logs of the node that started the transaction.

        An extra check in RecoveryHelper.recover(XAResourceHolderState resourceHolderState, Set alreadyRecoveredXids, int flags) to filter out XIDs should do the trick. A configuration parameter to enable or disable this filtering should be added as well.
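
The filtering idea can be illustrated with a minimal, self-contained sketch. It is not the attached patch and does not use BTM classes: `ServerIdXidFilter`, `startedBy`, and `ownedBy` are hypothetical names, and the sketch assumes that the global transaction id starts with the bytes of the originating node's server id, which this issue does not confirm.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of BTM: keeps only the in-doubt
// transactions that were started by the current node.
final class ServerIdXidFilter {

    // Assumption: the global transaction id begins with the server id bytes.
    // A plain prefix match only works if ids are fixed-length or delimited,
    // otherwise "node-1" would also match XIDs created by "node-10".
    static boolean startedBy(byte[] gtrid, byte[] serverId) {
        if (gtrid.length < serverId.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(gtrid, serverId.length), serverId);
    }

    // Returns the recovered global transaction ids owned by the given server id.
    static List<byte[]> ownedBy(List<byte[]> recoveredGtrids, String serverId) {
        byte[] id = serverId.getBytes(StandardCharsets.US_ASCII);
        List<byte[]> owned = new ArrayList<>();
        for (byte[] gtrid : recoveredGtrids) {
            if (startedBy(gtrid, id)) {
                owned.add(gtrid);
            }
        }
        return owned;
    }
}
```

With such a filter in place, XIDs left behind by a crashed node are ignored by the other nodes and stay in doubt until the node that holds their outcome in its TX logs comes back and recovers them.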

        Ludovic Orban added a comment -

        The attached patch adds basic support for clustering. It should work, but it is missing the new configuration parameter.

        Ludovic Orban added a comment -

        Implemented in trunk, but without an extra configuration flag. The new behavior is sane, as the serverId is meant to differentiate servers.

        Ludovic Orban added a comment -

        On second thought, a new currentNodeOnlyRecovery property has been added to the configuration class. The main reason is to keep the current behavior by default.
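
How such an opt-in flag could gate the filtering is sketched below. This is an illustration only, not BTM's configuration class: `RecoveryConfig`, `shouldRecover`, and the property key string are hypothetical, and the only facts taken from this issue are the property name currentNodeOnlyRecovery and that the previous behavior stays the default.

```java
import java.util.Properties;

// Hypothetical stand-in for the configuration class mentioned above,
// not BTM's actual code.
final class RecoveryConfig {

    // Hypothetical property key; the real key is not given in this issue.
    static final String KEY = "currentNodeOnlyRecovery";

    private final boolean currentNodeOnlyRecovery;

    RecoveryConfig(Properties props) {
        // Defaults to false so that existing setups keep recovering every XID.
        this.currentNodeOnlyRecovery =
                Boolean.parseBoolean(props.getProperty(KEY, "false"));
    }

    boolean isCurrentNodeOnlyRecovery() {
        return currentNodeOnlyRecovery;
    }

    // Decides whether this node should handle a recovered XID, given
    // whether the XID carries this node's server id.
    boolean shouldRecover(boolean xidStartedByThisNode) {
        return !currentNodeOnlyRecovery || xidStartedByThisNode;
    }
}
```

Keeping the flag off by default means single-node installations see no change, while clustered deployments can turn it on for every node.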


          People

          • Assignee: Ludovic Orban
          • Reporter: Arnaud Cogoluegnes