BTM-44

TransactionLogRecord does not get updated when modified, leading to transaction log corruption

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.3
    • Fix Version/s: 1.3.3
    • Labels: None
    • Number of attachments: 0

      Description

      (This was originally reported in BTM-43, but I'm now convinced that it is a different bug. If possible, please move the comments from BTM-43 to this one; I'll just continue from where I left off in BTM-43.)

      I haven't written a test case yet, but I think I know what causes the corruption.

      In DiskJournal:355, if some (but not all) uniqueNames in danglingRecord are removed, the danglingRecord isn't updated correctly: neither the crc32 nor the record length is recalculated, so a corrupt log entry gets written.
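The failure mode described above can be illustrated with a minimal sketch. It is not BTM's code: `SketchLogRecord`, its fields, and the newline-delimited UTF-8 encoding are all hypothetical stand-ins, chosen only to show a record that recomputes its length and CRC32 whenever its set of unique names changes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.CRC32;

// Hypothetical stand-in for a journal record; NOT BTM's TransactionLogRecord.
final class SketchLogRecord {
    private final TreeSet<String> uniqueNames;
    private int recordLength; // derived from the encoded names
    private long crc32;       // derived from the encoded names

    SketchLogRecord(Set<String> names) {
        this.uniqueNames = new TreeSet<>(names);
        refresh();
    }

    // All removals go through the record, so the derived fields cannot go stale.
    void removeUniqueNames(Set<String> committed) {
        if (uniqueNames.removeAll(committed)) {
            refresh();
        }
    }

    boolean isEmpty() { return uniqueNames.isEmpty(); }
    int getRecordLength() { return recordLength; }
    long getCrc32() { return crc32; }

    // True when the stored length and CRC match the current set of names.
    boolean isConsistent() {
        byte[] payload = encodeNames();
        return recordLength == payload.length && crc32 == checksum(payload);
    }

    private void refresh() {
        byte[] payload = encodeNames();
        recordLength = payload.length;
        crc32 = checksum(payload);
    }

    // Assumed encoding for this sketch: each name as UTF-8, followed by '\n'.
    private byte[] encodeNames() {
        StringBuilder sb = new StringBuilder();
        for (String name : uniqueNames) {
            sb.append(name).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    private static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue();
    }

    public static void main(String[] args) {
        SketchLogRecord record =
                new SketchLogRecord(new HashSet<>(Arrays.asList("db1", "db2", "jms")));
        record.removeUniqueNames(Collections.singleton("db2"));
        System.out.println("consistent after partial removal: " + record.isConsistent());
    }
}
```

If the names were instead removed directly from a set exposed by the record, without a call like `refresh()`, the stored length and checksum would still describe the old contents, which matches the corruption reported here.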

      (BTW: is there a reason .size() == 0 is used instead of isEmpty()?)

          Activity

          Ludovic Orban added a comment -

          This can definitely explain the disk corruption, but I'm puzzled why you get partial commit results in the journal, i.e. why this piece of code gets triggered at all: it should only happen when one of the resources did not commit and has to be picked up by the recoverer later on. Can you confirm that you run into this use case?

          BTW, there was no particular reason for the size() == 0 check, so I've replaced it with an isEmpty() check instead.

          I've nevertheless fixed this problem, committed the changes in the trunk and released this build: http://snapshots.repository.codehaus.org/org/codehaus/btm/btm/1.3.3-20090321/btm-1.3.3-20090321.jar

          Ludovic Orban added a comment -

          A code review also showed that enabling async 2PC might cause the 2PC engine to run into race conditions that might make it forget to log some resource in the journal. This seems to be the actual root cause of this bug.
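The actual race in the 2PC engine is not shown in this issue, so the following is only a sketch of one plausible shape of the problem: several worker threads each record the name of the resource they committed into a shared collection. `AsyncCommitSketch`, `commitAll`, and the resource names are hypothetical. The sketch uses a concurrent set so that no name is lost; with a plain `HashSet` in its place, concurrent `add` calls could silently drop entries.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; NOT BTM's 2PC engine.
final class AsyncCommitSketch {

    // Simulates one asynchronous commit per resource and returns the names
    // that would afterwards be written to the journal.
    static Set<String> commitAll(int resourceCount) {
        // Thread-safe set: concurrent add() calls cannot lose entries.
        Set<String> committedNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < resourceCount; i++) {
            final String uniqueName = "resource-" + i;
            pool.execute(() -> committedNames.add(uniqueName));
        }
        pool.shutdown();
        try {
            // Wait for every commit task before reading the result.
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("commit tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for commits", e);
        }
        return committedNames;
    }

    public static void main(String[] args) {
        System.out.println("names recorded: " + commitAll(1000).size());
    }
}
```

Waiting for the pool to terminate before reading the set matters as much as the choice of collection: reading it early would also produce a journal entry that is missing resources.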

          Here is another snapshot build with latest fixes: http://snapshots.repository.codehaus.org/org/codehaus/btm/btm/1.3.3-20090322/btm-1.3.3-20090322.jar

          Ludovic Orban added a comment -

          At least parts of this bug are related to the changes made for BTM-39.

          Ludovic Orban added a comment -

          The user reported that this bug seems to be fixed. This will need to be verified before the issue can be considered resolved.

          Ludovic Orban added a comment -

          Reported as fixed by the user; test added.


            People

            • Assignee: Ludovic Orban
            • Reporter: Dennis Brakhane
            • Votes: 1
            • Watchers: 1
