uDIG
  1. uDIG
  2. UDIG-1727

"Full Extent" deadlocks in Tile.disposeSWTImage() due to concurrent locks on org.eclipse.swt.internal.gtk.OS

    Details

    • Type: Bug Bug
    • Status: Open Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: UDIG 1.2.0
    • Fix Version/s: None
    • Component/s: application
    • Labels:
      None
    • Environment:
      eclipse=3.6
      java.version=1.6.0_20
      java.vendor=Sun Microsystems Inc.
      BootLoader constants: OS=linux, ARCH=x86_64, WS=gtk, NL=nb_NO
      Framework arguments: -product net.refractions.udig.product
    • Patch attached:
      Yes

      Description

      Do the following to reproduce deadlock:

      1. Run udig.product
      2. Add any layer to a map
      3. Ensure that the preference "Use Tiled Rendering System" is checked
      4. Press "Full Extent" command rapidly until uDig freezes

      The root cause of this bug is the same as the UDIG-1726 issue; the tiled rendering system does not always honor the SWT multi-threading policy. Tile.getSWTImage() and Tile.disposeSWTImage() methods in the net.refractions.udig.project.render.Tile class access the SWT subsystem by invoking methods on the SWT image, that in turn access methods in the org.eclipse.swt.internal.gtk.OS class. If the (main) user-interface thread also concurrently access a method in the OS class that requires a lock, a deadlock will occur once a synchronized statement is invoked in the Tile class. Yikes! This is hard to explain concisely. Just reproduce the bug and see for your self by inspecting the stack trace in debug mode. You will see that the threads that are deadlocked both have method(s) that locks in the OS class in the call stack. One of the two threads should be the user-interface (main) thread.

      I believe that the root cause described above has been confirmed by applying the patch described next. This patch ensures that the SWT image instance in the Tile instance is only accessed asynchronously from the user-interface thread by wrapping access with the Display.asyncExec() method. However, this required that a Display instance is readily available, which it is not. Since the Tile instance does not have any information about the user-interface thread, the following "hack" is used to get it:

      iterate over all threads until Display.findDisplay(Thread) returns a non-null reference to a Display instance

      This method has some shortcomings though. I believe that the tiled rendering system should always honor the SWT multi-threading policy from the top down, by ensuring that all calls that might result in invoking methods in the OS class that requires a lock, as the Title.getSWTImage() and Title.disposeSWTImage() methods do, is run on the user-interface thread using the Display.asyncExec() method. The hack I have used, is based on the following assumptions: 1. only one user-interface thread exists the thread-group of the invoking thread (reasonable I think) 2. if found, it is assumed that the thread is alive as long as the application is running, thus once found, the instance is returned directly thereafter 3. if not found, it is safe to invoke Title.getSWTImage() and Title.disposeSWTImage() methods from the calling thread I hope that somebody can confirm that this issue exist on other platforms than Linux, and that my hack actually solves the problem.

      The hack above is attached as a patch.

        Activity

        Hide
        Kenneth Gulbrands°y added a comment -
        Digging further into the root cause of this bug, it seems like the error only occurs on linux (GTK) platforms. I've tried to reproduce the error on win32, with no luck (all Win32 users rejoice!). Looking into the SWT code, further evidence for GTK being the only platform that this bug occurs is found. I found that the locking behavior of the org.eclipse.swt.internal.gtk.OS class differs from the one for win32. I did not find any evidence of locks in the org.eclipse.swt.internal.win32.OS class, the same applies to the carbon and cocoa OS classes, which underpins the assumption that this bug only exists on GTK systems.

        Note that the patch is not tested on the win32, carbon or cocoa platforms as I currently only have time to do uDig development on the GTK platform. Before the patch is committed to branches and trunk, members in the uDig team currently developing on win32 and cocoa should confirm that the bug does not exist, and that the patch does not introduce unwanted behavior on these platforms. I have tested the patch on linux and it works without any problems. I'll run through the project test procedures now to confirm that the patch is OK on the GTK platform.
        Show
        Kenneth Gulbrands°y added a comment - Digging further into the root cause of this bug, it seems like the error only occurs on linux (GTK) platforms. I've tried to reproduce the error on win32, with no luck (all Win32 users rejoice!). Looking into the SWT code, further evidence for GTK being the only platform that this bug occurs is found. I found that the locking behavior of the org.eclipse.swt.internal.gtk.OS class differs from the one for win32. I did not find any evidence of locks in the org.eclipse.swt.internal.win32.OS class, the same applies to the carbon and cocoa OS classes, which underpins the assumption that this bug only exists on GTK systems. Note that the patch is not tested on the win32, carbon or cocoa platforms as I currently only have time to do uDig development on the GTK platform. Before the patch is committed to branches and trunk, members in the uDig team currently developing on win32 and cocoa should confirm that the bug does not exist, and that the patch does not introduce unwanted behavior on these platforms. I have tested the patch on linux and it works without any problems. I'll run through the project test procedures now to confirm that the patch is OK on the GTK platform.

          People

          • Assignee:
            Unassigned
            Reporter:
            Kenneth Gulbrands°y
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated: