Details

    • Type: Bug Bug
    • Status: Resolved Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 6.1.25
    • Fix Version/s: 6.1.26
    • Component/s: Servlet
    • Labels:
      None
    • Environment:
      Windows Server 2008 x64, Java HotSpot(TM) 64-Bit Server VM (14.0-b16, mixed mode)
    • Number of attachments :
      0

      Description

      I have encountered a deadlock with connection timeouts. It appears two locks are being acquired by the two threads in opposite order, which leads to a classic deadlock.

      This deadlock is intermittent, and I am unable to make a consistent test case. It looks like the SelectChannelConnector$ConnectorEndPoint.blockWritable() method is attempting to scheduleIdle a TimeoutTask the same time the Acceptor1 thread is expiring that task (called from Timeout.tick()).

      I've attached the thread dumps below.

      Regards,
      Stephen

      Thread Dumps:

      "831453930@qtp-211225614-8 - Acceptor1 SelectChannelConnector@0.0.0.0:8080" - Thread t@25
      java.lang.Thread.State: BLOCKED on org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint@5f35a7a1 owned by: 1835465145@qtp-211225614-7
      at org.mortbay.io.nio.SelectChannelEndPoint.updateKey(SelectChannelEndPoint.java:322)
      at org.mortbay.io.nio.SelectChannelEndPoint.close(SelectChannelEndPoint.java:456)
      at org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint.close(SelectChannelConnector.java:362)
      at org.mortbay.io.nio.SelectChannelEndPoint.idleExpired(SelectChannelEndPoint.java:174)
      at org.mortbay.io.nio.SelectChannelEndPoint$IdleTask.expire(SelectChannelEndPoint.java:489)
      at org.mortbay.thread.Timeout.tick(Timeout.java:137)

      • locked org.mortbay.io.nio.SelectorManager$SelectSet@229e90f9
        at org.mortbay.thread.Timeout.tick(Timeout.java:153)
        at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:762)
        at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192)
        at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

      Locked ownable synchronizers:

      • None

      "1835465145@qtp-211225614-7" - Thread t@24
      java.lang.Thread.State: BLOCKED on org.mortbay.io.nio.SelectorManager$SelectSet@229e90f9 owned by: 831453930@qtp-211225614-8 - Acceptor1 SelectChannelConnector@0.0.0.0:8080
      at org.mortbay.io.nio.SelectorManager$SelectSet.scheduleIdle(SelectorManager.java:797)
      at org.mortbay.io.nio.SelectChannelEndPoint.scheduleIdle(SelectChannelEndPoint.java:159)
      at org.mortbay.io.nio.SelectChannelEndPoint.blockWritable(SelectChannelEndPoint.java:293)

      • locked org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint@5f35a7a1
        at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:544)
        at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:571)
        at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1010)
        at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:278)
        at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
      • locked java.io.OutputStreamWriter@67770a61
        at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
        at java.io.PrintWriter.flush(PrintWriter.java:276)
      • locked java.io.OutputStreamWriter@67770a61
        ... (snip)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:616)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
        ... (snip)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

      Locked ownable synchronizers:

      • None

        Activity

        Hide
        Greg Wilkins added a comment -

        Wow - looks like a classic deadlock.
        you've seen this more than once?

        Show
        Greg Wilkins added a comment - Wow - looks like a classic deadlock. you've seen this more than once?
        Hide
        Greg Wilkins added a comment -

        The problem is that the idle timeout handler is currently extending expire() method (called with a lock held) instead of the expired() method. So the fix is simple, but I'm going to analyse how this problem occurred, because I'm 99% sure it was fixed years ago!

        Show
        Greg Wilkins added a comment - The problem is that the idle timeout handler is currently extending expire() method (called with a lock held) instead of the expired() method. So the fix is simple, but I'm going to analyse how this problem occurred, because I'm 99% sure it was fixed years ago!
        Hide
        Greg Wilkins added a comment -

        fixed in 6171.

        It looks like it was never fixed in jetty-6, only in jetty-7.
        Hopefully that indicates it is a very rare issue.

        Show
        Greg Wilkins added a comment - fixed in 6171. It looks like it was never fixed in jetty-6, only in jetty-7. Hopefully that indicates it is a very rare issue.
        Hide
        Stephen Allen added a comment -

        Hi Greg,

        Just another update, but I encountered this deadlock again in a fairly low volume development environment, so it may not be as rare as we'd hope. Any ETA on the 6.1.26 release?

        Thanks,
        Stephen

        Show
        Stephen Allen added a comment - Hi Greg, Just another update, but I encountered this deadlock again in a fairly low volume development environment, so it may not be as rare as we'd hope. Any ETA on the 6.1.26 release? Thanks, Stephen
        Hide
        Todd Lipcon added a comment -

        I've run into a similar but not-quite-same deadlock on a 6.1.26-ish build:

        "main" prio=10 tid=0x0000000040a99000 nid=0x318b runnable [0x00007f02d30ac000]
        java.lang.Thread.State: RUNNABLE
        at org.mortbay.io.nio.SelectorManager$SelectSet.stop(SelectorManager.java:929)

        • locked <0x000000050de2de48> (a org.mortbay.io.nio.SelectorManager$SelectSet)
          at org.mortbay.io.nio.SelectorManager.doStop(SelectorManager.java:241)
          at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
        • locked <0x000000050def8488> (a java.lang.Object)
          at org.mortbay.jetty.nio.SelectChannelConnector.close(SelectChannelConnector.java:136)
        • locked <0x000000050de2df38> (a org.mortbay.jetty.nio.SelectChannelConnector)
          at org.apache.hadoop.http.HttpServer.stop(HttpServer.java:751)

        "1987744292@qtp-664112322-1 - Acceptor0 SelectChannelConnector@0.0.0.0:60931" prio=10 tid=0x00000000413d2800 nid=0x322b waiting for monitor entry [0x00007f02c3052000]
        java.lang.Thread.State: BLOCKED (on object monitor)
        at org.mortbay.thread.Timeout.tick(Timeout.java:132)

        • waiting to lock <0x000000050de2de48> (a org.mortbay.io.nio.SelectorManager$SelectSet)
          at org.mortbay.thread.Timeout.tick(Timeout.java:153)
          at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:784)
          at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:193)
          at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
          at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
          at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

        Should I open a new JIRA?

        Show
        Todd Lipcon added a comment - I've run into a similar but not-quite-same deadlock on a 6.1.26-ish build: "main" prio=10 tid=0x0000000040a99000 nid=0x318b runnable [0x00007f02d30ac000] java.lang.Thread.State: RUNNABLE at org.mortbay.io.nio.SelectorManager$SelectSet.stop(SelectorManager.java:929) locked <0x000000050de2de48> (a org.mortbay.io.nio.SelectorManager$SelectSet) at org.mortbay.io.nio.SelectorManager.doStop(SelectorManager.java:241) at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76) locked <0x000000050def8488> (a java.lang.Object) at org.mortbay.jetty.nio.SelectChannelConnector.close(SelectChannelConnector.java:136) locked <0x000000050de2df38> (a org.mortbay.jetty.nio.SelectChannelConnector) at org.apache.hadoop.http.HttpServer.stop(HttpServer.java:751) "1987744292@qtp-664112322-1 - Acceptor0 SelectChannelConnector@0.0.0.0:60931" prio=10 tid=0x00000000413d2800 nid=0x322b waiting for monitor entry [0x00007f02c3052000] java.lang.Thread.State: BLOCKED (on object monitor) at org.mortbay.thread.Timeout.tick(Timeout.java:132) waiting to lock <0x000000050de2de48> (a org.mortbay.io.nio.SelectorManager$SelectSet) at org.mortbay.thread.Timeout.tick(Timeout.java:153) at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:784) at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:193) at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124) at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Should I open a new JIRA?
        Hide
        Jan Bartel added a comment -

        Todd,
        are you sure that's a deadlock? The 1st thread is marked Runnable and not blocked waiting.
        Jan

        Show
        Jan Bartel added a comment - Todd, are you sure that's a deadlock? The 1st thread is marked Runnable and not blocked waiting. Jan
        Hide
        Todd Lipcon added a comment -

        The first thread was spinning for about 6 hours in that stack before I finally killed it. It looks like it's a busy-wait trying to wake up the selector thread, but if the selector thread is blocked on this lock, it'll just busy loop forever.

        Show
        Todd Lipcon added a comment - The first thread was spinning for about 6 hours in that stack before I finally killed it. It looks like it's a busy-wait trying to wake up the selector thread, but if the selector thread is blocked on this lock, it'll just busy loop forever.
        Hide
        Jan Bartel added a comment -

        Todd,

        the linenumbers don't quite align with 6.1.26, so a bit difficult to follow up. Open a new bug for this, providing the full thread dump and any logs you have. I have to warn you though that it is highly unlikely that we will make another jetty-6 release, as we're only doing that for security fixes. If this is vital to you, can I suggest you contact www.webtide.com about some commercial support? Alternatively, an upgrade to jetty-7 or jetty-8 (almost same codebases, 7 is a subset of 8 without support for servlet 3.0) would be the way to go.

        Jan

        Show
        Jan Bartel added a comment - Todd, the linenumbers don't quite align with 6.1.26, so a bit difficult to follow up. Open a new bug for this, providing the full thread dump and any logs you have. I have to warn you though that it is highly unlikely that we will make another jetty-6 release, as we're only doing that for security fixes. If this is vital to you, can I suggest you contact www.webtide.com about some commercial support? Alternatively, an upgrade to jetty-7 or jetty-8 (almost same codebases, 7 is a subset of 8 without support for servlet 3.0) would be the way to go. Jan
        Hide
        Todd Lipcon added a comment -

        Hi Jan. Yea, sorry about the line numbers - this is a fork based on some code that Greg W sent my way a year or two ago. The branch is here: https://github.com/cloudera/jetty-hadoop-fix/

        I'll open another JIRA, but I don't think either commercial support or a major version upgrade is going to happen. I am here representing Hadoop, another open source project - this isn't a production issue from any one customer, but rather a Jetty issue I identified doing some Hadoop stress testing. As for upgrade, it's unfortunate that we're stuck on the old version, but we have a lot of dependent projects which pull in transitive Jetty 6 dependencies, and no good classloader isolation to speak of, so an upgrade would be more painful than it's worth.

        Show
        Todd Lipcon added a comment - Hi Jan. Yea, sorry about the line numbers - this is a fork based on some code that Greg W sent my way a year or two ago. The branch is here: https://github.com/cloudera/jetty-hadoop-fix/ I'll open another JIRA, but I don't think either commercial support or a major version upgrade is going to happen. I am here representing Hadoop, another open source project - this isn't a production issue from any one customer, but rather a Jetty issue I identified doing some Hadoop stress testing. As for upgrade, it's unfortunate that we're stuck on the old version, but we have a lot of dependent projects which pull in transitive Jetty 6 dependencies, and no good classloader isolation to speak of, so an upgrade would be more painful than it's worth.
        Hide
        Todd Lipcon added a comment -

        Filed JETTY-1546

        Show
        Todd Lipcon added a comment - Filed JETTY-1546

          People

          • Assignee:
            Greg Wilkins
            Reporter:
            Stephen Allen
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: