Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.1.4
    • Fix Version/s: 7.4.0
    • Component/s: Bayeux, Continuations, HTTP, NIO, Servlet
    • Labels:
      None
    • Environment:
      Linux 2.6.21.7-2.ec2.v1.2.fc8xen
    • Number of attachments :
      0

      Description

      I have a CometD 1.1.1 application running on Jetty 7.1.4.

      I am currently load testing this application using the CometD Java client. The test setup is as follows:

      1) 3500 clients (split over two machines)
      2) Broadcasts to all clients every second. Each broadcast message's data is approx. 1000 bytes unencrypted, not including the other CometD message data or the HTTP request and response headers.

      I am using the SslSelectChannelConnector, 4 acceptors, 50 threads.

      The Jetty startup parameters are:

       
      -Xms1g
      -Xmx2g
      -XX:NewRatio=2
      -XX:+UseParallelGC
      -XX:+UseParallelOldGC
      -XX:MaxGCPauseMillis=25
      

      With this setup, the Jetty server resident memory grows and grows very quickly until it starts producing errors like:

       
      [2010-07-02 21:07:28,762][qtp2661678-76][WARN ][org.eclipse.jetty.util.log] handle failed
      java.lang.OutOfMemoryError: null
              at sun.misc.Unsafe.allocateMemory(Native Method) ~[na:1.6.0_20]
              at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:99) ~[na:1.6.0_20]
              at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) ~[na:1.6.0_20]
              at org.eclipse.jetty.io.nio.DirectNIOBuffer.<init>(DirectNIOBuffer.java:46) ~[jetty-io-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.server.nio.AbstractNIOConnector.newRequestBuffer(AbstractNIOConnector.java:53) ~[jetty-server-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.http.HttpBuffers$1.newBuffer(HttpBuffers.java:34) ~[jetty-http-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.io.ThreadLocalBuffers.getBuffer(ThreadLocalBuffers.java:60) ~[jetty-io-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:557) ~[jetty-http-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211) ~[jetty-http-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:424) ~[jetty-server-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489) ~[jetty-io-7.1.4.v20100610.jar:7.1.4.v20100610]
              at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436) [jetty-util-7.1.4.v20100610.jar:7.1.4.v20100610]
              at java.lang.Thread.run(Thread.java:619) [na:1.6.0_20]
      

      and finally the JVM usually crashes with:

       
      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      # java.lang.OutOfMemoryError: requested 32756 bytes for ChunkPool::allocate. Out of swap space?
      #
      #  Internal Error (allocation.cpp:117), pid=5925, tid=1771887504
      #  Error: ChunkPool::allocate
      #
      # JRE version: 6.0_20-b02
      # Java VM: Java HotSpot(TM) Server VM (16.3-b01 mixed mode linux-x86 )
      # An error report file with more information is saved as:
      

      I have even tried this with a 64-bit JVM, setting -XX:MaxDirectMemorySize=6g, and the error still happened once the JVM resident size had grown to approximately 6g plus the heap size.

      I noted that if the maximum heap size (-Xmx) is set small enough that the JVM has to run a Full GC every few seconds (or Full GCs are forced every 30s or so via System.gc()), the problem does not happen. It only happens if the system runs for about a minute without a Full GC (with my normal application setting of 2G, Full GCs only happen every 15-20 minutes or so).
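
The dependence on Full GC fits how direct buffers work: their native memory sits outside the Java heap and is normally released only after the owning ByteBuffer object has been garbage collected. A minimal sketch for watching that memory from inside the JVM is shown below. It assumes Java 7 or later (BufferPoolMXBean does not exist on the 1.6.0_20 JVM used in this report), and the class name DirectMemoryProbe is hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Jetty: reports how much native memory
// the JVM currently has reserved for direct ByteBuffers.
final class DirectMemoryProbe {

    /** Returns the bytes used by the "direct" buffer pool, or -1 if the pool is not found. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // 1 MiB allocated outside the Java heap; it stays reserved while 'held' is reachable.
        ByteBuffer held = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();
        System.out.println("direct bytes before=" + before + " after=" + after
                + " capacity=" + held.capacity());
    }
}
```

Logging this value periodically during a load test would show whether direct memory climbs between Full GCs and drops after them, which is the pattern described above.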

      With a smaller message size (between 100-130 bytes) this did not seem to be a consistent problem, though I did get the OOM error occasionally. However, with the larger message size I can consistently trigger the OOM.

      This error is very similar to http://jira.codehaus.org/browse/JETTY-102, and I suspect a direct buffer leak somewhere.

        Activity

        David Phillips added a comment -

        SslSelectChannelConnector always allocates direct byte buffers on a per-thread basis, even when the config file sets
        <Set name="useDirectBuffers">false</Set>
        The relevant code:

        ThreadLocalBuffers buffers = new ThreadLocalBuffers() {
            @Override
            protected Buffer newBuffer(int size) {
                // TODO indirect?
                return new DirectNIOBuffer(size);
            }

            @Override
            protected Buffer newHeader(int size) {
                // TODO indirect?
                return new DirectNIOBuffer(size);
            }

            @Override
            protected boolean isHeader(Buffer buffer) {
                return true;
            }
        };
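
For illustration, honoring the flag amounts to choosing the allocation call based on the configured value. The sketch below uses plain java.nio.ByteBuffer rather than Jetty's Buffer classes, and the class name BufferChooser is hypothetical; it is not the change that was made in Jetty.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Jetty code: picks heap or direct allocation
// according to a useDirectBuffers-style flag instead of hard-coding direct.
final class BufferChooser {
    private final boolean useDirectBuffers;

    BufferChooser(boolean useDirectBuffers) {
        this.useDirectBuffers = useDirectBuffers;
    }

    ByteBuffer newBuffer(int size) {
        // Direct buffers live in native memory; heap buffers are ordinary
        // Java objects backed by a byte[] and reclaimed by any GC cycle.
        return useDirectBuffers ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        System.out.println(new BufferChooser(false).newBuffer(8192).isDirect());
        System.out.println(new BufferChooser(true).newBuffer(8192).isDirect());
    }
}
```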

        Greg Wilkins added a comment -

        I've fixed those TODOs for Jetty 7.2.

        Greg Wilkins added a comment -

        Simone did an analysis that indicates part of this problem comes back to the poor hit/miss ratios of the ThreadLocalBuffers implementation, resulting in too many direct buffers being discarded for no purpose.

        At his prompting, I've completely reviewed and refactored this code:

        + There is now a static BuffersFactory class that will allow different Buffers implementations to be used.
        + The ThreadLocalBuffers and HttpBuffers classes are no longer abstract; instead of using overridden methods to pick buffer types, the buffer types are now constructor-injected into the Buffers instance.
        + There is a new PooledBuffers class that uses concurrent queues for headers, buffers, and others, with a maxSize applied to all. Buffers of arbitrary sizes are stored in a single queue, and buffers are consumed until one of the right size is found. This ensures that no strangely sized buffers are put in the pool and never taken out.
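
The pooling idea in the last point can be sketched roughly as follows. This is not Jetty's PooledBuffers: the class name SimpleBufferPool and its methods are hypothetical, it uses java.nio.ByteBuffer instead of Jetty's Buffer type, and it keeps only two queues (standard size and others) rather than three.

```java
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a bounded buffer pool backed by concurrent queues.
final class SimpleBufferPool {
    private final int bufferSize;   // the "standard" buffer size
    private final int maxSize;      // cap on the total number of pooled buffers
    private final Queue<ByteBuffer> buffers = new ConcurrentLinkedQueue<ByteBuffer>();
    private final Queue<ByteBuffer> others = new ConcurrentLinkedQueue<ByteBuffer>();
    private final AtomicInteger pooled = new AtomicInteger();

    SimpleBufferPool(int bufferSize, int maxSize) {
        this.bufferSize = bufferSize;
        this.maxSize = maxSize;
    }

    /** Returns a standard-size buffer, reusing a pooled one when available. */
    ByteBuffer acquire() {
        ByteBuffer b = buffers.poll();
        if (b == null) {
            return ByteBuffer.allocateDirect(bufferSize);
        }
        pooled.decrementAndGet();
        return b;
    }

    /** Returns a buffer of exactly the requested size. */
    ByteBuffer acquire(int size) {
        if (size == bufferSize) {
            return acquire();
        }
        ByteBuffer b;
        // Consume odd-sized buffers until one matches; mismatches are dropped
        // so that unusual sizes cannot sit in the pool forever.
        while ((b = others.poll()) != null) {
            pooled.decrementAndGet();
            if (b.capacity() == size) {
                return b;
            }
        }
        return ByteBuffer.allocateDirect(size);
    }

    /** Returns a buffer to the pool, or drops it if the pool is already full. */
    void release(ByteBuffer b) {
        b.clear();
        if (pooled.incrementAndGet() > maxSize) {
            pooled.decrementAndGet();
            return;
        }
        (b.capacity() == bufferSize ? buffers : others).add(b);
    }

    int size() {
        return pooled.get();
    }
}
```

Unlike a per-thread cache, a shared queue lets a buffer released by one thread be reused by any other, while the cap bounds how much direct memory the pool itself can hold.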

        Check-in coming soon.

        Greg Wilkins added a comment -

        Committed r2885.

        The max pool size probably needs more exposure so that it can be configured.
        If the pool size is set to -1, the ThreadLocalBuffers impl is used.

        Simone Bordet added a comment -

        I tested the latest code from Greg: direct memory usage is now stable and low, and the new Buffers implementation pools and reuses buffers efficiently.


          People

          • Assignee:
            Simone Bordet
            Reporter:
            Raman Gupta
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: