Details

    • Type: Bug Bug
    • Status: Resolved Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: JRuby-OpenSSL 0.7, JRuby 1.6.7
    • Fix Version/s: JRuby 1.7.0.pre1
    • Labels:
      None
    • Environment:
    • Number of attachments :
      2

      Description

      $ rvm use system
      $ ruby --version
      ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
      
      $ ruby server.rb &
      [1] 5924
      Listening on port 8043
      
      $ time curl -m 3 -k https://localhost:8043
      "GET / HTTP/1.1\r\n"
      curl: (28) Operation timed out after 3001 milliseconds with 0 bytes received
      
      real	0m3.011s
      user	0m0.008s
      sys	0m0.012s
      
      $ time ruby client.rb
      "POST /foo HTTP/1.1\r\n"
      /usr/lib/ruby/1.8/timeout.rb:64:in `rbuf_fill': execution expired (Timeout::Error)
      	from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill'
      	from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
      	from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
      	from /usr/lib/ruby/1.8/net/http.rb:2028:in `read_status_line'
      	from /usr/lib/ruby/1.8/net/http.rb:2017:in `read_new'
      	from /usr/lib/ruby/1.8/net/http.rb:1051:in `request'
      	from client.rb:38
      	from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
      	from client.rb:38
      	from /usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
      	from /usr/lib/ruby/1.8/timeout.rb:101:in `timeout'
      	from client.rb:37
      
      real	0m2.075s
      user	0m0.056s
      sys	0m0.012s
      
      $ rvm use jruby
      $ ruby --version
      jruby 1.6.7 (ruby-1.8.7-p357) (2012-02-22 3e82bc8) (OpenJDK 64-Bit Server VM 1.6.0_23) [linux-amd64-java]
      $ gem list
      *** LOCAL GEMS ***
      bouncy-castle-java (1.5.0146.1)
      bundler (1.0.22)
      jruby-openssl (0.7.6.1)
      rake (0.8.7)
      
      $ time ruby client.rb
      "POST /foo HTTP/1.1\r\n"
      ^C
      real	0m24.111s
      user	0m5.460s
      sys	0m0.140s
      
      1. client.rb
        1 kB
        zaadjis
      2. server.rb
        3 kB
        zaadjis

        Activity

        Hide
        zaadjis added a comment -

        For some background: sometimes a web service our app depends on gets restarted (or something), but the connections are not closed (or something). This causes our SOAP/HTTPS requests to hang (not do anything forever) and usually the stack trace looks like this:

           java.lang.Thread.State: RUNNABLE
        	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
        	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        	- locked <e7e449c> (a sun.nio.ch.Util$2)
        	- locked <3ca6533> (a java.util.Collections$UnmodifiableSet)
        	- locked <5cb9f936> (a sun.nio.ch.EPollSelectorImpl)
        	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
        	at org.jruby.ext.openssl.SSLSocket.waitSelect(SSLSocket.java:264)
        	at org.jruby.ext.openssl.SSLSocket.sysread(SSLSocket.java:512)
        	at org.jruby.ext.openssl.SSLSocket$i$0$1$sysread.call(SSLSocket$i$0$1$sysread.gen:65535)
        	at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:630)
        	at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:207)
        	at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:167)
        	at rubyjit.rbuf_fill_CEDA9ADE5B1F16E5F96164A527614757185A346E.block_0$RUBY$__file__(classpath:/META-INF/jruby.home/lib/ruby/1.8/net/protocol.rb:135)
        

        I'm assuming the timeouts don't kick in (just like in the attached client.rb). Since I don't have the skills to figure this out myself, I'm hoping someone can at least provide me with a work-around on how to make the timeout work.

        Seems JRUBY-5059 (fixed) is a bit similar.

        Show
        zaadjis added a comment - For some background: sometimes a web service our app depends on gets restarted (or something), but the connections are not closed (or something). This causes our SOAP/HTTPS requests to hang (not do anything forever) and usually the stack trace looks like this: java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked <e7e449c> (a sun.nio.ch.Util$2) - locked <3ca6533> (a java.util.Collections$UnmodifiableSet) - locked <5cb9f936> (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84) at org.jruby.ext.openssl.SSLSocket.waitSelect(SSLSocket.java:264) at org.jruby.ext.openssl.SSLSocket.sysread(SSLSocket.java:512) at org.jruby.ext.openssl.SSLSocket$i$0$1$sysread.call(SSLSocket$i$0$1$sysread.gen:65535) at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:630) at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:207) at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:167) at rubyjit.rbuf_fill_CEDA9ADE5B1F16E5F96164A527614757185A346E.block_0$RUBY$__file__(classpath:/META-INF/jruby.home/lib/ruby/1.8/net/protocol.rb:135) I'm assuming the timeouts don't kick in (just like in the attached client.rb ). Since I don't have the skills to figure this out myself, I'm hoping someone can at least provide me with a work-around on how to make the timeout work. Seems JRUBY-5059 (fixed) is a bit similar.
        Hide
        David Kellum added a comment -

        I've seen what I believe is the identical problem via twitter client gem, which is also using https; but I wasn't able to isolate it as well as zaadjis has here. Thanks.

        Show
        David Kellum added a comment - I've seen what I believe is the identical problem via twitter client gem, which is also using https; but I wasn't able to isolate it as well as zaadjis has here. Thanks.
        Hide
        zaadjis added a comment - - edited

        Seems there is a work-around (found it here by searching related keywords):

        require 'net/protocol'
        class Net::BufferedIO
          def rbuf_fill
            if IO.select [@io], nil, nil, @read_timeout
              @io.sysread BUFSIZE, @rbuf
            else
              raise Timeout::Error.new 'execution expired'
            end
          end
        end
        

        Edit: the work around can't work around some other jruby socket/ssl bug (JRUBY-6516):

        #!/usr/bin/env ruby
        
        require 'timeout'
        require 'net/https'
        
        class Net::BufferedIO
          def rbuf_fill
            if IO.select [@io], nil, nil, @read_timeout
              @io.sysread BUFSIZE, @rbuf
            else
              raise Timeout::Error.new 'execution expired'
            end
          end
        end
        
        http = Net::HTTP.new 'encrypted.google.com', 443
        http.use_ssl = true
        http.verify_mode = OpenSSL::SSL::VERIFY_NONE
        http.open_timeout = 60
        http.read_timeout = 60
        
        p http.start{ |h| h.request Net::HTTP::Get.new '/' }
        

        This example works OK in "ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]" but always times out in "jruby 1.6.7 (ruby-1.8.7-p357) (2012-02-22 3e82bc8) (OpenJDK 64-Bit Server VM 1.6.0_23) [linux-amd64-java]"

        Edit 2: this seems to work as a work-around (notice the 2nd arg to IO.select):

        class Net::BufferedIO
          def rbuf_fill
            if IO.select [@io], [@io], nil, @read_timeout
              @io.sysread BUFSIZE, @rbuf
            else
              raise Timeout::Error.new 'execution expired'
            end
          end
        end
        
        Show
        zaadjis added a comment - - edited Seems there is a work-around (found it here by searching related keywords): require 'net/protocol' class Net::BufferedIO def rbuf_fill if IO.select [@io], nil, nil, @read_timeout @io.sysread BUFSIZE, @rbuf else raise Timeout::Error.new 'execution expired' end end end Edit: the work around can't work around some other jruby socket/ssl bug ( JRUBY-6516 ): #!/usr/bin/env ruby require 'timeout' require 'net/https' class Net::BufferedIO def rbuf_fill if IO.select [@io], nil, nil, @read_timeout @io.sysread BUFSIZE, @rbuf else raise Timeout::Error.new 'execution expired' end end end http = Net::HTTP.new 'encrypted.google.com', 443 http.use_ssl = true http.verify_mode = OpenSSL::SSL::VERIFY_NONE http.open_timeout = 60 http.read_timeout = 60 p http.start{ |h| h.request Net::HTTP::Get.new '/' } This example works OK in "ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux] " but always times out in "jruby 1.6.7 (ruby-1.8.7-p357) (2012-02-22 3e82bc8) (OpenJDK 64-Bit Server VM 1.6.0_23) [linux-amd64-java] " Edit 2: this seems to work as a work-around (notice the 2nd arg to IO.select): class Net::BufferedIO def rbuf_fill if IO.select [@io], [@io], nil, @read_timeout @io.sysread BUFSIZE, @rbuf else raise Timeout::Error.new 'execution expired' end end end
        Hide
        zaadjis added a comment -

        Reproducable with jruby master:

        $ ruby --version
        jruby 1.7.0.dev (ruby-2.0.0dev) (2012-03-15 860f429) (OpenJDK 64-Bit Server VM 1.7.0_147-icedtea) [linux-amd64-java]
        
        $ gem list
        *** LOCAL GEMS ***
        bouncy-castle-java (1.5.0146.1)
        bundler (1.1.1 ruby)
        jruby-launcher (1.0.12 java)
        jruby-openssl (0.7.6.1)
        rake (0.9.2, 0.8.7)
        
        $ ruby server.rb &
        [1] 23870
        Listening on port 8043
        
        $ time ruby client.rb
        "POST /foo HTTP/1.1\r\n"
        ^C
        real	0m35.435s
        user	0m7.340s
        sys	0m0.176s
        
        Show
        zaadjis added a comment - Reproducable with jruby master: $ ruby --version jruby 1.7.0.dev (ruby-2.0.0dev) (2012-03-15 860f429) (OpenJDK 64-Bit Server VM 1.7.0_147-icedtea) [linux-amd64-java] $ gem list *** LOCAL GEMS *** bouncy-castle-java (1.5.0146.1) bundler (1.1.1 ruby) jruby-launcher (1.0.12 java) jruby-openssl (0.7.6.1) rake (0.9.2, 0.8.7) $ ruby server.rb & [1] 23870 Listening on port 8043 $ time ruby client.rb "POST /foo HTTP/1.1\r\n" ^C real 0m35.435s user 0m7.340s sys 0m0.176s
        Hide
        Hiroshi Nakamura added a comment -

        JRuby-OSSL has its own Thread.select (SSLSocket.waitSelect) for a workaround to avoid non-blocking issue. Something should be changed by JRuby side...

        I'll look into this.

        Show
        Hiroshi Nakamura added a comment - JRuby-OSSL has its own Thread.select (SSLSocket.waitSelect) for a workaround to avoid non-blocking issue. Something should be changed by JRuby side... I'll look into this.
        Hide
        Hiroshi Nakamura added a comment -

        Adding IO.select timeout param would work as "timeout until it gets readable". This can not be what users want... though JRUBY-6516 is worth investigating.

        Show
        Hiroshi Nakamura added a comment - Adding IO.select timeout param would work as "timeout until it gets readable". This can not be what users want... though JRUBY-6516 is worth investigating.
        Hide
        Hiroshi Nakamura added a comment -

        Aargh, 4fd558e27df6fb1a6019bc9322d96361e5ae536a is the culprit.

        commit 4fd558e27df6fb1a6019bc9322d96361e5ae536a
        Author: Hiroshi Nakamura <nahi@ruby-lang.org>
        Date:   Thu Dec 2 23:43:38 2010 +0900
        
            Proof of concept fix for JRUBY-5200: configureBlocking free select.
            
            RubyThread.select for both sysread and syswrite is a source of deadlock
            when 2 operations are running by different Thread. RubyThread.select try
            to synchronize SelectableChannel.blockingLock first so it cannot be used
            for different SelectionKey.OP_* operations.
            
            This commit copies RubyThread.select to SSLSocket and remove
            blockingLock. This impl just set
            SelectableChannel.configureBlocking(false) permanently instead of
            setting temporarily. SSLSocket requires wrapping IO to be selectable so
            it should be OK to set configureBlocking(false) permanently I think...
        

        I copied Thread.select to SSLSocket.waitSelect. As the result, Timeout task cannot interrupt the selector used in SSLSocket (The task interrupts Thread#currentSelector.) So, we cannot copy Thread.select to avoid the deadlock that is fixed by above commit.

        JRuby 1.7 should fix this issue by merging SSLSocket.waitSelect back into Thread.select. We need to find a workaround for jruby-1_6...

        Show
        Hiroshi Nakamura added a comment - Aargh, 4fd558e27df6fb1a6019bc9322d96361e5ae536a is the culprit. commit 4fd558e27df6fb1a6019bc9322d96361e5ae536a Author: Hiroshi Nakamura <nahi@ruby-lang.org> Date: Thu Dec 2 23:43:38 2010 +0900 Proof of concept fix for JRUBY-5200: configureBlocking free select. RubyThread.select for both sysread and syswrite is a source of deadlock when 2 operations are running by different Thread. RubyThread.select try to synchronize SelectableChannel.blockingLock first so it cannot be used for different SelectionKey.OP_* operations. This commit copies RubyThread.select to SSLSocket and remove blockingLock. This impl just set SelectableChannel.configureBlocking(false) permanently instead of setting temporarily. SSLSocket requires wrapping IO to be selectable so it should be OK to set configureBlocking(false) permanently I think... I copied Thread.select to SSLSocket.waitSelect. As the result, Timeout task cannot interrupt the selector used in SSLSocket (The task interrupts Thread#currentSelector.) So, we cannot copy Thread.select to avoid the deadlock that is fixed by above commit. JRuby 1.7 should fix this issue by merging SSLSocket.waitSelect back into Thread.select. We need to find a workaround for jruby-1_6...
        Hide
        zaadjis added a comment - - edited

        About work-arounds:

        At first I assumed IO.select might be a work-around, because the socket is dead and would never become readable and hopefully the select timeout would work. But it turned out IO.select is buggy too (JRUBY-6516).

        But using a new thread for each @io.sysread works as a work-around: https://gist.github.com/2602014

        Show
        zaadjis added a comment - - edited About work-arounds: At first I assumed IO.select might be a work-around, because the socket is dead and would never become readable and hopefully the select timeout would work. But it turned out IO.select is buggy too ( JRUBY-6516 ). But using a new thread for each @io.sysread works as a work-around: https://gist.github.com/2602014
        Hide
        Charles Oliver Nutter added a comment -

        I have fixed this on master@db95241 by using the interrupt-capable RubyThread.select method rather than doing the selection logic directly in SSLSocket. I'm sure this can be improved further, but it's working for now and better than what we had before.

        Nahi: Have a look at this and let me know if I missed anything. We may want to backport this into jruby-ossl (and we need to figure out the right way to keep them in sync, too).

        Show
        Charles Oliver Nutter added a comment - I have fixed this on master@db95241 by using the interrupt-capable RubyThread.select method rather than doing the selection logic directly in SSLSocket. I'm sure this can be improved further, but it's working for now and better than what we had before. Nahi: Have a look at this and let me know if I missed anything. We may want to backport this into jruby-ossl (and we need to figure out the right way to keep them in sync, too).
        Hide
        Xavier Shay added a comment -

        I can still reproduce this bug on JRuby master as of today, on both Java 1.6 and 1.7.

        > time ruby jruby_ssl_test.rb
        ^Cruby jruby_ssl_test.rb  1.67s user 0.12s system 22% cpu 8.035 total
        > cat jruby_ssl_test.rb 
        require 'net/https'
        
        # open another terminal and run `nc -l 1234`
        client = Net::HTTP.new('localhost', 1234)
        client.use_ssl = true
        client.read_timeout = 1
        client.open_timeout = 1
        
        client.get("/")
        > ruby -v
        jruby 1.7.0.preview1 (ruby-1.9.3-p203) (2012-05-26 7554093) (Java HotSpot(TM) Client VM 1.6.0_32) [darwin-i386-java]
        > uname -a
        Darwin koopa.local 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr  9 19:32:15 PDT 2012; root:xnu-1699.26.8~1/RELEASE_X86_64 x86_64
        
        Show
        Xavier Shay added a comment - I can still reproduce this bug on JRuby master as of today, on both Java 1.6 and 1.7. > time ruby jruby_ssl_test.rb ^Cruby jruby_ssl_test.rb 1.67s user 0.12s system 22% cpu 8.035 total > cat jruby_ssl_test.rb require 'net/https' # open another terminal and run `nc -l 1234` client = Net::HTTP.new('localhost', 1234) client.use_ssl = true client.read_timeout = 1 client.open_timeout = 1 client.get("/") > ruby -v jruby 1.7.0.preview1 (ruby-1.9.3-p203) (2012-05-26 7554093) (Java HotSpot(TM) Client VM 1.6.0_32) [darwin-i386-java] > uname -a Darwin koopa.local 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr 9 19:32:15 PDT 2012; root:xnu-1699.26.8~1/RELEASE_X86_64 x86_64
        Hide
        Emil Ong added a comment -

        Will this be backported to JRuby 1.6? Or has anyone found a workaround?

        Show
        Emil Ong added a comment - Will this be backported to JRuby 1.6? Or has anyone found a workaround?
        Hide
        Charles Oliver Nutter added a comment -

        Xavier: I could not reproduce it... running it locally for me does time out properly:

        $ jruby ssl_test.rb 
        Timeout::Error: execution expired
        

        This could be due to fixes for 1.7.1. We'll leave this resolved.

        Emil: I have rolled the OpenSSL changes back into the jruby-openssl gem and released previews of 0.8.0. Perhaps you could test that with JRuby 1.6 (gem install --pre jruby-openssl) and let me know if it looks better for you?

        Show
        Charles Oliver Nutter added a comment - Xavier: I could not reproduce it... running it locally for me does time out properly: $ jruby ssl_test.rb Timeout::Error: execution expired This could be due to fixes for 1.7.1. We'll leave this resolved. Emil: I have rolled the OpenSSL changes back into the jruby-openssl gem and released previews of 0.8.0. Perhaps you could test that with JRuby 1.6 (gem install --pre jruby-openssl) and let me know if it looks better for you?
        Hide
        Ivasyk Telesyk added a comment -

        Still having this problem on my system.

        jruby -v
        jruby 1.7.0 (1.9.3p203) 2012-10-22 ff1ebbe on OpenJDK 64-Bit Server VM 1.7.0_06-b24 [FreeBSD-amd64]
        uname -a
        FreeBSD Pym.local 8.3-STABLE FreeBSD 8.3-STABLE #3: Mon Oct 15 14:43:10 EEST 2012 root@Pym.local:/usr/obj/usr/src/sys/SUGOI amd64

        And script posted by Xavier Shay hugs for me.

        Show
        Ivasyk Telesyk added a comment - Still having this problem on my system. jruby -v jruby 1.7.0 (1.9.3p203) 2012-10-22 ff1ebbe on OpenJDK 64-Bit Server VM 1.7.0_06-b24 [FreeBSD-amd64] uname -a FreeBSD Pym.local 8.3-STABLE FreeBSD 8.3-STABLE #3: Mon Oct 15 14:43:10 EEST 2012 root@Pym.local:/usr/obj/usr/src/sys/SUGOI amd64 And script posted by Xavier Shay hugs for me.

          People

          • Assignee:
            Hiroshi Nakamura
            Reporter:
            zaadjis
          • Votes:
            2 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: