JRuby (please use github issues at http://bugs.jruby.org)
JRUBY-6528

Socket#connect_nonblock and IO::select appear to be misbehaving?

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.6.6, JRuby 1.6.7
    • Fix Version/s: JRuby 1.7.0.pre1
    • Component/s: None
    • Labels: None
    • Number of attachments: 0

      Description

      I'm working on some socket code that uses IO.select() and Socket#connect_nonblock to do connects with a timeout, in much the same way as the example code in the Ruby MRI docs (1).

      I have sample code that works fine in MRI 1.9.x and 1.8.x, but not in JRuby 1.6.7 (in either 1.8 or 1.9 mode).

      Sample code: https://github.com/jordansissel/experiments/blob/master/ruby/sockets/connect-nonblock.rb

      Here is the output running under many different rubies:

      {:version=>"1.9.2 @ jruby-1.6.5", :socket=>nil}
      {:version=>"1.9.2 @ ruby", :socket=>#<Socket:fd 3>}
      {:version=>"1.9.2 @ jruby-1.6.6", :socket=>nil}
      {:version=>"1.9.3 @ ruby", :socket=>#<Socket:fd 5>}
      {:version=>"1.9.2 @ jruby-1.6.7", :socket=>nil}
      

      The problem appears to be the behavior of IO.select. In my sample code, I expect 'writer' to be an array containing the socket once the socket is connected. JRuby returns nil instead:

          reader, writer, error = IO.select(nil, [socket], nil, timeout)
          p :version => PLATFORM, :writer => writer
      

      output:

      {:version=>"1.8.7 @ jruby-1.6.5", :writer=>nil}
      {:version=>"1.9.2 @ ruby", :writer=>[#<Socket:fd 3>]}
      {:version=>"1.8.7 @ jruby-1.6.6", :writer=>nil}
      {:version=>"1.9.3 @ ruby", :writer=>[#<Socket:fd 5>]}
      {:version=>"1.8.7 @ jruby-1.6.7", :writer=>nil}
      

      (1) See 'example code' from MRI docs http://ruby-doc.org/stdlib-1.9.3/libdoc/socket/rdoc/Socket.html#method-i-connect_nonblock
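
      For reference, the connect-with-timeout pattern described above can be sketched roughly as follows. This is a minimal sketch and not the linked sample code: the helper name `connect_with_timeout` is hypothetical, and it assumes an IPv4 address. On the affected JRuby versions IO.select reports no writable socket even after the connect has completed, so a helper like this would return nil there.

```ruby
require "socket"

# Hypothetical helper (not from the linked sample): returns a connected
# Socket, or nil if the connect does not complete within `timeout` seconds.
# Assumes `host` is an IPv4 address.
def connect_with_timeout(host, port, timeout)
  sockaddr = Socket.pack_sockaddr_in(port, host)
  socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  begin
    socket.connect_nonblock(sockaddr)
  rescue Errno::EINPROGRESS
    # The connect is in flight; wait for the socket to become writable.
    _readers, writers, _errors = IO.select(nil, [socket], nil, timeout)
    if writers.nil?
      # IO.select returns nil when the timeout expires.
      socket.close
      return nil
    end
    begin
      # Calling connect again reports the outcome of the attempt.
      socket.connect_nonblock(sockaddr)
    rescue Errno::EISCONN
      # Already connected: success.
    rescue SystemCallError
      # Any other errno means the connect failed.
      socket.close
      raise
    end
  end
  socket
end
```

      A call such as `connect_with_timeout("127.0.0.1", 8080, 5)` would then return either a Socket or nil; the address and port here are placeholders.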

        Activity

        Jordan Sissel added a comment (edited)

        I have occasionally been able to get IO.select to behave, but I haven't figured out a reproducible case yet. Of course, I could be hallucinating; I've been hacking at this for a while.

        Charles Oliver Nutter added a comment -

        I know there have been some oddities with nonblocking Socket operations and IO.select, so you're probably not hallucinating. I'll have a look.

        Charles Oliver Nutter added a comment -

        Ok...so what it looks like to me is that it's either waking up spuriously or there's something goofy with the selection logic in the JVM. I can see the socket getting registered with the selector for WRITE, and the select operation proceeds correctly and wakes up...but nothing has been selected at that point.

        Still exploring.

        Charles Oliver Nutter added a comment -

        So, there turned out to be a fairly simple explanation. When we're selecting for read or write, we need to also select for accept and connect, respectively. We were only selecting for write, but write and connect are both triggered by a native write selection event. The selector would wake up because of the native write event, see that we were selecting against an unconnected socket, notice also we weren't interested in connect events, and not include our selection key among those that were triggered. As a result, we did not add the socket to the outgoing writable array, and you got the result you did.

        The fix is simple...for a Ruby "write" select, select for both "write" and "connect". For a Ruby "read" select, select for both "read" and "accept".

        I also made an additional fix to the selection logic that finishes the connect for a socket that has successfully been selected. This logic existed before, but it only ran before the select, and only when selecting for a timeout (see JRUBY-5165).

        Commits (to master) follow:

        commit bcf56db16119d2a82923a272677c14d90ff1d14b
        Author: Charles Oliver Nutter <headius@headius.com>
        Date:   Tue Mar 6 18:36:59 2012 -0600
        
            Fix JRUBY-6528
            
            Socket#connect_nonblock and IO::select appear to be misbehaving?
            
            When selecting for "write" in Ruby terms, we need to select for
            both "write" and "connect" in Java selector terms. On most impls
            of selectors, there's only "read" and "write" to be used for
            read/accept and write/connect. We were only registering interest
            in the Java "write" event, but the JDK knows we're doing select
            against an unconnected socket, so it's looking for "connect"
            interest. It wakes up, because at the native level "write" and
            "connect" are both just "write", sees that we're not interested
            in "connect", and we don't end up adding anything to the outgoing
            writable array.
            
            By modifying "write" selection events to do write + connect (and
            fixing "read" selection events to do read + accept) the given
            script works as expected.
            
            This code still needs some improvement and refactoring, but I'm
            closer to understanding how to do it right now.
        
        commit 715877c10cb58a42374cfa8469259795c0730294
        Author: Charles Oliver Nutter <headius@headius.com>
        Date:   Tue Mar 6 21:03:52 2012 -0600
        
            Additional fixes for connect_nonblock and IO.select.
            
            * Move finishConnect logic into select's write ops handling
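
        The explanation above can be illustrated with a small toy model. This is only a sketch of the behavior as described in the comment, not JRuby or JDK code: the constants mirror the bit values of java.nio.channels.SelectionKey, while `reported_ops` and `write_interest` are hypothetical names introduced for illustration.

```ruby
# Bit values mirror java.nio.channels.SelectionKey.
OP_READ    = 1 << 0
OP_WRITE   = 1 << 2
OP_CONNECT = 1 << 3
OP_ACCEPT  = 1 << 4

# Hypothetical model: a native "writable" event is treated as "connect" for
# an unconnected socket and as "write" for a connected one, then filtered
# by the interest set that was registered with the selector.
def reported_ops(connected, interest)
  event = connected ? OP_WRITE : OP_CONNECT
  event & interest
end

# Hypothetical model of the interest set used for a Ruby "write" select,
# before and after the fix.
def write_interest(fixed)
  fixed ? (OP_WRITE | OP_CONNECT) : OP_WRITE
end

# Before the fix: the selector wakes up, but nothing is reported.
p reported_ops(false, write_interest(false))
# After the fix: the pending connect is reported.
p reported_ops(false, write_interest(true))
```

        In this model the first call yields 0 (no key selected, so the socket never reaches the writable array) and the second yields OP_CONNECT. The "read" side would pair OP_READ with OP_ACCEPT in the same way.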
        
        Jordan Sissel added a comment -

        Awesome, thanks for the fix!

        In the meantime, there is a decent workaround for the current (1.6.7) behavior, which I'm documenting here for posterity. The trick is to call Socket#connect_nonblock again and check for EISCONN vs. EINPROGRESS exceptions.

          begin
            socket.connect_nonblock(sockaddr)
          rescue Errno::EINPROGRESS
            # Block until the socket is ready, then try again
            reader, writer, error = IO.select([socket], [socket], [socket], timeout)
        
            # JRuby (at least as of 1.6.7) returns [nil,nil,nil] on IO.select when the
            # socket is finished connecting *and* on timeout, so let's hack around this
            # and try to find out if we're really connected or not.
            if RUBY_PLATFORM == "java"
              begin
                socket.connect_nonblock(sockaddr)
              rescue Errno::EISCONN
                # Already connected, do nothing
              rescue Errno::EINPROGRESS
                # Connection still in progress, this means we timed out given
                # our IO.select has returned.
                socket.close
                return nil
              end
            end
          end

          People

          • Assignee: Charles Oliver Nutter
          • Reporter: Jordan Sissel
          • Votes: 0
          • Watchers: 0
