Details
-
Type:
Bug
-
Status:
Resolved
-
Priority:
Major
-
Resolution: Fixed
-
Affects Version/s: JRuby 1.6.5
-
Fix Version/s: JRuby 1.7.0.pre2
-
Component/s: None
-
Labels:None
-
Number of attachments :1
Description
IO#read_nonblock always raises Errno::EAGAIN.
$ jruby -Xerrno.backtrace=true -e 'IO.popen("date").read_nonblock(10)'
#<Class:0x2bab5dae>: Resource temporarily unavailable -
read_nonblock at org/jruby/RubyIO.java:2515
(root) at -e:1
-
- test_net_ssh.rb
- 28/Apr/12 9:59 AM
- 0.5 kB
- Brett Porter
Activity
Seems that it fails for the onliner but it is definitely a bug.
jruby-1.6.5 :001 > io = IO.popen("date")
=> #<IO:0x24df39bc>
jruby-1.6.5 :002 > io.read_nonblock(10)
Errno::EAGAIN: Resource temporarily unavailable -
from org/jruby/RubyIO.java:2515:in `read_nonblock'
from (irb):2:in `evaluate'
from org/jruby/RubyKernel.java:1093:in `eval'
from org/jruby/RubyKernel.java:1420:in `loop'
from org/jruby/RubyKernel.java:1206:in `catch'
from org/jruby/RubyKernel.java:1206:in `catch'
from /Users/i822055/.rvm/rubies/jruby-1.6.5/bin/irb:17:in `(root)'
ruby-1.9.3-p0 :001 > io = IO.popen("date")
=> #<IO:fd 5>
ruby-1.9.3-p0 :002 > io.read_nonblock(10)
=> "Thu 24 Nov"
I ran into this bug with the net-ssh gem because they use IO.popen for the proxy command. https://github.com/net-ssh/net-ssh/blob/master/lib/net/ssh/proxy/command.rb#L50-70
Of course this is a bug. How can non-blocking IO code throw an exception when there is no data ready to be read. The Ruby code calling into JRuby do not know how to handle this correctly and if CRuby has the same issue in its non-blocking code then it is also a bug.
Maybe Ruby must have a look at Java Nio or other implementations like twisted to figure out how to do it in an efficient (no exceptions) and elegant way.
If you absolutely must use exceptions, wouldn't it make more sense to use EWOULDBLOCK?
It is not a bug in the sense that MRI behaves the same way. If you think this is a flaw in Ruby itself, please open a ticket with MRI itself. http://bugs.ruby-lang.org/
$ ruby2.0 -v -e 'IO.popen("date").read_nonblock(10)'
ruby 2.0.0dev (2011-12-31 trunk 34165) [x86_64-darwin11.2.0]
-e:1:in `read_nonblock': Resource temporarily unavailable - read would block (Errno::EAGAIN)
from -e:1:in `<main>'
Note that EWOULDBLOCK is identical to EAGAIN on glibc: http://www.gnu.org/software/libc/manual/html_node/Error-Codes.html I haven't looked at other implementations.
I have the same net-ssh issue as Samuel, which works under MRI but fails under JRuby 1.6.7.
I'm unclear why you thought his example didn't illustrate that they didn't behave the same way?
It just requires some amount of time to have elapsed between the two calls:
1.9.3p194 :001 > io = IO.popen("date"); sleep 1; io.read_nonblock(10) => "Fri Apr 27"
and
jruby-1.6.7 :004 > io = IO.popen("date"); sleep 1; io.read_nonblock(10) Errno::EAGAIN: Resource temporarily unavailable - errno backtraces disabled; run with -Xerrno.backtrace=true to enable
I'm reopening this because I was able to do some work towards fixing the problem, which exists both in 1.6.5 and master.
The difference in results above is because of the process output being a (blocking) BufferedInputStream - which becomes a ReadableByteChannel but not a FileChannel or SelectableChannel. Therefore readnonblock in ChannelStream always returns null, causing the EAGAIN error.
I've fixed this bug here: https://github.com/brettporter/jruby/commit/ba94a82914270aed4a2e1bc96c07fccd9c5120b2
This results in it being a FileChannel as with the other streams and the simple example here works.
However, a consequence of this is that readnonblock is actually blocking (through fread). So examples that both read and write to the process do not work.
I could alleviate that for a simple example I developed with this patch:
diff --git a/src/org/jruby/util/io/ChannelStream.java b/src/org/jruby/util/io/ChannelStream.java index a416cc5..6e8ef13 100644 --- a/src/org/jruby/util/io/ChannelStream.java +++ b/src/org/jruby/util/io/ChannelStream.java @@ -1296,7 +1296,13 @@ public class ChannelStream implements Stream, Finalizable { } } } else if (descriptor.getChannel() instanceof FileChannel) { - return fread(number); + if (hasBufferedInputBytes()) { + // already have some bytes buffered, just return those + return bufferedRead(Math.min(bufferedInputBytesRemaining(), number)); + } else { + // otherwise, we try an unbuffered read to get whatever's available + return read(number); + } } else { return null; }
(readPartial could not be used as it redirects to fread for a FileChannel)
However, going back to the net-ssh example with a ProxyCommand - while it gets much further than before it still blocks reading midway through - it appears to be because it is switching from blocking reading to non-blocking reading in net-ssh and something is out of sync. I will do a little more investigation before I need to move on from it. Unfortunately I'm confused why that still happens - and I'm happy to provide a couple of test cases that illustrate the problem if someone wants to help work on it.
Does anyone have any ideas on the best way to proceed here? The fundamental problem seems to be that the process can have non-blocking IO, but Java gives you a derivative of a FileChannel which is assumed to be blocking IO throughout Java.
It seems that fread, when called from readnonblock might need to emulate a non-blocking channel - which is what I started with the patch above. I could extend that to only call read() when there are bytes available on the stream, but returning 0 bytes triggers EAGAIN. I'm unclear under what circumstances IO.read_nonblock should treat that as EAGAIN/EWOULDBLOCK vs. just returning 0 bytes.
Attaching a test script I have reduced it down to - which works for me under ruby-1.9.3 but not jruby-master. I have tried both net-ssh-2.2.1 and net-ssh-2.3.0. I've also tried jruby-1.6.5.
With unpatched jruby I get:
jruby 1.7.0.dev (ruby-1.9.3-p139) (2012-04-29 91f4a35) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_31) [darwin-x86_64-java]
D, [2012-04-29T00:56:24.370000 #94350] DEBUG -- net.ssh.transport.session[7d0]: establishing connection to github.com:22
D, [2012-04-29T00:56:24.410000 #94350] DEBUG -- net.ssh.transport.session[7d0]: connection established
I, [2012-04-29T00:56:24.413000 #94350] INFO -- net.ssh.transport.server_version[7d2]: negotiating protocol version
D, [2012-04-29T00:56:25.075000 #94350] DEBUG -- net.ssh.transport.server_version[7d2]: remote is `SSH-2.0-OpenSSH_5.5p1 Debian-6+squeeze1+github2'
D, [2012-04-29T00:56:25.076000 #94350] DEBUG -- net.ssh.transport.server_version[7d2]: local is `SSH-2.0-Ruby/Net::SSH_2.3.0 java'
JRuby::EAGAINReadable: Resource temporarily unavailable - errno backtraces disabled; run with -Xerrno.backtrace=true to enable
With the changes I mentioned above I get:
jruby 1.7.0.dev (ruby-1.9.3-p139) (2012-04-28 ba94a82) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_31) [darwin-x86_64-java] D, [2012-04-29T00:52:01.778000 #93951] DEBUG -- net.ssh.transport.session[7d0]: establishing connection to github.com:22 D, [2012-04-29T00:52:01.820000 #93951] DEBUG -- net.ssh.transport.session[7d0]: connection established I, [2012-04-29T00:52:01.823000 #93951] INFO -- net.ssh.transport.server_version[7d2]: negotiating protocol version D, [2012-04-29T00:52:02.362000 #93951] DEBUG -- net.ssh.transport.server_version[7d2]: remote is `SSH-2.0-OpenSSH_5.1p1 Debian-5github2' D, [2012-04-29T00:52:02.363000 #93951] DEBUG -- net.ssh.transport.server_version[7d2]: local is `SSH-2.0-Ruby/Net::SSH_2.3.0 java' D, [2012-04-29T00:52:02.713000 #93951] DEBUG -- io[7d4]: read 784 bytes D, [2012-04-29T00:52:02.720000 #93951] DEBUG -- io[7d4]: received packet nr 0 type 20 len 780 I, [2012-04-29T00:52:02.722000 #93951] INFO -- net.ssh.transport.algorithms[7d6]: got KEXINIT from server I, [2012-04-29T00:52:02.726000 #93951] INFO -- net.ssh.transport.algorithms[7d6]: sending KEXINIT D, [2012-04-29T00:52:02.730000 #93951] DEBUG -- io[7d4]: queueing packet nr 0 type 20 len 716 D, [2012-04-29T00:52:02.731000 #93951] DEBUG -- io[7d4]: sent 720 bytes I, [2012-04-29T00:52:02.732000 #93951] INFO -- net.ssh.transport.algorithms[7d6]: negotiating algorithms D, [2012-04-29T00:52:02.735000 #93951] DEBUG -- net.ssh.transport.algorithms[7d6]: negotiated: * kex: diffie-hellman-group-exchange-sha1 * host_key: ssh-rsa * encryption_server: aes128-cbc * encryption_client: aes128-cbc * hmac_client: hmac-sha1 * hmac_server: hmac-sha1 * compression_client: none * compression_server: none * language_client: * language_server: D, [2012-04-29T00:52:02.736000 #93951] DEBUG -- net.ssh.transport.algorithms[7d6]: exchanging keys D, [2012-04-29T00:52:03.213000 #93951] DEBUG -- io[7d4]: queueing packet nr 1 type 34 len 20 D, [2012-04-29T00:52:03.215000 #93951] DEBUG -- io[7d4]: sent 24 bytes D, [2012-04-29T00:52:03.487000 #93951] DEBUG -- io[7d4]: read 152 bytes D, [2012-04-29T00:52:03.490000 #93951] DEBUG -- io[7d4]: received packet nr 1 type 31 len 148 D, [2012-04-29T00:52:03.510000 #93951] DEBUG -- io[7d4]: queueing packet nr 2 type 32 len 140 D, [2012-04-29T00:52:03.511000 #93951] DEBUG -- io[7d4]: sent 144 bytes D, [2012-04-29T00:52:03.840000 #93951] DEBUG -- io[7d4]: read 720 bytes D, [2012-04-29T00:52:03.842000 #93951] DEBUG -- io[7d4]: received packet nr 2 type 33 len 700 D, [2012-04-29T00:52:03.855000 #93951] DEBUG -- io[7d4]: queueing packet nr 3 type 21 len 20 D, [2012-04-29T00:52:03.856000 #93951] DEBUG -- io[7d4]: sent 24 bytes D, [2012-04-29T00:52:03.859000 #93951] DEBUG -- io[7d4]: received packet nr 3 type 21 len 12
this pauses, then eventually times out:
EOFError: End of file reached
read_nonblock at org/jruby/RubyIO.java:2594
recv at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/proxy/command.rb:68
fill at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/buffered_io.rb:65
next_packet at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/transport/packet_stream.rb:85
poll_message at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/transport/session.rb:171
loop at org/jruby/RubyKernel.java:1353
poll_message at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/transport/session.rb:166
wait at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/transport/session.rb:203
loop at org/jruby/RubyKernel.java:1353
wait at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/transport/session.rb:201
initialize at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh/transport/session.rb:80
start at /Users/brett/.rvm/gems/jruby-1.6.5@ssh/gems/net-ssh-2.3.0/lib/net/ssh.rb:186
(root) at /Users/brett/RubymineProjects/jruby-block/test_net_ssh.rb:12
I've narrowed the problem with SSH down to this in net-ssh:
def available_for_read? result = Net::SSH::Compat.io_select([self], nil, nil, 0) result && result.first.any? end
When running under MRI, result will be nil at the point this blocks. JRuby returns true, as a result of this code in SelectBlob:
private void trySelectRead(ThreadContext context, Map<Character,Integer> attachment, RubyIO ioObj) throws IOException {
if (ioObj.getChannel() instanceof SelectableChannel && registerSelect(context, getSelector(context), attachment, ioObj, SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) {
selectedReads++;
if (ioObj.writeDataBuffered()) {
getPendingReads()[(Integer)attachment.get('r')] = true;
}
} else {
if ((ioObj.getOpenFile().getMode() & OpenFile.READABLE) != 0) {
getUnselectableReads()[(Integer)attachment.get('r')] = true;
}
}
}
This triggers a read that will block - though the next operation on the channel should be a write. I'm not sure what the right circumstances are to not add the unselectable read, though.
Still failing on master, probably due to read_nonblock not being wired up quite right for ssl/ssh. Fix for 1.7.
I improved this for pre2, so it at least doesn't always raise EAGAIN. There are other bugs for continued issues with nonblocking reads on SSL sockets, so I'll mark this particular issue resolved.
Do you have a reference to the other issues to follow? I still get the same result running the test case on 1.7.0.RC2, and I'm unable to build master at the moment to check if my other changes still apply/help.
This issue is marked as Fixed... but it still seems broken to me.
On 1.7.0, running the attached script:
D, [2013-02-18T04:03:14.713000 #67462] DEBUG -- net.ssh.transport.session[3f7a]: establishing connection to github.com:22
D, [2013-02-18T04:03:14.792000 #67462] DEBUG -- net.ssh.transport.session[3f7a]: connection established
I, [2013-02-18T04:03:14.793000 #67462] INFO -- net.ssh.transport.server_version[3f7c]: negotiating protocol version
D, [2013-02-18T04:03:15.294000 #67462] DEBUG -- net.ssh.transport.server_version[3f7c]: remote is `SSH-2.0-OpenSSH_5.5p1 Debian-6+squeeze1+github8'
D, [2013-02-18T04:03:15.294000 #67462] DEBUG -- net.ssh.transport.server_version[3f7c]: local is `SSH-2.0-Ruby/Net::SSH_2.6.5 java'
JRuby::EAGAINReadable: Resource temporarily unavailable - errno backtraces disabled; run with -Xerrno.backtrace=true to enable
And on 1.7.2:
D, [2013-02-18T04:04:52.161000 #67462] DEBUG -- net.ssh.transport.session[81c6]: establishing connection to github.com:22
D, [2013-02-18T04:04:52.205000 #67462] DEBUG -- net.ssh.transport.session[81c6]: connection established
I, [2013-02-18T04:04:52.208000 #67462] INFO -- net.ssh.transport.server_version[81c8]: negotiating protocol version
D, [2013-02-18T04:04:52.745000 #67462] DEBUG -- net.ssh.transport.server_version[81c8]: remote is `SSH-2.0-OpenSSH_5.5p1 Debian-6+squeeze1+github8'
D, [2013-02-18T04:04:52.745000 #67462] DEBUG -- net.ssh.transport.server_version[81c8]: local is `SSH-2.0-Ruby/Net::SSH_2.6.5 java'
JRuby::EAGAINReadable: Resource temporarily unavailable - errno backtraces disabled; run with -Xerrno.backtrace=true to enable
Maybe I'm doing something wrong... but this is manifesting in being unable to use JRuby with Capistrano due to my ssh config file using ProxyCommand's. Unfortunately, this is a total blocker for me right now, going to have to come up with another deployment solution or ditch jruby ![]()
I am also stuck due to this bug which I don't think is fixed yet. I have tried the attached test with jruby-1.7.1, jruby-1.7.2 with the same error.
The application I am working on already uses java libraries. So moving to MRI it out of question ![]()
Charles, could you please point us to a passing test case where EAGAIN is not raised?
Thanks
Since this issue is resolved and seems to have gotten lost... I've opened a new issue as a dup of this one: JRUBY-7110
MRI raises the identical error. I am resolving this as "Not a Bug".
$ jruby -Xerrno.backtrace=true -e 'IO.popen("date").read_nonblock(10)' Errno::EAGAIN: Resource temporarily unavailable - read_nonblock at org/jruby/RubyIO.java:2545 (root) at -e:1 $ ruby -e 'IO.popen("date").read_nonblock(10)' -e:1:in `read_nonblock': Resource temporarily unavailable (Errno::EAGAIN) from -e:1If you have extra information that I missed, feel free to reopen it.