Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: JRuby 1.6.2, JRuby 1.7.0.pre1
    • Fix Version/s: JRuby 1.6.4
    • Component/s: Interpreter
    • Labels:
      None
    • Environment:
      Linux host 2.6.32-5-openvz-amd64 #1 SMP Wed May 18 23:53:57 UTC 2011 x86_64 GNU/Linux
      Debian squeeze
      OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
      OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
    • Number of attachments:
      1

      Description

      Running programs with Process.spawn and then subsequently waiting for them with Process.waitpid garbles $?.exitstatus.

      The attached demo program shows that the exit status appears to be random. When using "sleep", as in the demo program, it should always be 0.

      Also, I'm seeing ECHILD errors when the subprocess finishes before waitpid is called, probably because the child is reaped too early/automatically?

      1. test_waitpid.rb
        0.9 kB
        Christian Hofstädtler

        Activity

        Charles Oliver Nutter added a comment -

        I believe this is fixed...running your script against both JRuby master and jruby-1_6 branches produces matching pids. Perhaps you can confirm it with a snapshot from http://ci.jruby.org/snapshots please?

        Christian Hofstädtler added a comment -

        The test script prints the /return code/ of the program, not its pid. This should always print zero, but it still prints various values instead.

        I've tested with self-built binaries from:
        jruby-1_6 HEAD c9444d41a392e89c269ec6e6b1b4837d8a3aa434
        and:
        master HEAD ac992dfaa803bc18c7dae0d02c14055b1888b281

        Charles Oliver Nutter added a comment -

        Ahh, thanks for clarifying...I obviously wasn't paying attention.

        Charles Oliver Nutter added a comment -

        Ok, had a closer look at your script.

        There are a number of things that aren't going to work right here:

        • JRuby's waitpid does not support flags like WNOHANG. Not sure if it's important in this case or not.
        • Our spawn impl does not currently handle the options you're providing it; they may also be tricky to support since the JVM does not provide a means of launching processes with specific in/out streams until Java 7.

        These features could possibly be added, but that's not what this bug is about.

        We should be able to give the proper status, and I'm looking into why that's not the case right now.

        Charles Oliver Nutter added a comment -

        Apologies...it does look like our waitpid supports flags. Still investigating why the status is garbled.

        Charles Oliver Nutter added a comment -

        Well here's what I have so far, since I have to stop for the night:

        • We call waitpid as normal via the native function. Status is acquired by passing in int[].
        • The resulting int[0] we then >> 8 and & 0xFF. The 0xFF is obviously to get the low-order 8 bits, but the shift confuses me; there seems to be no reason for it, since the WEXITSTATUS macro for the process status is supposed to simply mask the lower 8 bits.

        If I remove the masking, the status is 160 (10100000), which still is not right. I have to wonder if our waitpid call is working properly at all now.
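
        For reference, here is a minimal sketch of how a raw wait status is conventionally decoded on Linux. It assumes the usual glibc layout, where the low 7 bits hold the terminating signal and bits 8-15 hold the exit code (which would explain the >> 8). The layout is platform-specific, and decode_wait_status is a hypothetical helper written for illustration, not JRuby code.

```ruby
# Hypothetical helper, for illustration only: decode a raw wait status
# using the conventional Linux/glibc bit layout (an assumption; other
# platforms may lay the status word out differently).
def decode_wait_status(raw)
  low7 = raw & 0x7F
  if low7.zero?
    # Normal exit: the exit code sits in bits 8-15.
    { state: :exited, exitstatus: (raw >> 8) & 0xFF }
  elsif low7 == 0x7F
    # Stopped/continued encodings are not decoded further in this sketch.
    { state: :stopped_or_continued }
  else
    # Killed by a signal: low 7 bits are the signal, bit 7 is the core flag.
    { state: :signaled, termsig: low7, coredump: (raw & 0x80) != 0 }
  end
end

p decode_wait_status(0)
p decode_wait_status(3 << 8)
p decode_wait_status(160)
```

        Under that layout, a raw value of 160 does not decode as a normal exit at all, which fits the suspicion that the value itself is bogus rather than merely mis-masked.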

        Charles Oliver Nutter added a comment -

        More info...

        This may be a problem with the race condition caused by going through Java's process-launching logic. Java launches the child process and spins up a separate thread to wait for it to complete. As a result of that, our call to waitpid can often be the second call, resulting in it returning bogus results. This also sometimes causes ECHILD errors to be raised, since the JDK's waitpid has been triggered before our waitpid, and wins the day.

        When our waitpid fires first, we do get the proper exit status, as in the third run here:

        ~/projects/jruby ➔ jruby --1.9 -e 'pid = Process.spawn("sleep 5"); Process.waitpid pid; p $?'
        Errno::ECHILD: No child processes - No child processes
          waitpid at org/jruby/RubyProcess.java:505
          waitpid at org/jruby/RubyProcess.java:490
           (root) at -e:1
        
        ~/projects/jruby ➔ jruby --1.9 -e 'pid = Process.spawn("sleep 5"); Process.waitpid pid; p $?'
        Errno::ECHILD: No child processes - No child processes
          waitpid at org/jruby/RubyProcess.java:505
          waitpid at org/jruby/RubyProcess.java:490
           (root) at -e:1
        
        ~/projects/jruby ➔ jruby --1.9 -e 'pid = Process.spawn("sleep 5"); Process.waitpid pid; p $?'
        #<Process::Status: pid=99495,exited(0)>
        

        This is another case where we need to simply swap out the JDK calls for real native calls and do spawn "right". It will take more work than simply fixing exit status.
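
        Until then, a script can at least guard against the race. The following is a rough sketch, assuming the caller can tolerate an unknown exit status when something else has already reaped the child; spawn_and_wait is a hypothetical helper name, not part of JRuby or the attached test script.

```ruby
require "rbconfig"

# Hypothetical helper: spawn a command, wait for it, and return its exit
# status. Returns nil if the child was already reaped by someone else
# (the ECHILD case described above), since the status is then unknown.
def spawn_and_wait(*cmd)
  pid = Process.spawn(*cmd)
  _, status = Process.waitpid2(pid)
  status.exitstatus
rescue Errno::ECHILD
  nil
end

# Use the running Ruby as the child so the example does not depend on
# any particular external program being installed.
code = spawn_and_wait(RbConfig.ruby, "-e", "exit 3")
puts(code.nil? ? "child already reaped, status unknown" : "exit status: #{code}")
```

        This only hides the ECHILD symptom; it does not recover the lost status, so it is no substitute for doing the spawn natively.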


          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            Christian Hofstädtler
          • Votes:
            1
            Watchers:
            2
