Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.5.2
    • Fix Version/s: JRuby 1.7.0.pre2
    • Component/s: Core Classes/Modules
    • Labels:
      None
    • Environment:
      Mac OS X Snow Leopard running JRuby via RVM
    • Testcase included:
      yes
    • Number of attachments:
      0

      Description

      exec() doesn't actually replace the current process, as shown by this code:

      p $$
      exec('echo $$')

      This causes issues when working with things like PID files, since they can point to the wrong process.

      I don't see the advantage of having this not work where it can, since I can always run processes normally using system(), ``, IO.popen(), open3, etc.
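A minimal sketch of how the PID behaviour described above could be checked from the outside. It assumes a POSIX-like system; the helper name `pids_across_exec` is made up for illustration. A child interpreter prints its PID, then calls exec() on a second interpreter that prints its own PID. If exec() really replaces the process, the two numbers match.

```ruby
require 'rbconfig'

# Hypothetical helper: returns [pid_before_exec, pid_after_exec] for a
# child interpreter that prints its PID and then exec()s another one.
def pids_across_exec
  ruby = RbConfig.ruby
  # Flush before exec so the first PID is not lost in a buffer.
  script = 'puts Process.pid; $stdout.flush; ' \
           'exec(ARGV[0], "-e", "puts Process.pid")'
  out = IO.popen([ruby, '-e', script, ruby], &:read)
  out.split.map(&:to_i)
end

before, after = pids_across_exec
if before == after
  puts "exec replaced the process (PID #{before})"
else
  puts "exec started a new process (#{before} -> #{after})"
end
```

A PID file written before the exec() call only stays valid in the first case.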

        Activity

        Charles Oliver Nutter added a comment -

        We originally implemented exec to simply spawn and wait because of the destructive nature of a "real" exec call. JRuby is often deployed into servers where other apps are running, and even when it runs as a command-line server (like with the GlassFish or Trinidad gems), any exec call would wipe out the running server and all instances of JRuby in memory. There are probably cases where we could reasonably do the "real" exec, and we've talked about doing it for a long time (I'm personally a fan of the idea), but we need a good way to delimit where a "real" exec is safe and appropriate, so we don't blow up a multi-instance or multi-app JVM.

        This is also complicated by the fact that JRuby normally tries to keep in the same JVM any system/exec/backquote calls that start with "ruby". Because so many applications spawn new Ruby instances by shelling out with the "ruby" command, we special-case such executions. If we did not, "ruby" instead of "jruby" would launch, and those applications would pay a substantial startup cost for spinning up a whole new JVM.

        I personally feel like exec should "really" exec, but I'm very reluctant to do it across the board. Like "exit" which would take down the whole JVM if we made it a "real" exit, JRuby needs to behave in all typical deployment scenarios. This means not doing things that could destroy the host JVM if it's not ours to destroy. So perhaps this would work:

        • Iff the current JRuby instance was started at the command line (via org.jruby.Main.main), both exit and exec will do what they're "really" supposed to do.
        • In other cases, they'll continue to "pretend" to be their appropriate functions, which should isolate damage to that JRuby instance and not take down the entire JVM.
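The two-mode proposal above could be sketched roughly as follows. This is only an illustration in plain Ruby, not JRuby's implementation: `exec_or_emulate` and its `standalone:` flag are hypothetical names, and the flag stands in for "was this instance started via org.jruby.Main.main".

```ruby
require 'rbconfig'

# Hypothetical helper, not JRuby's actual code.
# standalone: true  -> the process is ours, so really replace it.
# standalone: false -> embedded; run the command as a child, wait for it,
#                      and hand back its exit status instead.
def exec_or_emulate(*cmd, standalone:)
  if standalone
    exec(*cmd) # does not return on success
  else
    pid = Process.spawn(*cmd)
    _, status = Process.wait2(pid)
    status.exitstatus
  end
end

# Emulated mode: the calling process survives and sees the child's status.
status = exec_or_emulate(RbConfig.ruby, '-e', 'exit 7', standalone: false)
puts "child exited with #{status}"
```

In the emulated branch the child gets a new PID, which is exactly the behaviour the report complains about; the sketch only shows where the split between the two modes would sit.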

        What do you think?

        Charles Oliver Nutter added a comment -

        I've mocked up a simple "exec" impl using FFI to call the real execv(3) libc function: http://gist.github.com/579505

        I believe this mostly duplicates the behavior of MRI's Kernel#exec, and it does what you want it to do.

        I'm still on the fence about making exec do this by default, but I'm leaning that way.

        Charles Oliver Nutter added a comment -

        Immediately after my comment, I realized that making exec do it the real way doesn't address all the libraries out there that spawn "ruby" using exec. They won't actually launch JRuby unless we continue to intercept those calls and force them to run "jruby" or start a JRuby instance in the same JVM. Any suggestions for solving that?

        James Edward Gray II added a comment -

        I trust you understand the complications of allowing this better than I do.

        I hear you saying things like we can't just wildly replace the JVM and I think, "but that's what exec() does." (I do realize that's a bit of an oversimplification, but hopefully you get the idea.)

        I think it's one of the main differences between Java thinking and Ruby thinking. Java has always believed that the right way is the way it can consistently work everywhere. Ruby just decides to trust you with the sharp scissors.

        I feel like if I exec() some code that nukes my server, well that's only a mistake I will need to make once. In that case, it sounds to me like exec() was the wrong tool for the job and I really wanted something like system().

        Even shelling out to a process named "ruby," feels like a bug to me. You can always get the proper name using rbconfig, do something like open("|- ...") (though I'm not sure if JRuby supports the latter), etc.

        I realize that's all my opinion though and, again, you are the expert here.

        James Tucker added a comment -

        I can understand the issue you have with calls out to "ruby", Charles, but this is a difficult one. It's caused me pain in the past too, where I actually /want/ to make a new process using the binary ruby (don't ask). It's obviously hard to balance user error vs. practicality, but I have to ask, how do you actually work around this behavior cleanly?

        The fact is, users should be using RbConfig::CONFIG['ruby_install_name'] and so on (or Gem.ruby). I'd also say the ruby spec would be MASSIVELY enhanced if these kinds of details were better exposed by all interpreters. Furthermore, in this specific case, Kernel.ruby(*args) would also be appropriate to save reconstruction from rbconfig members.
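As a sketch of the rbconfig approach mentioned above: the path of the running interpreter can be rebuilt from RbConfig::CONFIG rather than hard-coding "ruby". The helper names `current_ruby` and `run_ruby` are made up for this example; RbConfig.ruby and Gem.ruby give a similar path directly on interpreters that provide them.

```ruby
require 'rbconfig'

# Hypothetical helper: full path of the interpreter running this script,
# rebuilt from rbconfig members.
def current_ruby
  File.join(
    RbConfig::CONFIG['bindir'],
    RbConfig::CONFIG['ruby_install_name'] + RbConfig::CONFIG['EXEEXT'].to_s
  )
end

# Hypothetical helper: run a new interpreter of the same kind as a child
# process. Returns true when the child exits with status 0.
def run_ruby(*args)
  system(current_ruby, *args)
end

puts current_ruby
run_ruby('-e', 'puts "hello from a child interpreter"')
```

On JRuby the install name would normally be "jruby", so code written this way does not depend on the special-casing of "ruby" described earlier in the thread.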

        I'm also trying to move RubyGems toward more sanity in this area. I think this is important for the language as a whole, as these issues come up elsewhere too, for example on systems where people have two rubies installed, for example "/usr/local/bin/ruby1.9" and "/usr/local/bin/ruby1.8". These things have been a problem for a long time, and it's another of the things that's only been solved by symptom - rvm does that fantastically - but it's not appropriate for packagers.

        I've tried raising this before, maybe you have more clout with -core than I do, but IMO, this needs fixing there too, and in -spec.

        Wayne Meissner added a comment -

        Don't forget, on older versions of Mac OS X (10.5 and below, I think), exec from a multi-threaded process will fail. And since the JVM always has multiple threads, it will fail. posix_spawn(3) or Java's Runtime#exec are the only choices in that situation.

        Jesse Hathaway added a comment -

        I think moving to Charles Nutter's dual approach, as outlined in the first comment, would at least provide consistency for most of the people who need true exec functionality, e.g. when using jruby as a wrapper to execute a program and keep track of the pid. But I agree with James Gray that if you are using exec incorrectly then perhaps you should just be cut by the knife.

        Thomas E Enebo added a comment -

        We have started migrating all exec() calls to be real exec(). We started with Windows and still have a host of issues to work out. The code for doing it on Unix-boxen is much simpler (and mostly implemented in jnr-posix), but we need to do some refactoring to get things ironed out. We would like to have this for 1.7.0.

        Charles Oliver Nutter added a comment -

        I believe it was 1.7pre2 that started doing true exec on all platforms.


          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            James Edward Gray II
          • Votes:
            0
            Watchers:
            3
