groovy / GROOVY-1642

Script needs to be able to retrieve the full path of "its" file

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: 2.3.0-beta-1
    • Component/s: None
    • Labels: None
    • Number of attachments: 0

      Description

      Problem previously described on the mailing list: http://www.nabble.com/Script%3A-get-file-path-name-of-executing-script-tf2950940.html#a8253010

      A common need when scripts operate on the file system is to get the name of the file containing the script being executed. The Script class doesn't currently provide this information, but it seems possible to retrieve it with:

      URL scriptUrl = getClass().classLoader.resourceLoader.loadGroovySource(getClass().name)

      According to Jochen this may work... but not all the time.

      What we need is a safe way (as a property of Script) to access this information, like Ruby's and Perl's __FILE__ constant.

          Activity

          Marc Guillemot added a comment -

          A related improvement would be to add something like URL Class.getLocation(), which would reveal where a Class has been loaded from, much as Ant's <whichResource class="..."/> task does.

          Marc Guillemot added a comment - - edited

          As David Budworth and Gerrit Geens pointed out on the mailing list, it would also be useful to have access to the URL (or File) of the initially started script. This is not necessarily the same as the current script, because a script A can execute a script B.

          Until Groovy provides this information directly, the groovy.(sh|bat) starter script can be modified to expose it as a system property:

          For unix boxes just change $GROOVY_HOME/bin/groovy (the sh script) to do

          export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"
          

          before calling startGroovy

          For Windows:
          In startGroovy.bat add the following 2 lines right after the line with
          the :init label (just before the parameter slurping starts):

          @rem get name of script to launch with full path
          set GROOVY_SCRIPT_NAME=%~f1
          

          A bit further down in the batch file, after the line that says
          set JAVA_OPTS=%JAVA_OPTS% -Dgroovy.starter.conf="%STARTER_CONF%"
          add the line

          set JAVA_OPTS=%JAVA_OPTS% -Dscript.name="%GROOVY_SCRIPT_NAME%" 
          
          Paul King added a comment -

          Added in the script.name enhancement mentioned above. This doesn't resolve this issue but is useful information nonetheless (and definitely useful for the time being).

          Paul King added a comment -

          Should have mentioned, tested it out on ubuntu and winxp using 1.0.1 snapshot and java 1.5.08

          Daniel Serodio added a comment -

          This partial fix broke the shell script on Cygwin: http://jira.codehaus.org/browse/GROOVY-2375

          Russel Winder added a comment -

          Not just Cygwin: the code added to the groovy script, which was not added to any of the other scripts (any reason why not?), breaks on any system where there is a space in the path to the Groovy home. I am not sure there is any viable quoting that makes this work.

          Daniel Serodio added a comment -

          What is script.name supposed to point to ? The .groovy script of the "groovy" shell script ?

          Daniel Serodio added a comment -

          Oops, I meant: The .groovy script OR the "groovy" shell script?

          Russel Winder added a comment -

          I believe that the rearrangement prompted by a comment from Daniel Serodio on GROOVY-2375 has fixed this in an appropriate manner. Could people test it out and, if it seems to work, close the issue? Thanks.

          Luke Daley added a comment -

          This doesn't solve the actual issue here. 'script.name' points to the groovy 'executable'; this issue is about getting the path of the groovy script, like Perl's/Ruby's __FILE__ variable. This isn't resolved.

          Graeme Rocher added a comment -

          Reopened and scheduled, as Luke has a valid point.

          Jim White added a comment -

          The trouble with File is that I don't think all Script objects will have them (particularly when compiled to .class and stored in a JAR file). The only thing you can count on is a URI (as it is part of the classpath regime).

          With a URI you can easily get a File (if there is one) in a platform-independent fashion, and from there its parent directory or whatever you're after.

          So what we should like to say is something like:

          File scriptFile = new File(script.getURI())

          Whether that should be called "toURI" for consistency with java.io.File I don't know. Obviously we want to be able to refer to it property-style in Groovy.

          Now if you want to have a getScriptFile() convenience method on Script to do that, I think that's fine.

          There is some need to be careful on Windows because there is a history of broken URLs there. But I think the modern URI stuff (JDK 1.4) is supposed to work properly. I know I've used that stuff, but not recently and I don't have a Windows box handy here on the road (actually there is one with me but it is stowed).

          Some notes on the trouble with File.toURL (and why we use File.toURI):

          http://weblogs.java.net/blog/kohsuke/archive/2007/04/how_to_convert.html

          http://www.jroller.com/santhosh/entry/converting_file_to_url

          As the initial comments discuss, this really is related to classloading, and I have stuff in that area in GroovyForOpenOffice and also some work on Ivy integration but I can't dig into those details right now.
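
To illustrate the File.toURI point made above, here is a minimal Java sketch. The class name and the path are made up for illustration, and it assumes nothing about Groovy itself. It shows that File.toURI() percent-encodes a space in the path and that the resulting URI converts back to a File. By contrast, the JDK documentation notes that File.toURL() does not escape characters that are illegal in URLs.

```java
import java.io.File;
import java.net.URI;

public class FileUriRoundTrip {
    /** Converts a File to a URI and back again. */
    static File roundTrip(File file) {
        URI uri = file.toURI();   // absolute, hierarchical file: URI
        return new File(uri);
    }

    public static void main(String[] args) {
        // Hypothetical path containing a space; the file does not need to exist.
        File script = new File("my scripts", "hello.groovy");
        System.out.println(script.toURI().toASCIIString()); // the space appears as %20
        System.out.println(roundTrip(script));               // absolute form of the same path
    }
}
```

The round trip yields the absolute form of the original path, which is the behaviour the File(URI) constructor documents.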

          blackdrag blackdrag added a comment -

          I think this issue is a bit messed up right now. In the beginning, the request was for the initial file. The property solution seems perfect for that to me, except that I would set the property through GroovyMain rather than in the shell script.

          Now you are asking for something like __FILE__, which means the source location for every script. And while I have no idea yet how to add this, I would like to know what you need this for. In other words, I need examples for the usage, to see where it makes sense, and if other things are missing too.

          If the first is a solution, then I can fix it easily. If the second is wanted, then it should be a new issue.

          Darren Davison added a comment -

          I'd at least like to see the ability for a script to figure out its own fully qualified path. Mostly to give it the ability to pick up resources from relative locations. If I deliver these scripts to clients, I can't ask them to modify their groovy installations so the workaround above (changing $GROOVY_HOME/bin/groovy) is out for me. I know I can do this in bash (or another shell) and wrap the call to the groovy script in it, but then I just have twice as many scripts as I actually need or want, so it would be nice to have this capability.

          Guillaume Laforge added a comment -

          Do you have an idea how to implement that?

          blackdrag blackdrag added a comment -

          Darren, the script.name property does not work for you?

          Darren Davison added a comment -

          no, it prints the name of the groovy executable.

          darren@hepburn ~ $ pwd
          /home/darren
          darren@hepburn ~ $ cat test.groovy
          #!/usr/bin/env groovy
          println System.getProperty("script.name")

          darren@hepburn ~ $ echo $GROOVY_HOME
          /opt/groovy
          darren@hepburn ~ $ ./test.groovy
          /opt/groovy/bin/groovy

          Darren Davison added a comment -

          as Marc G. mentioned at the top, adding:

          export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"

          in $GROOVY_HOME/bin/groovy would work, and I can of course do this locally, but some of these scripts are used by clients and I obviously need consistent behaviour without asking them to modify their runtimes. I think GROOVY-2648 is the same issue (I only just came across this one).

          Jon Cox added a comment -

          I finally found a reliable way to get the directory containing the groovy script itself:

          new File(getClass().protectionDomain.codeSource.location.path).parent

          Hideous, but it does work.

          If instead you wanted the full path to the groovy script including the script name itself,
          you could say:

          getClass().protectionDomain.codeSource.location.path

          Having a script know its own full name and path SHOULD be made easier by Groovy
          (as easy as knowing the user's current working directory).

          blackdrag blackdrag added a comment -

          This location is a URL and could point to anything. There is no guarantee that it will really point to the script file. Also, there might be no script file (for example, for precompiled classes), or the location could be an HTTP address.

          How should we react in these cases?

          Darren Davison added a comment -

          > How should we react in these cases?

          For my purposes, I know that any script I write which uses a "#!/usr/bin/env groovy" invocation and is already interacting with the local file system is never going to be compiled, JAR'd or delivered over HTTP without major breakage anyway.

          So as long as the mechanism for a script to figure out its own location works when it runs as a script, that would do for me.

          @Jon : nice find (but you're right, it's hideous!)

          blackdrag blackdrag added a comment -

          That was not exactly an answer I can work with... Should there be an exception, a null value, a constant value...?

          Darren Davison added a comment -

          well doesn't Marc's original suggestion of simply amending the shell script make this whole issue go away anyhow? Or is that not a complete answer... I don't know. It would work for me.

          <quote author="Marc G">
          for unix boxes just change $GROOVY_HOME/bin/groovy (the sh script) to do
          export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"
          before calling startGroovy
          </quote>

          It would also resolve GROOVY-2648. Of course, it would be a breaking change for anyone using it to rely on current behaviour, but that's unlikely. Especially with the windows/'nix inconsistency in this respect.

          In answer to what Jon's code should return in cases where it makes no sense, null would be easier to handle than an exception in the script (no ugly try/catch needed). But I guess anything is fine so long as it's documented.
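
The "return null when it makes no sense" option discussed above could be sketched roughly as follows in plain Java. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not part of Groovy. Note also that for an ordinary Java class the code source is the classpath directory or JAR, whereas Jon reports above that for a Groovy script run from source it is the script file itself.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;
import java.security.CodeSource;

public class CodeSourceLocation {
    /**
     * Hypothetical helper: returns the local file or directory a class was
     * loaded from, or null when that cannot be expressed as a local File.
     */
    static File locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null; // e.g. classes loaded by the bootstrap class loader
        }
        try {
            return toLocalFile(source.getLocation().toURI());
        } catch (URISyntaxException e) {
            return null; // the location URL is not a well-formed URI
        }
    }

    /** Returns a File for file: URIs and null for anything else (http, jar, ...). */
    static File toLocalFile(URI location) {
        if (!"file".equalsIgnoreCase(location.getScheme())) {
            return null;
        }
        try {
            return new File(location);
        } catch (IllegalArgumentException e) {
            return null; // e.g. a file: URI with an authority component
        }
    }

    public static void main(String[] args) {
        System.out.println(locate(CodeSourceLocation.class)); // classpath root or JAR
        System.out.println(locate(String.class));             // no code source
    }
}
```

Returning null keeps calling scripts free of try/catch, at the cost of a null check; throwing or returning a constant would be equally workable as long as the behaviour is documented.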

          Jon Cox added a comment -

          Wow, thanks for the fast feedback!

          My intention was to share a trick for those who need something that works in a fully backward-compatible
          way in the short term (like me!).

          I agree that the right thing to do is make Groovy work as compatibly as possible across platforms,
          so if there's a "script.name" that works on Windows, it should work the same way on Linux too.
          The fact that script.name is different between Windows and Linux is a bug on Linux, pure and simple.
          It should be fixed ASAP.

          Coming up with a new name for this value on Linux to maximize "backward bug compatibility" is
          probably misguided because the "script.name" strongly implies it will be the script's name,
          not the script engine's name (i.e.: groovy). Hence my vote: fix the bug and minimize confusion
          and cruft going forward at the minimal expense of messing up a few bug-dependent scripts today.
          If people are super worried, then you could make the installer issue a warning message describing
          the problem at install time. At that point, people can just grep their scripts for "script.name".
          Personally, I think that's overkill, and might create far more pain than it prevents. Currently, script.name
          is pretty much useless on Linux, so how many people actually use it? Not many, I'd guess.

          As for Marc G's suggestion, I think it's basically the right one, except that it doesn't quite work
          as stated for three reasons:

          [1] You need to pass along $1 not $0 (I'm assuming that was just a typo)
          [2] On Debian at least, whatever value you set for script.name in /usr/bin/groovy gets clobbered on line 18 of /usr/bin/startGroovy
          [3] If you want to canonicalize the script name's path (which you should), then you need to use readlink

          Thus, the best you can do right now (until the fine folks in Groovy land fix the problem) is to modify /usr/bin/startGroovy
          and where line 18 once said SCRIPT_PATH="$0", it should say:

          # Canonicalize the script path only when readlink is available
          readLink=`which readlink`
          if [ -z "$readLink" ] ; then
              SCRIPT_PATH="$1"
          else
              SCRIPT_PATH=`readlink -m "$1"`
          fi

          Why the fancy dance around readlink?
          Sadly, Solaris 10 does not ship with it, so in that case the best you can hope for is to limp along w/o canonicalization.

          All of this is just a stopgap though, because Groovy should always try to do the same thing
          (i.e.: give you a canonicalized path) on all platforms. The lack of readlink on Solaris 10
          means that the ideal solution would probably be to push this logic into Groovy itself, so long
          as that won't mess up folks on Cygwin. If people don't care about Solaris 10 much and/or
          think those poor souls should just go out and install readlink one fine day, then the startGroovy
          script looks like a winner (unless there are meaningful cases where groovy scripts are not
          started by startGroovy.... then we're back to pushing the logic into Groovy itself).

          For now, the straightforward /usr/bin/startGroovy solution is probably the best one.
          What do you all think?

          Cheers,
          -Jon

          Matthew Corby-Eaglen added a comment -

          What's the status of this now?

          Phil Walker added a comment -

          This is an important issue for me, as well. My perspective is identical to Darren's, due to my use case:
          I write scripts that are intended to work identically on Linux, OS-X, solaris, and cygwin.
          Putting the fix in the script wrappers ($GROOVY_HOME/bin/groovy, or startGroovy) is less than ideal, due to
          the readlink issue mentioned above.

          Below is my suggested fix, in the form of a subversion diff (relative to subversion revision 19700).
          It works on Cygwin, Linux and OS X, and should work for scripts on any OS. With this patch, a script would access
          its name as shown in this example script:

          #!/usr/bin/env groovy
          printf("this script is named [%s]\n", System.getProperty("script.file.name"))

          Here's the patch:

          Index: src/main/groovy/ui/GroovyMain.java
          ===================================================================
          --- src/main/groovy/ui/GroovyMain.java (revision 19700)
          +++ src/main/groovy/ui/GroovyMain.java (working copy)
          @@ -371,6 +371,14 @@
                   if (!scriptFile.exists()) {
                       scriptFile = new File(scriptFileName);
                   }
          +        String canonicalName = scriptFileName;
          +        try {
          +            canonicalName = scriptFile.getCanonicalPath();
          +        } catch(IOException io) {
          +            System.setProperty("script.file.exception",io.getMessage());
          +        }
          +        System.setProperty("script.file.name",canonicalName);
          +
                   return scriptFile;
               }

          Phil

          Guillaume Laforge added a comment -

          Indeed, a good approach that is sure to work.
          Perhaps we could even add a variable to the scripts' binding, such as scriptFileName?

          Jim White added a comment - - edited

          Please note my earlier comment that a file path string is not a fully general solution, because a script may be in a JAR file. The "name" needs to be a URI to handle that case, and a URI is probably the right thing for paths that don't use the file system at all (scripts from strings or http or what-have-you). Even though this change will work for GroovyMain, it is important that the API be consistent for any kind of script (which is why this long-standing issue remains open).

          This really should have been resolved as part of the reworking of the script/file/classloader stuff that was done not so long ago so that all the right meta level logic is properly addressed.

          Also I'm skeptical of calling getCanonicalPath. It is a funky function and isn't necessary if you have the File object. And it could break code that is looking at the file path to do package hierarchy stuff.

          I support this change if modified to do this:

          URI scriptURI = scriptFile.toURI();
          System.setProperty("script.uri", scriptURI.toASCIIString());

          File.toURI doesn't throw an exception that needs to be caught, and it uses the absolute path rather than the canonical path, which is far more likely to be what is intended.

          I haven't looked at the rest of the function being modified, but why is this only done when the script file doesn't exist?
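
The absolute-versus-canonical distinction mentioned above can be seen in a small Java sketch. The class name and the relative path are hypothetical, chosen only for illustration. File.toURI() resolves the path against the current directory but leaves ".." segments alone, while getCanonicalPath() typically removes them and resolves symbolic links, and can throw IOException.

```java
import java.io.File;
import java.io.IOException;

public class AbsoluteVsCanonical {
    /** Round-trips a File through File.toURI(), which makes the path absolute. */
    static File viaUri(File file) {
        return new File(file.toURI());
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical relative path; the file does not need to exist.
        File script = new File("scripts/../hello.groovy");
        System.out.println(script.toURI());            // absolute; ".." is left in place
        System.out.println(viaUri(script).getPath());  // same absolute path, as a File
        System.out.println(script.getCanonicalPath()); // ".." (and any symlinks) resolved
    }
}
```

Code that inspects the path as the user typed it, for example to infer a package hierarchy, sees a different string in the canonical case, which is the breakage risk raised above.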

          Phil Walker added a comment -

          The "!scriptFile.exists()" code is merely there for context (it's already present in GroovyMain).

          // if we still haven't found the file, point back to the originally specified filename
          if (!scriptFile.exists()) {
              scriptFile = new File(scriptFileName);
          }

          I agree that the URI is a more general solution, and tested it to verify that it works in the
          script file context:

          URI scriptURI = scriptFile.toURI();
          System.setProperty("script.uri",scriptURI.toASCIIString());

          To convert the property string back to a File in a script context, you would do something like this:
          scriptUri = System.getProperty("script.uri")
          File scriptFile = new File(scriptUri.substring("file:/".length()))

          It's a bit odd looking, but not too bad (maybe someone can suggest a more succinct usage?)

          I suspect that GroovyMain is not the only place (or perhaps not the best place) to initialize the property.
          I'll investigate to see if I can find a more general location.

          Phil

          Phil Walker added a comment -

          A more readable way to consume the "script.uri" property:

          File scriptFile = new File(new URI(System.getProperty("script.uri")))
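
As a rough check of that round trip, the Java sketch below converts a file to its URI string and back. The class `ScriptUriRoundTrip`, the helper `fromProperty`, and the file name are made up for the example. One reason to prefer the `File(URI)` constructor over trimming the "file:/" prefix by hand is that it also decodes escapes such as `%20`.

```java
import java.io.File;
import java.net.URI;

public class ScriptUriRoundTrip {

    // Hypothetical helper: turns the text stored in "script.uri" back into a File.
    // File(URI) throws IllegalArgumentException if the URI is not a "file" URI.
    static File fromProperty(String uriText) {
        return new File(URI.create(uriText));
    }

    public static void main(String[] args) {
        // A name containing a space, which the URI form escapes as %20.
        File original = new File("my script.groovy").getAbsoluteFile();
        String uriText = original.toURI().toASCIIString();
        File restored = fromProperty(uriText);
        System.out.println(uriText);
        System.out.println("round trip ok: " + restored.equals(original));
    }
}
```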

          Phil

          Phil Walker added a comment - - edited

          After further investigation, there appear to be 5 use cases:

          1. script source is a File
          2. script source is a URL
          3. script source is a String
          4. script source is a Reader
          5. script source is an InputStream

          For each of these use cases, there is a corresponding constructor in groovy.lang.GroovyCodeSource.

          In the case of a String, Reader, or InputStream, there is no URI, although the constructors do have a "name" parameter.

          For the 1st 2 cases, we could add appropriate code to the File and URL constructors in groovy.lang.GroovyCodeSource. However, there are cases in which execution of a script would involve multiple GroovyCodeSource constructors, each overwriting the property value.

          To correctly support the script file use case, the purpose of which is to match the capabilities of perl/python/bash scripts, a script must be able to determine where it resides in the filesystem. In this context, a script is always sourced from a file, and it would represent broken behavior if a class loaded from a URL were to overwrite the "script.file" property. To prevent the most recent GroovyCodeSource constructor from overwriting this value, it should be initialized in GroovyMain.

          My suggestion, therefore, is that we create 2 properties:

          "script.file" (always the absolute name of the script file, or null)

          "script.uri" (the URI of the most recently constructed GroovyCodeSource)

          Thoughts?
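
One way to picture the two proposed properties is the Java sketch below. It is not taken from the Groovy sources: `ScriptProperties`, `recordLaunchedScript`, and `recordCodeSource` are hypothetical names, and the "set only once" guard is an assumption about how the launcher could protect "script.file" from later overwrites.

```java
import java.io.File;
import java.net.URI;

public class ScriptProperties {

    // Assumed to be called once by the launcher (GroovyMain in the proposal).
    // The guard keeps the first value, so later code sources cannot replace it.
    static void recordLaunchedScript(File scriptFile) {
        if (System.getProperty("script.file") == null) {
            System.setProperty("script.file", scriptFile.getAbsolutePath());
        }
    }

    // Assumed to be called for every code source that has a URI.
    // The most recent caller wins, so this value changes during a run.
    static void recordCodeSource(URI source) {
        System.setProperty("script.uri", source.toASCIIString());
    }

    public static void main(String[] args) {
        recordLaunchedScript(new File("a.groovy"));
        recordCodeSource(URI.create("file:/tmp/first.groovy"));
        recordCodeSource(URI.create("http://example.com/second.groovy"));
        System.out.println(System.getProperty("script.file"));
        System.out.println(System.getProperty("script.uri"));
    }
}
```

The sketch also shows the weak spot of a process-wide property: "script.uri" reflects whichever source was loaded last, not the script that is currently running.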

          Phil

          blackdrag blackdrag added a comment -

          If a script gets "its" location, how does that work with multiple scripts in perl/python? Do you get only the location of the start script, or does it change depending on which script you are currently in?

          As for the most recently constructed GroovyCodeSource: does that make sense? When I execute an inlined script, it will change. script.uri may then even change all the time.

          Phil Walker added a comment - - edited

          After reviewing recent comments added to this discussion, I think I like Guillaume's suggestion to add a binding to script classes (this.scriptFileName), which is unambiguous as to which file you're interested in.

          A perl or bash self reference usually refers to the script in which it occurs. If a script uses a package (another perl script), any references in the other script will still refer back to the original script that launched the process. Similarly, if a bash script "sources" another script, the other script is treated as if it is contained within the calling script, so all self references will resolve to the originally called script.

          Here is a perl script that prints its filename:

          #!/usr/bin/perl
          printf("this script file is named [%s]\n",$0)
          

          Here's the equivalent bash script:

          #!/bin/bash
          echo "this script file is named [$0]"
          

          There are other useful scripting techniques that involve self-reference. For example, in perl, you can append data to the end of a script file by preceding it with a single line that starts with __DATA__, which is quite difficult to simulate in groovy (impossible unless you can derive the filename).

          This script calculates the average of numbers in the __DATA__ section:

          #!/usr/bin/perl -w
          
          chomp(my @lines = <DATA>);
          my $total = 0.0;
          foreach my $line ( @lines ){
              $line =~ s/^\s+|\s+$//g;
              $total += int($line);
          }
          printf "average: %1.4f\n",$total / scalar @lines;
          
          __DATA__
          252
          252
          252
          252
          248
          252
          252
          
          Jim White added a comment -

          @Jochen - I agree, that is a serious problem with using a system property and I do not care for it much. But if we're not going to fix this more completely, then setting such a system property in GroovyMain for typical script file invocations is worthwhile.

          But I think the right approach is adding a property to Script:

          URI getScriptURI()

          'scriptURI' seems the right thing because 'URI' would conflict with 'java.net.URI'. And 'toURI' seems odd.

          And a convenience method that uses that:

          File getScriptFile() { new File(getScriptURI()) }

          The reason for using a URI rather than "file name" is that we need to handle in some fashion non-file sources. Some protocols support literal data, which we could use for scripts with string sources if desired.
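
The shape of that API can be sketched in plain Java as follows. `LocatedScript` is a hypothetical stand-in for groovy.lang.Script, and `at` is an illustrative factory. Returning null for non-file sources is an assumption of this sketch; the one-liner above would instead let `new File(URI)` throw IllegalArgumentException.

```java
import java.io.File;
import java.net.URI;

public class ScriptLocationSketch {

    // Hypothetical stand-in for groovy.lang.Script with the proposed accessors.
    abstract static class LocatedScript {
        abstract URI getScriptURI();

        // Convenience accessor: only "file" URIs map to a File.
        // For other schemes (http, jar, ...) this sketch returns null.
        File getScriptFile() {
            URI uri = getScriptURI();
            if (uri == null || uri.isOpaque() || !"file".equalsIgnoreCase(uri.getScheme())) {
                return null;
            }
            return new File(uri);
        }
    }

    // Illustrative factory that fixes the URI a script reports.
    static LocatedScript at(final URI uri) {
        return new LocatedScript() {
            @Override
            URI getScriptURI() {
                return uri;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(at(new File("hello.groovy").toURI()).getScriptFile());
        System.out.println(at(URI.create("jar:file:/tmp/app.jar!/hello.groovy")).getScriptFile());
    }
}
```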

          Phil Walker added a comment - - edited

          In order to do this as a URI, it will require modification in at least 2 places, possibly the GroovyCodeSource constructors for File and URL sources. I assume (but have not confirmed) that a jarfile URL would provide the full name of the source jar.

          However, it's still not clear to me whether that is sufficient, since a jar file can also be read from a Reader or an InputStream.

          I'll try to do some experiments when I get a chance.

          Phil

          Phil Walker added a comment - - edited

          After extensive investigation, here's what I concluded.

          It would be fairly easy to add a java.net.URL (e.g., named "scriptURL") to the script binding, although there are a number of different locations in the code that would potentially be affected, due to the variety of ways a script can be executed.

          To add "scriptURL" to the binding, the following line of code would be added in various locations in 2 source files.

          The new line of code:

          script.setProperty("scriptURL",scriptClass.getProtectionDomain().getCodeSource().getLocation());
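
For readers unfamiliar with that call chain, the Java sketch below shows the same lookup on ordinary classes. `ClassLocation` and `locationOf` are hypothetical names. The null check is there because a class may have no CodeSource at all, which is typically the case for JDK bootstrap classes such as String, so the chained call on its own could throw NullPointerException.

```java
import java.net.URL;
import java.security.CodeSource;

public class ClassLocation {

    // Hypothetical helper: returns where a class was loaded from, or null
    // when the class has no CodeSource or the CodeSource has no location.
    static URL locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) {
        // Usually a directory or jar URL for application classes.
        System.out.println("this class: " + locationOf(ClassLocation.class));
        // Bootstrap classes normally report no code source.
        System.out.println("String:     " + locationOf(String.class));
    }
}
```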
          

          The insertion points in GroovyShell:

          groovy.lang.GroovyShell.runScriptOrMainOrTestOrRunnable()/*line:264*/
          groovy.lang.GroovyShell.evaluate(GroovyCodeSource)/*line:577*/
          

          Another interesting use case requires a similar line of code to be added to

          org.codehaus.groovy.runtime.InvokerHelper.evaluate()/*line:386*/

          I would have preferred to provide a universal implementation that would provide for all script execution paths from a single location in the code.
          Most (possibly all) use cases go through GroovyClassLoader.parseClass(GroovyCodeSource,boolean) to get a script class file.

          1. to add the source URL to a script binding requires simultaneous access to 2 items:
          A. the sourceURL
          B. the script binding

          2. the sourceURL can be derived in GroovyClassLoader from either of two objects:
          A. find the URL key in sourceCache matching the generated script Class object
          B. GroovyCodeSource parameter in GroovyClassLoader.parseClass():
          // public Class parseClass(GroovyCodeSource codeSource, boolean shouldCacheSource)(line 429)

                URL url = codeSource.getCodeSource().getLocation();
          

          3. the script binding is generally available in GroovyShell, in various locations:
          GroovyShell.runScriptOrMainOrTestOrRunnable(Class,...)/*line:248*/

          4. InvokerHelper.runScript(Class...)/*line:383*/

          Here's a unit test class for the changes listed above:

          package groovy
          
          class NameTest extends GroovyTestCase {
          
              void testScriptURL() {
                  def program =  ""
                  def binding  = new Binding()
                  ( new GroovyShell ( binding ) ).evaluate ( program )
                  // verify type of new 'scriptURL' variable
                  assert binding.scriptURL.getClass().name == 'java.net.URL'
                  // verify value of 'scriptURL' for this context
                  assert binding.scriptURL.file == '/groovy/shell'
              }
              void testCalledScriptURL() {
                  // TODO: write a temporary groovy source file, so this test is standalone.
                  File aFile = new File("bin/scriptUrl.gr")
          
                  if( aFile.exists() ) {
                      def aScript = new GroovyClassLoader().parseClass(aFile).newInstance();
                      assert aScript != null
          
                      aScript.out = System.out
                      def buf = new ByteArrayOutputStream()
                      def newOut = new PrintStream(buf)
                      def saveOut = aScript.out
                      // redirect System.out for later comparison
                      System.out = newOut
                      // call aScript.main(), which prints
                      aScript.main([] as String[])
                      System.out = saveOut
                      def outString = buf.toString()
          
                      // verify type of 'scriptURL'
                      assert outString.contains("java.net.URL")
                      // verify value of 'scriptURL' for this context
                      assert outString.contains(aFile.toURL().toString())
                      printf("out[%s]\n",outString)
                  }
              }
          }
          

          If someone will provide an example of executing a script via a network URL, I'll add a unit test for it.

          Jim White added a comment - - edited

          URL is not the correct type for this; it needs to be a URI. URL was a bad API in early Java and File.toURL() is deprecated, which is why, for example, NIO's Path only offers toUri().

          http://docs.oracle.com/javase/7/docs/api/java/io/File.html#toURL()

          http://docs.oracle.com/javase/7/docs/api/java/nio/file/Path.html#toUri()
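
A small Java sketch of the practical difference, using only JDK calls; the class `UriVersusUrl`, its helper names, and the sample path are invented for the example. The deprecated `File.toURL()` does not escape characters that are illegal in URLs, whereas `File.toURI()` and `Path.toUri()` do, so a file name containing a space comes out differently.

```java
import java.io.File;
import java.net.MalformedURLException;

public class UriVersusUrl {

    // Deprecated route: characters such as spaces are left unescaped.
    @SuppressWarnings("deprecation")
    static String viaToURL(File file) {
        try {
            return file.toURL().toString();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    // java.io route: the space is escaped as %20.
    static String viaToURI(File file) {
        return file.toURI().toString();
    }

    // NIO route: Path only offers toUri(), which also escapes the space.
    static String viaPath(File file) {
        return file.toPath().toUri().toString();
    }

    public static void main(String[] args) {
        File script = new File("/tmp/my script.groovy");
        System.out.println("toURL:      " + viaToURL(script));
        System.out.println("toURI:      " + viaToURI(script));
        System.out.println("Path.toUri: " + viaPath(script));
    }
}
```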

          Jim White added a comment -

          I think it is high time we resolved this issue. I've just raised GROOVY-6561 because the broken URL handling in GroovyMain adversely affects GROOVY-6451.

          Now that the worm has turned so much (and I've seen the relevant bits of the compiler in the process of implementing 6451) it seems to me that the way to handle this is through an annotation that injects the URI (obtained from the SourceUnit as in 6451) as a field into the script or class as desired.

          A limitation of this approach (which I think should be considered a feature rather than a bug) is that the script must have an annotation to get the information. The featureful aspect is that this kind of environmental dependency should announce itself. For DSL style scripts the annotation can be added using the AST transformation customization feature of compiler configuration.

          blackdrag blackdrag added a comment -

          I was waiting a bit for comments, but I am fine with such an annotation. You want to try doing a pull request?

          Jim White added a comment -

          Sure, now that I know that GroovyShell is the way to get GroovyMain to pass the URI into the Script, I'll take a whack at these. I'd also like a comment on GROOVY-6582, which seems pretty obvious to me but has not had its PR merged yet.

          Jim White added a comment -

          PR #337 has an implementation for this and GROOVY-6561.

          Jim White added a comment -

          The name of the annotation is now @groovy.transform.SourceURI rather than @ScriptURI because it works equally well for sources that are for scripts or classes or both. I've been having scripts on the brain lately and liked the @ScriptURI name, but @SourceURI is actually better related to what it is connected to in the compiler and doesn't mislead folks into thinking it is just for scripts.

          Pascal Schumacher added a comment -

          Pull request was merged by Paul.

          Paul King added a comment -

          All we need now is some doco?

          Jim White added a comment -

          Yes indeed, Guillaume made a similar comment on the list about the base script abstract method stuff. I had just one more thing to do (PR #371 for GROOVY-6675) and I think this stuff all hangs together pretty well now. I assume there are at least a couple of weeks before the final release for 2.3.0?

          Guillaume Laforge added a comment -

          We will likely go straight to RC-mode, with an RC-1 already Thursday if all goes well (no big show-stopper, etc), so the sooner the better. The date for 2.3.0 final is not defined yet, but it could be 2-3 weeks maximum.

          Jim White added a comment -

          Well, I'll see what I can do but I'm pretty booked up and so next weekend is probably the earliest I can put much time into this again. As long as the code gets in on time the docs can get filled in easily enough.


            People

            • Assignee:
              Jim White
              Reporter:
              Marc Guillemot
            • Votes:
              12
              Watchers:
              20
