Details
- Type: New Feature
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 1.0
- Fix Version/s: 1.8.x
- Component/s: None
- Labels: None
- Number of attachments: 0
Description
Scripts that operate on the file system commonly need the name of the file containing the script being executed. The Script class doesn't provide this information currently, but it seems possible to retrieve it with:
URL scriptUrl = getClass().classLoader.resourceLoader.loadGroovySource(getClass().name)
According to Jochen this may work... but not all the time.
What we need is a safe way (as a property of Script) to access this information, like the __FILE__ constant in Ruby and Perl.
Issue Links
- relates to
- GROOVY-2648: groovy.bat and *nix groovy starter shell script set system property script.name differently
- GROOVY-2375: "groovy" launcher broken on Cygwin
Activity
As specified on the mailing list by David Budworth and Gerrit Geens, it would also be interesting to have access to the URL (or File) of the initially started script. This is not necessarily the same as the name of the current script, because a script A can execute a script B.
As long as this information is not provided directly by Groovy, it's possible to modify the groovy.(sh|bat) starter script to make this property available as system property:
For unix boxes just change $GROOVY_HOME/bin/groovy (the sh script) to do
export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"
before calling startGroovy
For Windows:
In startGroovy.bat add the following 2 lines right after the line with
the :init label (just before the parameter slurping starts):
@rem get name of script to launch with full path
set GROOVY_SCRIPT_NAME=%~f1
A bit further down in the batch file after the line that says "set
JAVA_OPTS=%JAVA_OPTS% -Dgroovy.starter.conf="%STARTER_CONF%" add the
line
set JAVA_OPTS=%JAVA_OPTS% -Dscript.name="%GROOVY_SCRIPT_NAME%"
This partial fix broke the shell script on Cygwin: http://jira.codehaus.org/browse/GROOVY-2375
Not just Cygwin: the code added to the groovy script, which was not added to any of the other scripts (any reason why not?), breaks on any system where there is a space in the path to the Groovy home. I am not sure there is any viable quoting that makes this work.
What is script.name supposed to point to ? The .groovy script of the "groovy" shell script ?
Oops, I meant ``The .groovy script OR the "groovy" shell script ? ´´
I believe that the rearrangement prompted by a comment from Daniel Serodio on GROOVY-2375 has fixed this in an appropriate manner. If people could test it out and if it seems to work close the issue. Thanks.
This doesn't solve the actual issue here. 'script.name' points to the groovy 'executable'; this issue is about getting the path of the groovy script, like perl/ruby's __FILE__ variable. This isn't resolved.
The trouble with File is that I don't think all Script objects will have them (particularly when compiled to .class and stored in a JAR file). The only thing you can count on is a URI (as it is part of the classpath regime).
With a URI you can easily get a File (if there is one) in a platform-independent fashion, and from there its parent directory or whatever you're after.
So what we should like to say is something like:
File scriptFile = new File(script.getURI())
Whether that should be called "toURI" for consistency with java.io.File I don't know. Obviously we want to be able to refer to it property-style in Groovy.
Now if you want to have a getScriptFile() convenience method on Script to do that, I think that's fine.
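The URI-to-File step described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class ScriptUris and the method toFileOrNull are hypothetical names, not part of Groovy, and returning null for non-file sources is one possible policy rather than an agreed one.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper, not part of Groovy: maps a script URI to a File when one exists.
final class ScriptUris {
    private ScriptUris() {}

    // Returns the File for a hierarchical file: URI, or null for any other
    // source (jar:, http:, or a script that has no URI at all).
    static File toFileOrNull(URI uri) {
        if (uri == null || !"file".equalsIgnoreCase(uri.getScheme())) {
            return null;
        }
        try {
            return new File(uri);
        } catch (IllegalArgumentException e) {
            // new File(URI) rejects file: URIs that are opaque or carry
            // an authority, query or fragment.
            return null;
        }
    }

    public static void main(String[] args) {
        URI local = new File("example.groovy").toURI();
        // Parent directory of a file-backed script.
        System.out.println(toFileOrNull(local).getParentFile());
        // A script served over HTTP has no File.
        System.out.println(toFileOrNull(URI.create("http://example.org/example.groovy")));
    }
}
```

A getScriptFile() convenience method could delegate to something like this, so that callers only have to handle one "no file" case.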
There is some need to be careful on Windows because there is a history of broken URLs there. But I think the modern URI stuff (JDK 1.4) is supposed to work properly. I know I've used that stuff, but not recently and I don't have a Windows box handy here on the road (actually there is one with me but it is stowed).
Some notes on the trouble with File.toURL (and why we use File.toURI):
http://weblogs.java.net/blog/kohsuke/archive/2007/04/how_to_convert.html
http://www.jroller.com/santhosh/entry/converting_file_to_url
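A small Java sketch of the difference those notes describe: File.toURI escapes characters such as spaces, while the deprecated File.toURL does not. The class name FileToUriDemo and the sample path are made up for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;

// Illustration only: compares the two conversions on a path containing spaces.
final class FileToUriDemo {
    private FileToUriDemo() {}

    // File.toURI produces a properly escaped, absolute file: URI.
    static String viaToURI(File file) {
        return file.toURI().toString();
    }

    // File.toURL (deprecated) builds the URL without escaping the path.
    @SuppressWarnings("deprecation")
    static String viaToURL(File file) {
        try {
            return file.toURL().toString();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File script = new File("my scripts", "hello world.groovy");
        System.out.println(viaToURI(script)); // spaces appear as %20
        System.out.println(viaToURL(script)); // spaces are left as-is
    }
}
```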
As the initial comments discuss, this really is related to classloading, and I have stuff in that area in GroovyForOpenOffice and also some work on Ivy integration but I can't dig into those details right now.
I think this issue is a bit messed up right now. In the beginning it was asked for the initial file. The property solution seems to be perfect for this to me. Only that I would not set the property in the shell script, but through GroovyMain.
Now you are asking for something like __FILE__, which means the source location for every script. And while I have no idea yet how to add this, I would like to know what you need this for. In other words, I need examples for the usage, to see where it makes sense, and if other things are missing too.
If the first is a solution, then I can fix it easily. If the second is wanted, then it should be a new issue.
I'd at least like to see the ability for a script to figure out its own fully qualified path. Mostly to give it the ability to pick up resources from relative locations. If I deliver these scripts to clients, I can't ask them to modify their groovy installations so the workaround above (changing $GROOVY_HOME/bin/groovy) is out for me. I know I can do this in bash (or another shell) and wrap the call to the groovy script in it, but then I just have twice as many scripts as I actually need or want, so it would be nice to have this capability.
no, it prints the name of the groovy executable.
darren@hepburn ~ $ pwd
/home/darren
darren@hepburn ~ $ cat test.groovy
#!/usr/bin/env groovy
println System.getProperty("script.name")
darren@hepburn ~ $ echo $GROOVY_HOME
/opt/groovy
darren@hepburn ~ $ ./test.groovy
/opt/groovy/bin/groovy
as Marc G. mentioned at the top, adding:
export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"
in $GROOVY_HOME/bin/groovy would work, and I can of course do this locally, but some of these scripts are used by clients and I obviously need consistent behaviour without asking them to modify their runtimes. GROOVY-2648 is the same issue, I think (I only just came across it).
I finally found a reliable way to get the directory containing the groovy script itself:
new File(getClass().protectionDomain.codeSource.location.path).parent
Hideous, but it does work.
If instead you wanted the full path to the groovy script including the script name itself,
you could say:
getClass().protectionDomain.codeSource.location.path
Having a script know its own full name and path SHOULD be made easier by Groovy
(as easy as knowing the user's current working directory).
This location is a URL and could point to anything. There is no guarantee that it really points to the script file. There might also be no script file at all, for example for precompiled classes, or the location might be an http address.
How should we react in these cases?
> How should we react in these cases?
For my purposes, I know that any script I write which uses a "#!/usr/bin/env groovy" invocation and is already interacting with the local file system is never going to be compiled, JAR'd or delivered over HTTP without major breakage anyway.
So as long as whatever the mechanism for the script figuring out its own location works when it runs as a script that would do for me.
@Jon : nice find (but you're right, it's hideous!)
That was not exactly an answer I can work with... Should there be an exception, a null value, a constant value...?
well doesn't Marc's original suggestion of simply amending the shell script make this whole issue go away anyhow? Or is that not a complete answer... I don't know. It would work for me.
<quote author="Marc G">
for unix boxes just change $GROOVY_HOME/bin/groovy (the sh script) to do
export JAVA_OPTS="$JAVA_OPTS -Dscript.name=$0"
before calling startGroovy
</quote>
It would also resolve GROOVY-2648. Of course, it would be a breaking change for anyone using it to rely on current behaviour, but that's unlikely. Especially with the windows/'nix inconsistency in this respect.
In answer to what Jon's code should return in cases where it makes no sense, null would be easier to handle than an exception in the script (no ugly try/catch needed). But I guess anything is fine so long as it's documented.
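To make the "return null" option concrete, here is a hedged Java sketch of Jon's protectionDomain lookup with the failure cases handled. The class CodeSourceLocator and the method fileOf are hypothetical names. Note that for an ordinary compiled Java class the code source location is the classpath entry (a directory or JAR), whereas the comments above report that it is the script file for a Groovy script run from source.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper: resolves a class's code source to a local File, or null.
final class CodeSourceLocator {
    private CodeSourceLocator() {}

    static File fileOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null; // e.g. classes loaded by the bootstrap loader
        }
        URL location = source.getLocation();
        if (!"file".equals(location.getProtocol())) {
            return null; // http:, jar: and other non-file locations
        }
        try {
            return new File(location.toURI());
        } catch (URISyntaxException | IllegalArgumentException e) {
            return null; // malformed or non-hierarchical file URL
        }
    }

    public static void main(String[] args) {
        System.out.println(fileOf(String.class));            // no code source
        System.out.println(fileOf(CodeSourceLocator.class)); // classpath entry, if file-based
    }
}
```

With this shape a script can test the result for null instead of wrapping the call in try/catch.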
Wow, thanks for the fast feedback!
My intention was to share a trick for those who need something that works in a fully backward-compatible
way in the short term (like me!).
I agree that the right thing to do is make Groovy work as compatibly as possible across platforms,
so if there's a "script.name" that works on Windows, it should work the same way on Linux too.
The fact that script.name is different between Windows and Linux is a bug on Linux, pure and simple.
It should be fixed ASAP.
Coming up with a new name for this value on Linux to maximize "backward bug compatibility" is
probably misguided because the "script.name" strongly implies it will be the script's name,
not the script engine's name (i.e.: groovy). Hence my vote: fix the bug and minimize confusion
and cruft going forward at the minimal expense of messing up a few bug-dependent scripts today.
If people are super worried, then you could make the installer issue a warning message describing
the problem at install time. At that point, people can just grep their scripts for "script.name".
Personally, I think that's overkill, and might create far more pain than it prevents. Currently, script.name
is pretty much useless on Linux, so how many people actually use it? Not many, I'd guess.
As for Marc G's suggestion, I think it's basically the right one, except that it doesn't quite work
as stated for three reasons:
[1] You need to pass along $1 not $0 (I'm assuming that was just a typo)
[2] On Debian at least, whatever value you set for script.name in /usr/bin/groovy gets clobbered on line 18 of /usr/bin/startGroovy
[3] If you want to canonicalize the script name's path (which you should), then you need to use readlink
Thus, the best you can do right now (until the fine folks in Groovy land fix the problem) is to modify /usr/bin/startGroovy
and where it once said on line 18 SCRIPT_PATH="$0" it should say:
# fall back to the raw argument when readlink is not installed
readLink=`which readlink`
if [ -z "$readLink" ] ; then
    SCRIPT_PATH="$1"
else
    SCRIPT_PATH=`readlink -m "$1"`
fi
Why the fancy dance around readlink?
Sadly, Solaris 10 does not ship with it, so in that case the best you can hope for is to limp along w/o canonicalization.
All of this is just a stopgap though, because Groovy should always try to do the same thing
(i.e.: give you a canonicalized path) on all platforms. The lack of readlink on Solaris 10
means that the ideal solution would probably be to push this logic into Groovy itself, so long
as that won't mess up folks on Cygwin. If people don't care about Solaris 10 much and/or
think those poor souls should just go out and install readlink one fine day, then the startGroovy
script looks like a winner (unless there are meaningful cases where groovy scripts are not
started by startGroovy.... then we're back to pushing the logic into Groovy itself).
For now, the straightforward /usr/bin/startGroovy solution is probably the best one.
What do you all think?
Cheers,
-Jon
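If the canonicalization were pushed into the JVM side instead of relying on readlink, it might look roughly like the following Java sketch. The class ScriptPaths and the method canonicalOrAbsolute are hypothetical names, and the fallback to the absolute path mirrors the "limp along" branch of the shell snippet above.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper: canonicalizes a script path without an external readlink.
final class ScriptPaths {
    private ScriptPaths() {}

    // Resolves ".", ".." and symbolic links where the platform allows it;
    // falls back to the plain absolute path if canonicalization fails.
    static String canonicalOrAbsolute(String scriptArg) {
        File file = new File(scriptArg);
        try {
            return file.getCanonicalPath();
        } catch (IOException e) {
            return file.getAbsolutePath();
        }
    }

    public static void main(String[] args) {
        String arg = args.length > 0 ? args[0] : "./bin/../test.groovy";
        System.out.println(canonicalOrAbsolute(arg));
    }
}
```

Because this runs inside the JVM, it behaves the same way whether or not the host system ships readlink.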
This is an important issue for me, as well. My perspective is identical to Darren's, due to my use case:
I write scripts that are intended to work identically on Linux, OS-X, solaris, and cygwin.
Putting the fix in the script wrappers ($GROOVY_HOME/bin/groovy, or startGroovy) is less than ideal, due to
the readlink issue mentioned above.
Below is my suggested fix, in the form of a subversion diff (relative to subversion revision 19700).
It works on cygwin, linux and OSX, and should work on any OS. With this patch, a script would access
its name as shown in this example script:
#!/usr/bin/env groovy
printf("this script is named [%s]\n", System.getProperty("script.file.name"))
Here's the patch:
Index: src/main/groovy/ui/GroovyMain.java
===================================================================
--- src/main/groovy/ui/GroovyMain.java (revision 19700)
+++ src/main/groovy/ui/GroovyMain.java (working copy)
@@ -371,6 +371,14 @@
if (!scriptFile.exists())
+ String canonicalName = scriptFileName;
+ try
catch(IOException io)
{
+ System.setProperty("script.file.exception",io.getMessage());
+ }
+ System.setProperty("script.file.name",canonicalName);
+
return scriptFile;
}
Phil
Indeed a good approach sure to be working.
Perhaps we could even add a variable in the binding of the scripts, like scriptFileName?
Please note my earlier comment that a file path string is not a fully general solution, because a script may be in a JAR file. The "name" needs to be a URI to handle that case, and is probably the right thing for paths that don't use the file system at all (scripts from strings or http or what-have-you). Even though this change will work for GroovyMain, it is important that the API be consistent for any kind of script (which is why this long-standing issue remains open).
This really should have been resolved as part of the reworking of the script/file/classloader stuff that was done not so long ago so that all the right meta level logic is properly addressed.
Also I'm skeptical of calling getCanonicalPath. It is a funky function and isn't necessary if you have the File object. And it could break code that is looking at the file path to do package hierarchy stuff.
I support this change if modified to do this:
URI scriptURI = scriptFile.toURI();
System.setProperty("script.uri", scriptURI.toASCIIString());
File.toURI doesn't throw an exception to be caught, and it uses AbsolutePath rather than CanonicalPath, which is far more likely to be what is intended.
I haven't looked at the rest of the function being modified, but why is this only done when the script file doesn't exist?
The "!scriptFile.exists()" code is merely there for context (it's already present in GroovyMain).
// if we still haven't found the file, point back to the originally specified filename
if (!scriptFile.exists())
I agree that the URI is a more general solution, and tested it to verify that it works in the
script file context:
URI scriptURI = scriptFile.toURI();
System.setProperty("script.uri",scriptURI.toASCIIString());
To convert the property string back to a File in a script context, you would do something like this:
scriptUri = System.getProperty("script.uri")
File scriptFile = new File(scriptUri.substring("file:/".length()))
It's a bit odd looking, but not too bad (maybe someone can suggest a more succinct usage?)
I suspect that GroovyMain is not the only place (or perhaps not the best place) to initialize the property.
I'll investigate to see if I can find a more general location.
Phil
A more readable way to consume the "script.uri" property:
File scriptFile = new File(new URI(System.getProperty("script.uri")))
Phil
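A self-contained Java sketch of the round trip through the "script.uri" property discussed above. The class ScriptUriProperty and its methods are hypothetical; only the property name comes from this thread. Going through java.net.URI avoids two weaknesses of the substring("file:/".length()) variant: on Unix it drops the leading "/" of the path, and it leaves escapes such as %20 undecoded.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper around the proposed "script.uri" system property.
final class ScriptUriProperty {
    static final String KEY = "script.uri";

    private ScriptUriProperty() {}

    // What a launcher might do: publish the script location as an escaped URI.
    static void publish(File scriptFile) {
        System.setProperty(KEY, scriptFile.toURI().toASCIIString());
    }

    // What a script might do: recover the File, or null if there is none.
    static File scriptFileOrNull() {
        String value = System.getProperty(KEY);
        if (value == null) {
            return null;
        }
        try {
            return new File(URI.create(value));
        } catch (IllegalArgumentException e) {
            return null; // not a valid URI, or not a hierarchical file: URI
        }
    }

    public static void main(String[] args) {
        publish(new File("my dir", "test script.groovy"));
        System.out.println(System.getProperty(KEY));
        System.out.println(scriptFileOrNull());
    }
}
```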
After further investigation, there appear to be 5 use cases:
1. script source is a File
2. script source is a URL
3. script source is a String
4. script source is a Reader
5. script source is an InputStream
For each of these use cases, there is a corresponding constructor in groovy.lang.GroovyCodeSource.
In the case of a String, Reader, or InputStream, there is no URI, although the constructors do have a "name" parameter.
For the 1st 2 cases, we could add appropriate code to the File and URL constructors in groovy.lang.GroovyCodeSource. However, there are cases in which execution of a script would involve multiple GroovyCodeSource constructors, each overwriting the property value.
To correctly support the script file use case, the purpose of which is to match the capabilities of perl/python/bash scripts, a script must be able to determine where it resides in the filesystem. In this context, a script is always sourced from a file, and it would represent broken behavior if a class loaded from a URL were to overwrite the "script.file" property. To prevent the most recent GroovyCodeSource constructor from overwriting this value, it should be initialized in GroovyMain.
My suggestion, therefore, is that we create 2 properties:
"script.file" (always the absolute name of the script file, or null)
"script.uri" (the URI of the most recently constructed GroovyCodeSource)
Thoughts?
Phil
if you get the "its" location of a script, how does that work with multiple scripts in perl/python? Do you get only the one of the start script, or does it change depending on which script you are in currently?
As for the recently constructed GroovyCodeSource... does it make sense? When I execute an inlined script, it will change. script.uri may then even change all the time.
After reviewing recent comments added to this discussion, I think I like Guillaume's suggestion to add a binding to script classes (this.scriptFileName), which is unambiguous as to which file you're interested in.
A perl or bash self reference usually refers to the script in which it occurs. If a script uses a package (another perl script), any references in the other script will still refer back to the original script that launched the process. Similarly, if a bash script "sources" another script, the other script is treated as if it is contained within the calling script, so all self references will resolve to the originally called script.
Here is a perl script that prints its filename:
#!/usr/bin/perl
printf("this script file is named [%s]\n",$0)
Here's the equivalent bash script:
#!/bin/bash
echo "this script file is named [$0]"
There are other useful scripting techniques that involve self-reference. For example, in perl, you can append data to the end of a script file by preceding it with a single line that starts with __DATA__, which is quite difficult to simulate in groovy (impossible unless you can derive the filename).
This script calculates the average of the numbers in the __DATA__ section:
#!/usr/bin/perl -w
chomp(my @lines = <DATA>);
my $total = 0.0;
foreach my $line ( @lines ){
$line =~ s/^\s+|\s+$//g;
$total += int($line);
}
printf "average: %1.4f\n",$total / scalar @lines;
__DATA__
252
252
252
252
248
252
252
@Jochen - I agree, that is serious problem with using a system property and I do not care for it much. But if we're not going to fix this more completely then setting such a system property in GroovyMain for typical script file invocations is worthwhile.
But I think the right approach is adding a property to Script:
URI getScriptURI()
'scriptURI' seems the right thing because 'URI' would conflict with 'java.net.URI'. And 'toURI' seems odd.
And a convenience method that uses that:
File getScriptFile() { new File(getScriptURI()) }
The reason for using a URI rather than "file name" is that we need to handle in some fashion non-file sources. Some protocols support literal data, which we could use for scripts with string sources if desired.
In order to do this as a URI, it will require modification in at least 2 places, possibly the GroovyCodeSource constructors for File and URL sources. I assume (but have not confirmed) that a jarfile URL would provide the full name of the source jar.
However, it's still not clear to me whether that is sufficient, since a jar file can also be read from a Reader or an InputStream.
I'll try to do some experiments when I get a chance.
Phil
After extensive investigation, here's what I concluded.
It would be fairly easy to add a java.net.URL (e.g., named "scriptURL") to the script binding, although there are a number of different locations in the code that would potentially be affected, due to the variety of ways a script can be executed.
To add "scriptURL" to the binding, the following line of code would be added in various locations in 2 source files.
The new line of code:
script.setProperty("scriptURL",scriptClass.getProtectionDomain().getCodeSource().getLocation());
The insertion points in GroovyShell:
groovy.lang.GroovyShell.runScriptOrMainOrTestOrRunnable()/*line:264*/
groovy.lang.GroovyShell.evaluate(GroovyCodeSource)/*line:577*/
Another interesting use case requires a similar line of code to be added to
org.codehaus.groovy.runtime.InvokerHelper.evaluate()/*line:386*/
I would have preferred to provide a universal implementation that would provide for all script execution paths from a single location in the code.
Most (possibly all) use cases go through GroovyShell.parseClass(GroovyCodeSource,boolean) to get a script class file.
1. to add the source URL to a script binding requires simultaneous access to 2 items:
A. the sourceURL
B. the script binding
2. the sourceURL can be derived in GroovyClassLoader from either of two objects:
A. find the URL key in sourceCache matching the generated script Class object
B. GroovyCodeSource parameter in GroovyClassLoader.parseClass():
// public Class parseClass(GroovyCodeSource codeSource, boolean shouldCacheSource)(line 429)
URL url = codeSource.getCodeSource().getLocation();
3. the script binding is generally available in GroovyShell, in various locations:
GroovyShell.runScriptOrMainOrTestOrRunnable(Class,...)/line:248/
4. InvokerHelper.runScript(Class...)/line:383/
Here's a unit test class for the changes listed above:
package groovy

class NameTest extends GroovyTestCase {
    void testScriptURL() {
        def program = ""
        def binding = new Binding()
        ( new GroovyShell ( binding ) ).evaluate ( program )
        // verify type of new 'scriptURL' variable
        assert binding.scriptURL.getClass().name == 'java.net.URL'
        // verify value of 'scriptURL' for this context
        assert binding.scriptURL.file == '/groovy/shell'
    }

    void testCalledScriptURL() {
        // TODO: write a temporary groovy source file, so this test is standalone.
        File aFile = new File("bin/scriptUrl.gr")
        if( aFile.exists() ) {
            def aScript = new GroovyClassLoader().parseClass(aFile).newInstance();
            assert aScript != null
            aScript.out = System.out
            def buf = new ByteArrayOutputStream()
            def newOut = new PrintStream(buf)
            def saveOut = aScript.out
            // redirect System.out of for later comparison
            System.out = newOut
            // call aScript.main(), which prints
            aScript.main([] as String[])
            System.out = saveOut
            def outString = buf.toString()
            // verify type of 'scriptURL'
            assert outString.contains("java.net.URL")
            // verify value of 'scriptURL' for this context
            assert outString.contains(aFile.toURL().toString())
            printf("out[%s]\n",outString)
        }
    }
}
If someone will provide an example of executing a script via a network URL, I'll add a unit test for it.
A related improvement would be to add something like URL Class.getLocation(), which would report where a Class was loaded from, much as Ant's <whichResource class="..."/> task does.
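Something close to that can already be approximated with the standard class loader API. The Java sketch below is only an illustration: ClassLocations and locationOf are hypothetical names, and the lookup assumes the class is backed by a .class resource that its loader can see, which is not true for every class (for example, classes defined at runtime from a string).

```java
import java.net.URL;

// Hypothetical helper: finds the URL of the .class resource behind a class.
final class ClassLocations {
    private ClassLocations() {}

    static URL locationOf(Class<?> type) {
        // Binary name to resource path, e.g. java.lang.String -> java/lang/String.class
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        // Bootstrap classes report a null loader; ask the system loader instead.
        return loader != null
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
    }

    public static void main(String[] args) {
        System.out.println(locationOf(String.class));
        System.out.println(locationOf(ClassLocations.class));
    }
}
```

The result may be null, so callers would need the same "no location" handling discussed earlier in this issue.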