Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 1.5.6
- Fix Version/s: 1.6.1, 1.5.8, 1.7-beta-1
- Component/s: command line processing
- Labels: None
- Environment: Windows XP
- Number of attachments: 2
Description
Groovy does weird things with command line arguments containing *. This makes it very difficult to pass in things like file specs, ant specs and regex expressions! Below is the output of a simple program that prints the length of the args parameter passed into the main procedure. A few results are correct, but most are wrong:
>groovy fail test
args length: 1
>groovy fail *
args length: 3
>groovy fail **
args length: 0
>groovy fail "**"
args length: 0
>groovy fail "test" " *"
args length: 2
>groovy fail "test" "*"
args length: 1
>groovy fail "test *"
args length: 1
>groovy fail "*test*"
args length: 0
>groovy fail "*test*"
args length: 0
>groovy fail " *test*"
args length: 1
- startGroovy_V1.bat (19/Feb/09 6:00 PM, 7 kB, Herbert Gerhards)
- startGroovy_V2.bat (19/Feb/09 6:00 PM, 5 kB, Herbert Gerhards)
Activity
What Paul is getting at is that the shell evaluates the arguments before groovy even takes action. For example, if I run "ls *" on Linux, it lists all files in the current directory. ls never sees the *; the shell expands * into the file names first. So if "groovy fail *" reports 3 arguments, those three are probably files, because the shell had already expanded * before groovy had a chance to do anything. "**" probably has a meaning too, though I am not sure what it could be. The only ones that look strange to me are
groovy fail "test" "*"
That one should return two arguments, not one. Also, the test
groovy fail "*test*"
should return one argument.
The first thing we need to know is which shell you use, because the shell is doing this and different shells behave differently.
For example, the bash script
#!/bin/bash
echo $#
executed with * in a directory containing the script and two other files will return 3, because * matches 3 files and $# counts the number of input arguments to the script. A Java program doing System.out.println(args.length); in main behaves exactly the same, and a Groovy class with the same code does the same here on my system.
Now, you are probably not using a bash shell but the Windows command line, so cmd is your tool. If anything is wrong, it would be the .bat scripts, but you excluded them already.
It is definitely operating system globbing coming into play. In *nix worlds you can escape globbing yourself. I don't know of an easy way to do that for Windoze. The issue is that for Groovy scripts that process files, you would want the globbing on. Do you have any suggestions?
There is a way to find out if the Windows shell is at fault:
set counter=0
:start
if "%~1"=="" goto end
set /a counter=counter+1
shift
goto start
:end
echo %counter%
This untested program should give the number of input arguments, like the bash script before. The same test can also be done with Java, as before, to see what Java gets. Bob, could you test that? If this script misbehaves, then I am not sure we can do much. On the other hand, if it behaves correctly, then it is a bug in the groovy shell scripts. If Bob has no time, then Paul or Bob should ask on the groovy-user list for someone with XP or Vista to test this. I currently have no access to such a machine.
ok, I found an XP box and tested the script... cmd does not automatically expand *. So we do that on our own somewhere. That means it is a bug in the bat files... probably the bat script does not recognize if the input argument is quoted or not... probably because %1 is used instead of %~1
If I am right with cmd not expanding *, then the comment http://svn.groovy.codehaus.org/browse/groovy/trunk/groovy/groovy-core/src/bin/startGroovy.bat?r=12419#l116 is wrong.. anyway, when doing groovy fail "*", then I see something like this:
C:\>rem remove quotes around first arg
C:\>for %i in ("-q*-q") do set _ARG=%~i
C:\>rem set the remaining args
C:\>set _ARGS=""
which does not look right. This kind of problem seems to happen every time a star is inside quotes. For example,
groovy -e "println '*'"
becomes
-e "println
which means that not only is the part with * missing, the quotes were swallowed too!
The comment on line 116 I think is meant to refer to the behavior as seen in this script:
@echo off
FOR %%f IN (*.*) DO echo %%f
If you try it out you will definitely see expansion.
There is most probably still a bug here, though. From memory (it was a long time ago that I looked at this stuff), we couldn't find a perfect solution. Perhaps we should have used an explicit '-noglob' flag. I think we tried to allow these variations:
groovy BackupFiles *.*   // globbing comes into play
groovy Calc "9*3"        // no globbing
groovy Calc "9 * 3"      // no globbing
I do remember a whole lot of pain to do with spaces in filenames, which the horrible script is designed to partially handle given how brain dead the dos shell is.
We either have to make it dumber or smarter. Smarter means trying to work out more variations and cope with them. Dumber means falling back to a -noglob flag (this breaks backward compatibility at the command line, though if it was broken, no one was using it) or using a Java globbing package. If we are moving towards a native launcher, then this effort is perhaps best directed elsewhere.
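To make the "Java globbing package" option more concrete, here is a minimal sketch of expanding a pattern inside the JVM instead of in the .bat file. It relies only on the standard java.nio.file API (Files.newDirectoryStream with a glob string). The class name GlobSketch, the method expand, and the choice to pass a pattern through unchanged when nothing matches are assumptions made for illustration; this is not how Groovy's launcher works.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Groovy: expands glob patterns in Java.
class GlobSketch {

    // Returns the sorted names of entries in dir that match the glob pattern.
    // If nothing matches, the pattern itself is returned unchanged
    // (an assumed policy, similar to what many Unix shells do by default).
    public static List<String> expand(Path dir, String pattern) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, pattern)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        if (names.isEmpty()) {
            names.add(pattern);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path cwd = Paths.get(".");
        for (String arg : args) {
            // Note: a malformed glob (for example an unclosed '[') makes
            // newDirectoryStream throw PatternSyntaxException.
            System.out.println(arg + " -> " + expand(cwd, arg));
        }
    }
}
```

With this kind of approach the script would receive the raw arguments and decide for itself which ones to expand, at the cost of the backward-compatibility break mentioned above.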
I was a bit surprised that the problem with handling * in arguments on Windows still exists in groovy-1.6.0, so I spent some time with the startGroovy.bat script to find a solution. As discussed above, the problem is caused by the FOR loop in the "horrible script".
I identified two problems with this:
1. Before the FOR loop is executed, some but NOT ALL existing * characters are escaped.
2. The processing of the command line will split the line by spaces IGNORING any existing quotation.
Please note: The second point is also the reason groovy fails if it is installed in a directory with a space in its path, as the groovy path is the first argument of the FOR loop.
A solution to item 2 can be provided by a simple enhancement of the existing script. See the attached file V1 for the details. It collects arguments starting with a quote until the closing quote appears. As there are only minimal changes to the existing script, it is very unlikely that this change will have any unexpected side effects.
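The quote-collecting idea can be illustrated outside of batch syntax. The following is a rough Java sketch of the same logic, assuming the input has already been split on single spaces; it is not a translation of the attached V1 script, and the names QuoteJoinSketch and rejoinQuoted are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "collect pieces from an opening quote
// until the closing quote appears".
class QuoteJoinSketch {

    public static List<String> rejoinQuoted(List<String> pieces) {
        List<String> args = new ArrayList<>();
        StringBuilder pending = null; // non-null while inside an open quote
        for (String piece : pieces) {
            if (pending != null) {
                // Inside a quoted argument: glue the piece back on.
                pending.append(' ').append(piece);
                if (piece.endsWith("\"")) {
                    args.add(pending.toString());
                    pending = null;
                }
            } else if (piece.startsWith("\"")
                    && !(piece.length() > 1 && piece.endsWith("\""))) {
                // Opens a quote that is not closed within the same piece.
                pending = new StringBuilder(piece);
            } else {
                args.add(piece);
            }
        }
        if (pending != null) {
            args.add(pending.toString()); // unterminated quote: keep as-is
        }
        return args;
    }

    public static void main(String[] args) {
        List<String> pieces = Arrays.asList("fail \"test *\" \"a r g 3\"".split(" "));
        // Each rejoined argument is printed on its own line, quotes included.
        for (String arg : rejoinQuoted(pieces)) {
            System.out.println(arg);
        }
    }
}
```

The sketch keeps the surrounding quotes on each argument, so a later step can still tell a quoted * from an unquoted one.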
But this fix still does not handle the problem with the * character. To find a solution, I tried to understand why the FOR loop is used at all. The loop is labeled ":win9xME_args_slurp", which seems to indicate that it provides backward compatibility for old win9x systems. I checked with a win98SE system installed in a virtual box and was quite surprised: before winNT, the Windows command shell did not know anything about the %~ operators. That means that ALL of your batch files will fail on a Windows version before NT, so I don't see any need to keep this backward compatibility in place. WinNT, however, provides a simple method to collect the command line arguments in a batch file. This allows removing most of the horrible parts of the script and avoids the FOR loop altogether. Please find the NT-based solution in the attached file V2. (Note: as the command line handling is really simple now, I have removed the special handling for 4NT systems, especially as GROOVY-1925 indicates that the current solution did not even work.)
I have tested my solutions on W2K and Vista64 systems and they worked fine. Spaces in the path are no longer a problem and spaces in arguments are handled correctly. Globbing is only done if a * appears outside of quotes; a * inside quotes is passed unchanged to the script. Even escaping quotes works, as shown here:
-> groovy -e "println \"$args\"" arg1 arg2 "a r g 3" "*"
I'm also setting the fix version of 1.6/1.7, as I suspect the problem is present there too.
I assume then that only the V1 version is in 1.6.4. Are there any plans to use the V2 version? I am trying to pass a regular expression such as ".*\.log" to a script and I am not able to do so.
I ran into this problem too. The parsing is still buggy and works in a "groovy way" that I don't like at all. To have "*" work as expected while still supporting the old command line syntax (where quotes for properties are not required), I made some changes:
bin/groovy.bat
- "%DIRNAME%\startGroovy.bat" "%DIRNAME%" groovy.ui.GroovyMain %*
+ set CMD_LINE_ARGS=%*
+ "%DIRNAME%\startGroovy.bat" "%DIRNAME%" groovy.ui.GroovyMain
bin/startGroovy.bat
- @rem Collect remaining command-line arguments
- set CMD_LINE_ARGS=%1
- shift
- :winNT_args_slurp
- if "x%~1" == "x" goto execute
- set CMD_LINE_ARGS=%CMD_LINE_ARGS% %1
- shift
- goto :winNT_args_slurp
This should also parse the -cp argument as expected, but it skips the argument shifting, which parses the argument list incorrectly.
I use the latest version, v1.8.0 and this bug still exists in it.
I have no idea why this issue is closed.
I believe the issue was fixed at the time but has since been broken. Unfortunately, we don't have a good test suite around our Windows bat files or Unix shell scripts. We certainly need to improve that. Also, I believe subsequent issues have been raised against command line anomalies, but if you have a specific, easily reproducible example that needs fixing, feel free to raise a Jira. And feel free to attach patches/tests too if you can!
I have a GINT test for this at https://bitbucket.org/bob_swift/gint/src/6ef31d170cdd/src/itest/groovy/javaTest2.gant. It showed almost all cases fixed on 1.7.x. I haven't installed 1.8 yet to see if it has regressed.
Cool Bob. This gives me an excuse to use GINT a little more. I don't think anything regressed in that area in the move to 1.8 but I think there was a small regression or two quite some time back (but after this issue was closed) which we haven't got around to fixing yet for fear of making things worse. Perhaps a GINT test suite would help us refactor and fix things with more confidence.
Oops, meant to reference the groovy test, not the java test:
https://bitbucket.org/bob_swift/gint/src/6ef31d170cdd/src/itest/groovy/groovyTest2.gant
Unfortunately, I don't have authority to edit my own comment.
Instead of printing only the length of the args, what happens if you print the args themselves for some of these examples (or both the args and the length)?
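A small diagnostic along these lines might look like the sketch below. It is written in Java rather than as the reporter's Groovy script (which is not shown in this issue), and the class name ArgDump and its output layout are assumptions. The angle brackets around each value are there to make leading spaces and swallowed quotes visible.

```java
// Hypothetical diagnostic: prints the argument count and each argument
// exactly as the JVM received it.
class ArgDump {

    public static String describe(String[] args) {
        StringBuilder out = new StringBuilder("args length: " + args.length);
        for (int i = 0; i < args.length; i++) {
            // Brackets delimit the value so whitespace is easy to spot.
            out.append("\n  [").append(i).append("] <").append(args[i]).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running the same command lines from the description through a program like this would show which arguments were dropped, merged or expanded, rather than only how many arrived.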