jira.codehaus.org

  • JRuby
  • JRUBY-2677

Problem in Process Command Line Arguments


Details

  • Type: Bug
  • Status: Closed
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: JRuby 1.1.2
  • Fix Version/s: JRuby 1.1.3
  • Component/s: Interpreter
  • Labels:
    None
  • Environment:
    WinXP SP2

Description

I'm a Chinese user, and the language of my system is Simplified Chinese.
For example, here is test.rb:
#------------------------------
ARGV.each do |arg|
  puts "#{arg}\n"
end
#-----------------------------
I will use @ in place of each Simplified Chinese character (in case your system does not support Chinese).
test.rb is under D:@@\
The command is: jruby.bat D:@@\test.rb arg1 @@@ arg3
JRuby then told me that it can't find D:??\test.rb. As you can see, JRuby replaces each Chinese character with a ?, which causes the error.
Next, I moved test.rb to D:\test\ and executed: jruby.bat D:\test\test.rb arg1 @@@ arg3
Now the output is:
arg1
???
arg3
It seems that this bug occurs in every case involving command-line arguments.

Yesterday I downloaded the source code and found something that may be useful (I'm on another machine now, so I can only write this down from memory):
In org\jruby\utils\RubyFile.java there is a createfile method. After "filepath=new String(filepath.getBytes("ISO-8859-1"),"UTF-8")", the Chinese characters in the String "filepath" are replaced with '?'. I don't understand the purpose of this line, but I know that ISO-8859-1 is the main encoding used in English-speaking countries. When "filepath" contains a Chinese character, "getBytes()" can't represent it in ISO-8859-1, so it replaces the character with '?' in the returned bytes.
Then I commented out this line, and it worked! JRuby can now process .rb files whose paths include Chinese characters. But maybe this line is needed somewhere else; I hope you can tell me.
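
The substitution described above can be reproduced with a minimal sketch, assuming a modern Ruby (1.9 or later) rather than the JRuby 1.1.2 sources. lossy_latin1 is a hypothetical helper name, not JRuby code, and the path is an illustrative stand-in for the reporter's directory.

```ruby
# Sketch only: mimics the lossy step of getBytes("ISO-8859-1") using
# Ruby 1.9+ String#encode. lossy_latin1 is a hypothetical helper.
def lossy_latin1(text)
  # ISO-8859-1 has no Chinese characters, so each one is replaced by "?".
  text.encode("ISO-8859-1", undef: :replace, replace: "?")
end

path = "D:\\\u4E2D\u6587\\test.rb"   # D:\中文\test.rb (illustrative path)
puts lossy_latin1(path)              # prints D:\??\test.rb
```

Once the characters have been replaced, no later decoding step can recover them, which is consistent with the file lookup failing.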
Now the only remaining problem is the arguments.
I traced it from org\jruby\RubyGlobal.java into build_lib\ByteList.jar, for which I can't find the source code (why?). Now I'm sure the problem lies in ByteList.jar, and it should be something like "getBytes("ISO-8859-1")".
In org\jruby\RubyGlobal.java there is a RubyArray called argvArray[]; the program adds every argument to it and then defines it as "ARGV". Just after "argvArray.add(runtime.newString(argv[i]))", the Chinese characters in argv[i] are replaced with '?'. I tried to follow runtime.newString but got nowhere, because I could only trace it as far as ByteList.jar.
Then I replaced that line with simply "argvArray.add(argv[i])". On my WinXP SP2 it still didn't work (using the same example, the output is arg1\n <TextOfAMess>\n arg3\n), but things are getting better because there are no longer any '?'s. Later I rebooted into Linux, whose locale is UTF-8, and to my surprise everything worked. Since my WinXP locale is GB2312 (Simplified Chinese), maybe JRuby had converted the arguments in ARGV to UTF-8, and the problem on XP is just that XP can't display them properly.
But if I use an argument as a path, a more complex problem occurs.
For example: jruby.bat test2.rb "D:\test@@\readme.txt"
When I tried to read the contents of "readme.txt" in test2.rb, even on Linux, JRuby told me it can't find D:\test\<TextOfAMess>\readme.txt.
Then I had to stop, because things were worse than I had expected.

My English is poor, but I hope you can understand what I'm writing about. Thanks for reading, and thanks for your help.

Activity

Charles Oliver Nutter added a comment - 03/Jul/08 2:35 PM

Very helpful, thank you! I think we should modify that filePath line for certain to use whatever the host encoding is. As for the arguments, I think the only issue is that when feeding them into ARGV we need to make sure they're encoded from host encoding to UTF-8, rather than assuming they're coming in as ISO-8859-1 as on western systems.

This shouldn't be hard to fix, so I'm marking it for 1.1.3 and we'll try to get it in soon. Thank you for the report!
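
The conversion proposed here, from the host encoding to UTF-8, can be sketched as follows. This assumes a modern Ruby (1.9 or later) and uses GBK as a stand-in for the reporter's GB2312 locale; to_utf8 is a hypothetical helper, not the change that was committed to JRuby.

```ruby
# Sketch only: transcode raw argument bytes from a known host encoding
# to UTF-8. to_utf8 is a hypothetical helper name.
def to_utf8(raw_bytes, host_encoding)
  # Label the bytes with the encoding they really use, then transcode.
  raw_bytes.dup.force_encoding(host_encoding).encode("UTF-8")
end

gbk_arg = [0xD6, 0xD0, 0xCE, 0xC4].pack("C*")   # "中文" as GBK bytes
puts to_utf8(gbk_arg, "GBK")                    # prints 中文
```

The key point is that the host encoding has to be known at the moment of conversion; guessing ISO-8859-1 only works when every character fits in that charset.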

Charles Oliver Nutter added a comment - 15/Jul/08 4:17 AM

I will commit a fix for all your issues shortly. With it, I can open files with unicode characters in them, pass arguments to JRuby with unicode characters, and everything looks ok.

Because JIRA doesn't appear to support unicode comments well, here is the output from a JRuby session with the fixes in place: http://pastie.org/233626

I believe this covers all the cases you reported issues with, yes? The fixes were all very easy, as you found. Can you submit some tests we can use in our test suite for these cases?

謝謝!

Charles Oliver Nutter added a comment - 15/Jul/08 4:18 AM - edited

After posting my comment, I saw that JIRA does in fact seem to support unicode characters just fine, so here is the text of my demonstration:

~/NetBeansProjects/jruby ➔ cat test.rb
puts ARGV[0]
~/NetBeansProjects/jruby ➔ jruby test.rb 你好
你好
~/NetBeansProjects/jruby ➔ cat 你好。txt
你好我的朋友!
~/NetBeansProjects/jruby ➔ cat test2.rb
puts File.read('你好。txt')
~/NetBeansProjects/jruby ➔ jruby test2.rb
你好我的朋友!
~/NetBeansProjects/jruby ➔ jruby -e "puts File.read('你好。txt')"
你好我的朋友!
~/NetBeansProjects/jruby ➔ cat 你好.rb
puts 'hello my friend'
~/NetBeansProjects/jruby ➔ jruby 你好.rb
hello my friend

Please submit tests so we can mark this bug resolved.

Tsing added a comment - 15/Jul/08 5:41 AM

I think you have tested all the cases, and I think you have solved this bug.
I'm sure the next version of JRuby will run well on my Linux, whose locale is zh_cn.UTF-8.

But I'm wondering what it will be like on my WinXP, whose locale is not Unicode.
I hope that when arguments are passed to JRuby there will be no character-encoding conversion and the characters are kept as they are, because the Simplified Chinese version of WinXP does not support Unicode characters in file paths. (In "cmd.exe", all characters encoded in UTF-8 are displayed as unrecognized characters.)
But it must be hard for you to find a machine running neither a Unicode locale nor a Western locale, so let's just agree that this bug has been resolved. When I get the new version of JRuby, I will test it on my WinXP and see what happens.

Good luck!

Charles Oliver Nutter added a comment - 15/Jul/08 1:44 PM

Tsing: We have nightly dumps of JRuby available here: http://192.168.0.200:8080/hudson/job/jruby-dist/

So if you want to try it out now and let us know if it works ok, that would be very helpful.

Charles Oliver Nutter added a comment - 16/Jul/08 12:19 AM

And again, if you can come up with a couple simple test cases, I would really appreciate it.

Tsing added a comment - 16/Jul/08 12:34 AM

That link can't be accessed from here.
Maybe you can send it by email to waywardson@126.com, which has an attachment size limit of 20 MB.
These days I'm busy preparing for my final exams this semester. After this weekend my summer vacation begins, and I won't be able to get online for a month, so I hope this issue can be finished before then.

Charles Oliver Nutter added a comment - 16/Jul/08 12:58 AM

tsing: ok, I was dumb....the address is http://jruby.headius.com:8080/hudson/job/jruby-dist/

Tsing added a comment - 16/Jul/08 1:31 AM

I'm sorry, but this link doesn't work either.

Tsing added a comment - 16/Jul/08 5:05 AM

About one hour after I posted my last comment, I found I could open that URL.

But nothing has improved.

From cmd.exe on WinXP:

F:\MyStudio\jruby-1.1.2\bin>.\jruby.bat -v
jruby 1.1.3-dev (ruby 1.8.6 patchlevel 114) (2008-07-15 rev 7174) [x86-java]
F:\MyStudio\jruby-1.1.2\bin>type test_en.rb
puts ARGV[0]
puts '中文字符串'
F:\MyStudio\jruby-1.1.2\bin>.\jruby.bat test_en.rb 中文参数
????
中文字符串
F:\MyStudio\jruby-1.1.2\bin>ren test_en.rb test_中文.rb
F:\MyStudio\jruby-1.1.2\bin>.\jruby.bat test_中文.rb 中文参数
Error opening script file: F:/MyStudio/jruby-1.1.2/bin/test_??.rb (文件名、目录名或卷标语法不正确。)

From Linux:

[Dice@localhost lib]$ java -jar jruby.jar -v
jruby 1.1.3-dev (ruby 1.8.6 patchlevel 114) (2008-07-15 rev 7174) [i386-java]
[Dice@localhost lib]$ cat test_en.rb
puts ARGV[0]
puts '中文字符串'
[Dice@localhost lib]$ java -jar jruby.jar test_en.rb 中文参数
????
中文字符串
[Dice@localhost lib]$ mv test_en.rb test_中文.rb
[Dice@localhost lib]$ java -jar jruby.jar test_中文.rb 中文参数
Error opening script file: /mnt/hda8/MyStudio/jruby-1.1.2/lib/test_??.rb (No such file or directory)

"test_en.rb" is encoded with GB2312 in WinXP, but UTF-8 in Linux.

Charles Oliver Nutter added a comment - 16/Jul/08 1:02 PM

So it looks like the encoding of the incoming file is working out ok but arguments are still causing a problem. I'll take a second look.

Charles Oliver Nutter added a comment - 16/Jul/08 1:30 PM

Ok, I pushed another attempt at this. Unfortunately, as you suspected, I do not have access to a machine that isn't UTF-8 by default. So I will have to rely on you for testing.

The additional change is based on your suggestion that always creating a UTF-8 string will be incorrect when UTF-8 is not default. So I allow Java to encode the ARGV strings as whatever the platform default encoding is supposed to be.

I have just started a new dist build on http://jruby.headius.com:8080/hudson. Hopefully you will be able to try it again.
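
The idea of leaving ARGV in the platform's default encoding, instead of always producing UTF-8, can be sketched like this. It assumes a modern Ruby (1.9 or later) where strings carry an encoding label, which the Ruby 1.8-compatible JRuby in this thread did not have; tag_argument is a hypothetical helper and GBK stands in for the reporter's GB2312 locale.

```ruby
# Sketch only: keep the argument bytes untouched and merely label them
# with the platform encoding. tag_argument is a hypothetical helper.
def tag_argument(raw_bytes, platform_encoding = Encoding.default_external)
  # force_encoding changes the label, never the bytes themselves.
  raw_bytes.dup.force_encoding(platform_encoding)
end

arg = tag_argument([0xD6, 0xD0, 0xCE, 0xC4].pack("C*"), Encoding::GBK)
puts arg.encoding   # prints GBK
puts arg.bytesize   # prints 4, the bytes are unchanged
```

Because nothing is transcoded, printing the argument back to a console that uses the same encoding shows the original characters.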

Charles Oliver Nutter added a comment - 16/Jul/08 1:37 PM

If you want to find us on IRC perhaps we could work on it together. FreeNode IRC, #jruby channel. I go by "headius" there.

Tsing added a comment - 16/Jul/08 7:34 PM

Good news:
Yesterday's test turned out successful this morning.
From Linux:
[Dice@localhost lib]$ java -jar jruby.jar -v
jruby 1.1.3-dev (ruby 1.8.6 patchlevel 114) (2008-07-16 rev 7193) [i386-java]
[Dice@localhost lib]$ cat test_en.rb
puts ARGV[0]
puts '中文字符串'
[Dice@localhost lib]$ java -jar jruby.jar test_en.rb 中文参数
中文参数
中文字符串
[Dice@localhost lib]$ mv test_en.rb test_中文.rb
[Dice@localhost lib]$ java -jar jruby.jar test_中文.rb 中文参数
中文参数
中文字符串

From WinXP:
F:\MyStudio\jruby-1.1.2\bin>jruby.bat -v
jruby 1.1.3-dev (ruby 1.8.6 patchlevel 114) (2008-07-16 rev 7193) [x86-java]
F:\MyStudio\jruby-1.1.2\bin>type test_en.rb
puts ARGV[0]
puts '中文字符串'
F:\MyStudio\jruby-1.1.2\bin>jruby.bat test_en.rb 中文参数
中文参数
中文字符串
F:\MyStudio\jruby-1.1.2\bin>ren test_en.rb test_中文.rb
F:\MyStudio\jruby-1.1.2\bin>jruby.bat test_中文.rb 中文参数
中文参数
中文字符串

Bad news:

From WinXP:

"txt_en" and "txt_中文" are the same.

F:\MyStudio\jruby-1.1.2\bin>type txt_en
English string from text file.
中文字符串来自text文件。
F:\MyStudio\jruby-1.1.2\bin>type txt_中文
English string from text file.
中文字符串来自text文件。

Here is test.rb:

F:\MyStudio\jruby-1.1.2\bin>type test.rb
puts ARGV[0]
puts '中文字符串'
IO.foreach(ARGV[0]) {|line| puts line}

F:\MyStudio\jruby-1.1.2\bin>jruby.bat test.rb txt_en
txt_en
中文字符串
English string from text file.
中文字符串来自text文件。

F:\MyStudio\jruby-1.1.2\bin>jruby.bat test.rb txt_中文
txt_中文
中文字符串
test.rb:3:in `initialize': No such file or directory - File not found - txt_????
(Errno::ENOENT)
from test.rb:3:in `foreach'
from test.rb:3

As you can see above, when accessing a file that has Chinese characters in its name, things went wrong.
And when an error message is displayed, Chinese characters are still shown as ? symbols.
These are two separate problems. For example:
F:\MyStudio\jruby-1.1.2\bin>ren test.rb 测试.rb
F:\MyStudio\jruby-1.1.2\bin>jruby.bat 测试.rb txt_en
txt_en
中文字符串
English string from text file.
中文字符串来自text文件。
F:\MyStudio\jruby-1.1.2\bin>jruby.bat 测试.rb txt_中文
txt_中文
中文字符串
.rb:3:in `initialize': No such file or directory - File not found - txt_?? (
Errno::ENOENT)
from ??.rb:3:in `foreach'
from ??.rb:3
As you can see from "from ??.rb:3", JRuby can read this script but can't display its name correctly when there is an error.

So, two more problems have appeared.

PS: All files involved on Linux are encoded in UTF-8, and all files involved on XP are encoded in GB2312.

Good luck!

Charles Oliver Nutter added a comment - 16/Jul/08 9:07 PM

Tsing: Excellent news...and even the new bad news is good news! I'm looking into it now.

Charles Oliver Nutter added a comment - 16/Jul/08 9:19 PM

Tsing: Your new issue is a bit more complicated, and not directly related to the fact that the string is coming in via command-line args. I'm going to mark this bug resolved, and you should open a new bug for the new problem.

Basically, now that we're pulling the strings in as the source encoding, we still need to get them to Java as the correct encoding when Java needs a unicode String. So your arguments come in as GB2312, and we leave them as 2312 so they'll print out and be encoded correctly. But we need to turn them into a unicode string for Java, and since we don't track what the original encoding was, we try to convert it from UTF-8. So the filename is bogus and fails.

This is a larger problem, since Ruby's strings don't track their encoding. To get completely seamless support for multiple encodings, which is what we need, we'll probably have to look toward Ruby 1.9 string features.

Anyway, file another bug for the new issue.
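
The failure mode described in this comment, bytes in one encoding being decoded as if they were UTF-8, can be illustrated with a small sketch. It assumes a modern Ruby (1.9 or later), where strings do track their encoding; decodes_cleanly? is a hypothetical helper and GBK stands in for GB2312.

```ruby
# Sketch only: check whether raw bytes are well-formed under an assumed
# encoding. decodes_cleanly? is a hypothetical helper name.
def decodes_cleanly?(raw_bytes, assumed_encoding)
  raw_bytes.dup.force_encoding(assumed_encoding).valid_encoding?
end

# "txt_中文" as it would arrive from a GBK console: ASCII plus GBK bytes.
gbk_name = "txt_".b + [0xD6, 0xD0, 0xCE, 0xC4].pack("C*")

puts decodes_cleanly?(gbk_name, "UTF-8")   # prints false
puts decodes_cleanly?(gbk_name, "GBK")     # prints true
```

The same bytes are a valid file name under one label and malformed under the other, so a runtime that does not remember the original encoding cannot pick the right conversion.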

Tsing added a comment - 16/Jul/08 10:31 PM

Well, I have filed an issue at http://jira.codehaus.org/browse/JRUBY-2812


People

  • Assignee:
    Charles Oliver Nutter
    Reporter:
    Tsing

Dates

  • Created:
    19/Jun/08 2:42 AM
    Updated:
    10/Sep/08 6:47 PM
    Resolved:
    16/Jul/08 9:19 PM