JRuby (please use github issues at http://bugs.jruby.org)
  JRUBY-2812

Problems accessing files that have Chinese characters in their names.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.1.2
    • Fix Version/s: JRuby 1.6.4
    • Component/s: Core Classes/Modules
    • Labels:
      None
    • Environment:
      WinXP SP2; Linux
    • Number of attachments:
      3

      Description

      The original post is http://jira.codehaus.org/browse/JRUBY-2677 ; this issue was created once the problem turned out to go deeper.
      All files involved on WinXP are encoded in GB2312, and all files involved on Linux are encoded in UTF-8.
      My WinXP uses GB2312 as its locale encoding, and my Linux uses UTF-8.

      [Dice@localhost lib]$ cat test.rb
      puts ARGV[0]
      puts '中文字符串'
      IO.foreach(ARGV[0]) {|line| puts line}

      [Dice@localhost lib]$ cat txt_en
      English string from text file.
      中文字符串来自text文件。
      [Dice@localhost lib]$ cat txt_中文
      English string from text file.
      中文字符串来自text文件。

      [Dice@localhost lib]$ java -jar jruby.jar -v
      ruby 1.8.6 (2008-05-28 rev 6586) [i386-jruby1.1.2]
      [Dice@localhost lib]$ java -jar jruby.jar test.rb txt_en
      txt_en
      中文字符串
      English string from text file.
      中文字符串来自text文件。
      [Dice@localhost lib]$ java -jar jruby.jar test.rb txt_中文
      txt_??
      中文字符串
      test.rb:3:in `initialize': No such file or directory - File not found - txt_?? (Errno::ENOENT)
      from test.rb:3:in `foreach'
      from test.rb:3

      This error occurs because JRuby 1.1.2 can't correctly process command-line arguments containing Chinese characters.
      This bug was reported as JRUBY-2677, and Charles Oliver Nutter fixed it.
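The `txt_??` in the error above is the kind of output a lossy character conversion produces. As a rough illustration only (it assumes the `String#encode` API of Ruby 1.9 and later rather than the 1.8.6-compatible runtime shown here, and it is not taken from JRuby's argument handling), converting the file name to an encoding that cannot represent the Chinese characters replaces each of them with `?`:

```ruby
# Illustration only: assumes Ruby 1.9+ (String#encode); not JRuby's internal code.
name = "txt_\u4E2D\u6587"   # the file name "txt_中文" as a UTF-8 string

# US-ASCII cannot represent the two Chinese characters, so with
# :undef => :replace each one is replaced by "?".
lossy = name.encode("US-ASCII", :undef => :replace)

puts lossy
```

Once the name has been reduced this way, it no longer matches any file on disk, which would be consistent with the `Errno::ENOENT` in the trace.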
      [Dice@localhost lib]$ java -jar jruby.jar -v
      jruby 1.1.3-dev (ruby 1.8.6 patchlevel 114) (2008-07-16 rev 7193) [i386-java]
      [Dice@localhost lib]$ java -jar jruby.jar test.rb txt_en
      txt_en
      中文字符串
      English string from text file.
      中文字符串来自text文件。
      [Dice@localhost lib]$ java -jar jruby.jar test.rb txt_中文
      txt_中文
      中文字符串
      English string from text file.
      中文字符串来自text文件。

      It works now because the locale of my Linux is UTF-8.
      But when I tried it under WinXP, whose locale is GB2312, deeper problems occurred.
      F:\MyStudio\jruby-1.1.2\bin>jruby.bat -v
      jruby 1.1.3-dev (ruby 1.8.6 patchlevel 114) (2008-07-16 rev 7193) [x86-java]
      F:\MyStudio\jruby-1.1.2\bin>jruby.bat test.rb txt_中文
      txt_中文
      中文字符串
      test.rb:3:in `initialize': No such file or directory - File not found - txt_????
      (Errno::ENOENT)
      from test.rb:3:in `foreach'
      from test.rb:3

      Charles Oliver Nutter's explanation is here: http://jira.codehaus.org/browse/JRUBY-2677?focusedCommentId=142123#action_142123
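To make the locale difference concrete, here is a small sketch showing that the same file name is a different byte sequence under each locale, and that the GB2312 bytes are not valid UTF-8. It assumes Ruby 1.9+ `String#encode` and uses GBK (the superset of GB2312 that Windows exposes as code page 936) as a stand-in for the WinXP locale; it does not reproduce what JRuby does internally.

```ruby
# Illustration only: assumes Ruby 1.9+; GBK stands in for the GB2312 locale.
name = "txt_\u4E2D\u6587"   # "txt_中文" as a UTF-8 string

utf8_bytes = name.encode("UTF-8").unpack("C*")
gbk_bytes  = name.encode("GBK").unpack("C*")

# Render each byte sequence as hex for comparison.
utf8_hex = utf8_bytes.map { |b| "%02X" % b }.join(" ")
gbk_hex  = gbk_bytes.map  { |b| "%02X" % b }.join(" ")
puts "UTF-8: #{utf8_hex}"
puts "GBK:   #{gbk_hex}"

# Interpreting the GBK bytes as UTF-8 does not yield a valid string,
# so a name decoded with the wrong encoding cannot match the file on disk.
misread = gbk_bytes.pack("C*").force_encoding("UTF-8")
puts "GBK bytes valid as UTF-8? #{misread.valid_encoding?}"
```

The two Chinese characters take three bytes each in UTF-8 and two bytes each in GBK, so any code path that assumes one encoding while the operating system supplies the other will look up a different name.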

      After this weekend I'll be on summer holiday for more than a month, during which I won't have Internet access, so I can't view or post comments on this issue.
      I'm also just a Chinese college student and a beginner in programming, so in fact I can't help you very much.
      When I return I will see what has happened.
      Good luck!

      1. JRUBY-2812.diff
        0.6 kB
        TAKAI Naoto
      2. jruby-2812.patch
        0.9 kB
        Charles Oliver Nutter
      3. JRUBY-2812-for-trunk.diff
        1 kB
        TAKAI Naoto

          Activity

          Charles Oliver Nutter added a comment -

          This is related to JRUBY-2812, which I've punted to 1.3. We'll include this bug and its test cases in that work.

          Tsing added a comment -

          Charles:
          I have seen your post on JRUBY-2812 and I think I understand what is going on with this bug. With "Fix Version: 1.3", it seems it will be a long time before this problem is fixed.
          If you work out a fix, I'll try some test cases for you.

          Charles Oliver Nutter added a comment -

          Well hopefully 1.3 won't be that far out; really 1.2 was only three months, and this bug is already flagged to get worked on. I also think the Ruby 1.9 work we're doing will help this along, since it will force us to address default encoding, transcoding to Java String, and other details necessary to resolve this.

          Test cases will be helpful...and I'm eager to work on this more and get it working well in 1.3. Plus if we get it fixed sooner, there's no reason you couldn't use a nightly build.
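As a sketch of the kind of transcoding step mentioned above, the following tags raw bytes with the encoding they were actually written in and converts them to UTF-8. The helper name `to_utf8` is hypothetical, the code assumes the Ruby 1.9+ encoding API, and it is not drawn from JRuby or from any of the attached patches.

```ruby
# Hypothetical helper, assuming Ruby 1.9+; to_utf8 is an illustrative name
# and is not part of JRuby or of any attached patch.
def to_utf8(raw_bytes, source_encoding)
  # Tag the raw bytes with the encoding they were written in,
  # then transcode them to UTF-8.
  raw_bytes.dup.force_encoding(source_encoding).encode("UTF-8")
end

# "txt_" followed by the GB2312/GBK bytes for the two Chinese characters.
raw = ("txt_".unpack("C*") + [0xD6, 0xD0, 0xCE, 0xC4]).pack("C*")

decoded = to_utf8(raw, "GBK")
puts decoded
```

The important design point is that the source encoding is passed in explicitly instead of being guessed; on a GB2312 system it would come from the platform's default encoding.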

          Charles Oliver Nutter added a comment -

          Tsing: It's still getting pushed back...we simply don't have the cycles to work on everything. But if you are able to get some simple test cases we'll be able to fix each case in turn.

          Hiro Asari added a comment - edited

          With recent improvements in encoding handling, I'm fairly certain that this one is fixed.

          (Original comment included output from Terminal.app, but apparently the encoding fails on JIRA.)


            People

            • Assignee: Hiro Asari
            • Reporter: Tsing
            • Votes: 0
            • Watchers: 4
