Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: JRuby 1.5.3
    • Fix Version/s: JRuby 1.6
    • Component/s: Windows
    • Labels:
      None
    • Environment:
      Mac OS X 10.6 / openSUSE 11.3
    • Number of attachments :
      5

      Description

      In irb on CRuby 1.9, String#encoding returns the correct encoding, UTF-8.

      But in irb on JRuby in 1.9 mode, String#encoding returns the wrong encoding, ASCII-8BIT.

      Please see the attached screenshots, because the description area can't display Japanese.

        Activity

        Youhei Kondou added a comment -

        I selected the wrong Component/s value; please correct it.

        Charles Oliver Nutter added a comment -

        So where does Ruby 1.9 get UTF-8 from? Perhaps we're still just defaulting to ASCII because that's the standard in Ruby 1.8.

        Youhei Kondou added a comment -

        First of all, thanks for correcting the Component/s field.

        > So where does Ruby 1.9 get UTF-8 from?
        UTF-8 comes from the terminal's default encoding (on both Mac OS X and openSUSE).

        > Perhaps we're still just defaulting to ASCII because that's the standard in Ruby 1.8
        IMHO, that's fine if the string contains only letters and digits. But, as the screenshot shows, it's strange that Hiragana is reported as ASCII.

        Yoko Harada added a comment -

        As far as I can tell from my translation, http://yokolet.blogspot.com/2009/07/design-and-implementation-of-ruby-m17n.html#l14, Ruby 1.9 looks at the $LANG environment variable to determine the encoding of standard I/O. On my OS X, "echo $LANG" printed "ja_JP.UTF-8", while Java's file.encoding is SJIS on JDK 6 and UTF-8 on JDK 5. The terminal's default encoding is usually the value of $LANG, so this agrees with what Youhei said.
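
A minimal sketch for inspecting the values compared in this comment, assuming Ruby 1.9 or later. The helper name `locale_encoding_report` is hypothetical and not part of Ruby or JRuby; the `file.encoding` lookup only runs on JRuby.

```ruby
# Collect the locale-related values that Ruby exposes.
# `env` defaults to the real environment but can be any Hash-like object.
def locale_encoding_report(env = ENV)
  internal = Encoding.default_internal
  {
    "LANG"             => env["LANG"],
    "locale_charmap"   => Encoding.locale_charmap,
    "default_external" => Encoding.default_external.name,
    "default_internal" => internal && internal.name,
  }
end

locale_encoding_report.each do |key, value|
  puts "#{key}: #{value.inspect}"
end

# On JRuby only, also print the JVM's default charset property.
if defined?(JRUBY_VERSION)
  require "java"
  puts "file.encoding: #{java.lang.System.getProperty('file.encoding').inspect}"
end
```

Comparing `default_external` with `file.encoding` on the same machine shows whether the two runtimes agree about the platform encoding.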

        Charles Oliver Nutter added a comment -

        Perhaps we should use whatever the JDK claims as its default platform encoding for our default 1.9 mode encoding? That should be the correct encoding based on the current platform's $LANG and other settings.

        Youhei Kondou added a comment -

        Here is the Ruby 1.9 encoding summary:
        http://redmine.ruby-lang.org/wiki/ruby-19/ScriptEncoding

        I guess irb falls under the "-e and stdin case", and my sample is clearly the "no -K -E, no magic comment" case, so both the "script encoding" and the "default external" encoding should come from the locale ($LANG).

        Charles Oliver Nutter added a comment -

        There are two bugs here:

        • We are not properly setting up the default external encoding and "kcode" (which lingers on from 1.8 mode) when starting up in 1.9 mode.
        • The parser is not propagating the current encoding (or default external encoding) into strings, etc that it parses.

        Most of this fix will require work in the 1.9 parser logic, so that it will create ByteList objects for our RubyStrings with the appropriate encoding. Some additional work is needed to set up the kcode and default external encoding in 1.9 mode, but it's not major.

        Because of this, I'm marking this as a 1.9 and Parser bug. Also marked for fixing in 1.6.
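
A small illustration of why the encoding tag on a parsed string matters, assuming Ruby 1.9 or later. It does not reproduce JRuby's parser; it only shows that the same three bytes behave differently depending on the encoding attached to them.

```ruby
# UTF-8 bytes of HIRAGANA LETTER A (U+3042).
# Array#pack("C*") returns a binary (ASCII-8BIT) string.
raw = [0xE3, 0x81, 0x82].pack("C*")

# Same bytes, but tagged as UTF-8.
tagged = raw.dup.force_encoding(Encoding::UTF_8)

puts raw.encoding            # the binary encoding
puts raw.length              # counted byte by byte
puts tagged.encoding         # UTF-8
puts tagged.length           # counted as characters
puts tagged.valid_encoding?
```

With the binary tag the string is treated as three separate bytes; with the UTF-8 tag it is one character.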

        Charles Oliver Nutter added a comment -

        Recent encoding work (likely Tom's big push for internal/external and parser encoding support) appears to have fixed this. Thanks for the report!

        ~/projects/jruby ➔ jruby --1.9 -S irb
        >> "foo".encoding
        => #<Encoding:UTF-8>
        
        Youhei Kondou added a comment -

        First, many thanks to everyone for implementing String#encoding with M17N.

        In 1.6.0 RC1, on both my Linux and Mac OS X machines (the terminal is in UTF-8), it works completely.

        irb(main):001:0> JRUBY_VERSION
        "1.6.0.RC1"
        irb(main):002:0> RUBY_VERSION
        "1.9.2"
        irb(main):003:0> 'a'.encoding
        #<Encoding:UTF-8>
        irb(main):004:0> 'あ'.encoding
        #<Encoding:UTF-8>

        But on Windows (the terminal is in Japanese), it only half works.

        irb(main):001:0> JRUBY_VERSION
        "1.6.0.RC1"
        irb(main):002:0> RUBY_VERSION
        "1.9.2"
        irb(main):003:0> 'a'.encoding
        #<Encoding:Windows-31J>
        irb(main):004:0> 'あ'.encoding
        SyntaxError: (irb):4: invalid multibyte char (Windows-31J)
        	from org/jruby/RubyKernel.java:1096:in `eval19'
        	from org/jruby/RubyKernel.java:1421:in `loop'
        	from org/jruby/RubyKernel.java:1208:in `rbCatch19'
        	from org/jruby/RubyKernel.java:1208:in `rbCatch19'
        Youhei Kondou added a comment - http://jira.codehaus.org/browse/JRUBY-5156?focusedCommentId=251274&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_251274
        Charles Oliver Nutter added a comment -

        Adding "Windows" category, since it now appears to only fail on Windows.

        I'm surprised by this. Can you provide more information about your system please?

        • default external and internal encodings
        • whether this character works in a script, with and without encoding: header
        • full Java backtrace for the error, by passing -J-Djruby.backtrace.style=raw to JRuby

        I'd like to see us fix this for 1.6, so marking as such.

        Youhei Kondou added a comment -

        Screenshots of irb. The first is CRuby 1.9.2 and the second is JRuby 1.6.0 RC1 in --1.9 mode.

        Youhei Kondou added a comment - - edited

        I attached screenshots of both CRuby's irb and JRuby's irb.

        First, CRuby's:

        ! jruby5156_cruby192.png !

        Encoding.default_external is Windows-31J and Encoding.default_internal is nil.
        Windows-31J is the default encoding of command prompt in Japanese Windows.

        Second, JRuby's:

        ! jruby5156_jruby160.png !

        Encoding.default_external is US-ASCII and Encoding.default_internal is nil.
        And its Java backtrace is below:

        SyntaxError: (irb):7: invalid multibyte char (Windows-31J)
        	from Thread.java:1479:in `getStackTrace'
        	from RubyException.java:154:in `prepareBacktrace'
        	from RaiseException.java:156:in `preRaise'
        	from Ruby.java:3432:in `newRaiseException'
        	from Ruby.java:3261:in `newSyntaxError'
        	from Parser.java:139:in `parse'
        	from Parser.java:83:in `parse'
        	from Parser.java:75:in `parse'
        	from Ruby.java:2346:in `parseEval'
        	from ASTInterpreter.java:146:in `evalWithBinding'
        	from RubyKernel.java:1138:in `evalCommon'
        	from RubyKernel.java:1096:in `eval19'
        	from org/jruby/RubyKernel$s_method_0_3$RUBYINVOKER$eval19.gen:65535:in `call'
        	from DynamicMethod.java:178:in `call'
        	from CachingCallSite.java:252:in `cacheAndCall'
        	from CachingCallSite.java:71:in `call'
        ... 118 levels...
        	from BeginNode.java:83:in `interpret'
        	from NewlineNode.java:103:in `interpret'
        	from BlockNode.java:71:in `interpret'
        	from ASTInterpreter.java:70:in `INTERPRET_METHOD'
        	from InterpretedMethod.java:184:in `call'
        	from DefaultMethod.java:179:in `call'
        	from CachingCallSite.java:282:in `cacheAndCall'
        	from CachingCallSite.java:139:in `call'
        	from jirb_swing:65:in `__file__'
        	from jirb_swing:-1:in `load'
        	from Ruby.java:702:in `runScript'
        	from Ruby.java:587:in `runNormally'
        	from Ruby.java:421:in `runFromMain'
        	from Main.java:304:in `run'
        	from Main.java:144:in `run'
        	from Main.java:113:in `main'
        Thomas E Enebo added a comment -

        I believe I at least know what we are doing wrong. I set the default external encoding unconditionally to US-ASCII, and clearly this gets picked up by some other means. I will try and figure out how default encoding is supposed to get set and fix this.

        Youhei Kondou added a comment -

        In 1.6.0 RC2, Encoding.default_external returns Windows-31J, but the same error occurs.

        Thomas E Enebo added a comment -

        Reduced test case:

        # coding: utf-8
        
        require 'readline'
        
        line = Readline.readline('> ', true)
        p line.encoding
        

        The problem is our readline is not honoring default_external.
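
A sketch of what honoring an encoding for console input could look like at the Ruby level, assuming Ruby 1.9 or later. `tag_console_line` is a hypothetical helper, not JRuby's readline code; it only shows that bytes typed on a Windows-31J console are valid under that encoding and invalid when labelled as something else.

```ruby
# Hypothetical helper: tag raw console bytes with an encoding and verify them.
def tag_console_line(bytes, encoding = Encoding.default_external)
  line = bytes.dup.force_encoding(encoding)
  raise ArgumentError, "bytes are not valid #{encoding}" unless line.valid_encoding?
  line
end

# U+3042 (HIRAGANA LETTER A) encoded as Windows-31J.
sjis_bytes = [0x82, 0xA0].pack("C*")

line = tag_console_line(sjis_bytes, Encoding::Windows_31J)
puts line.encoding
puts line.encode(Encoding::UTF_8) == "\u3042"

# The same bytes labelled as UTF-8 are rejected.
begin
  tag_console_line(sjis_bytes, Encoding::UTF_8)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```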

        Thomas E Enebo added a comment -

        This issue should be fixed for irb, but it is not really a 100% fix, since it assumes default_external will match the default charset used by a Java String. Unfortunately, we need to change jline to just read in bytes, since String is not an adequate class for us to support all the encodings supported by M17N.

        Youhei...Can you try latest master (commit 246cc28 or later) and tell me if it fixes the problem for you?

        Youhei Kondou added a comment -

        jirb_swing with jruby revision 246cc28

        Youhei Kondou added a comment -

        I retried with 246cc28, but the same error occurs. I attached a screenshot of jirb_swing (! 246cc28_jruby5156_jruby160.PNG !) and the stack trace is below:

        SyntaxError: (irb):8: invalid multibyte char (Windows-31J)
        	from Thread.java:1479:in `getStackTrace'
        	from TraceType.java:20:in `getBacktrace'
        	from RubyException.java:151:in `prepareBacktrace'
        	from RaiseException.java:159:in `preRaise'
        	from RaiseException.java:80:in `<init>'
        	from Ruby.java:3258:in `newRaiseException'
        	from Ruby.java:3097:in `newSyntaxError'
        	from Parser.java:139:in `parse'
        	from Parser.java:83:in `parse'
        	from Parser.java:75:in `parse'
        	from Ruby.java:2311:in `parseEval'
        	from ASTInterpreter.java:158:in `evalWithBinding'
        	from RubyKernel.java:1135:in `evalCommon'
        	from RubyKernel.java:1093:in `eval19'
        	from org/jruby/RubyKernel$s_method_0_3$RUBYINVOKER$eval19.gen:65535:in `call'
        	from DynamicMethod.java:179:in `call'
        ... 118 levels...
        	from BlockNode.java:71:in `interpret'
        	from ASTInterpreter.java:74:in `INTERPRET_METHOD'
        	from InterpretedMethod.java:190:in `call'
        	from DefaultMethod.java:179:in `call'
        	from CachingCallSite.java:282:in `cacheAndCall'
        	from CachingCallSite.java:139:in `call'
        	from C:/jruby_latest/bin/jirb_swing:65:in `__file__'
        	from C:/jruby_latest/bin/jirb_swing:-1:in `load'
        	from Ruby.java:667:in `runScript'
        	from Ruby.java:571:in `runNormally'
        	from Ruby.java:420:in `runFromMain'
        	from Main.java:278:in `doRunFromMain'
        	from Main.java:198:in `internalRun'
        	from Main.java:164:in `run'
        	from Main.java:148:in `run'
        	from Main.java:128:in `main'
        Thomas E Enebo added a comment -

        Does it work with jirb? It is possible jirb_swing starts up much differently (not to say it isn't a bug as well).

        Youhei Kondou added a comment -

        jirb version of '246cc28_jruby5156_jruby160.png'

        Youhei Kondou added a comment -

        OK, it now works with jirb.
        ! 246cc28_jruby5156_jruby160_irb.png !

        But the Japanese characters I type in are garbled.
        (That's why I use jirb_swing.)

        Charles Oliver Nutter added a comment -

        Youhei: What happens when you use console IRB with --noreadline? Also, please try an updated build of JRuby after this commit and let me know if it's any better:

        commit 4f3a2a2d320a1e5c4d5ba4b47948f66b61798f16
        Author: Charles Oliver Nutter <headius@headius.com>
        Date: Tue Apr 5 12:33:53 2011 -0500

        Update to jline-1.0-SNAPSHOT to get recent merges and patches.

        Youhei Kondou added a comment - - edited

        (1) jruby --1.9 -S jirb --noreadline (1.6.0)
        Succeeded with Win31J strings

        irb(main):001:0> JRUBY_VERSION
        => "1.6.0"
        irb(main):002:0> RUBY_VERSION
        => "1.9.2"
        irb(main):003:0> 'a'.encoding
        => #<Encoding:Windows-31J>
        irb(main):004:0> 'あ'.encoding
        => #<Encoding:Windows-31J>

        (2) jruby --1.9 -S jirb_swing --noreadline (1.6.0)
        No errors, but no success either: the prompt freezes.

        (3) jruby --1.9 -S jirb --noreadline (4f3a2a2d)
        Same as (1)

        (4) jruby --1.9 -S jirb_swing --noreadline (4f3a2a2d)
        Same as (2)

        Youhei Kondou added a comment -

        jirb is all OK and jirb_swing is all NG. jirb is probably enough for the JRuby ecosystem. If jirb_swing is not an important part of the JRuby ecosystem (which is probably the case), then IMHO one option is to remove jirb_swing from JRuby.

        Youhei Kondou added a comment -

        I tried again with JRuby 1.6.6.

        (1) jruby --1.9 -S jirb (1.6.6)

        irb(main):001:0> JRUBY_VERSION
        => "1.6.6"
        irb(main):002:0> RUBY_VERSION
        => "1.9.2"
        irb(main):003:0> 'a'.encoding
        => #<Encoding:Windows-31J>
        irb(main):004:0> 'あ'.encoding
        => #<Encoding:Windows-31J>

        (2) jruby --1.9 -S jirb_swing (1.6.6)

        irb(main):001:0> JRUBY_VERSION
        => "1.6.6"
        irb(main):002:0> RUBY_VERSION
        => "1.9.2"
        irb(main):003:0> 'a'.encoding
        => #<Encoding:Windows-31J>
        irb(main):004:0> 'あ'.encoding
        SyntaxError: (irb):4: invalid multibyte char (Windows-31J)
        	from org/jruby/RubyKernel.java:1082:in `eval'
        	from org/jruby/RubyKernel.java:1408:in `loop'
        	from org/jruby/RubyKernel.java:1195:in `catch'
        	from org/jruby/RubyKernel.java:1195:in `catch'
        	from C:/jruby-1.6.6/bin/jirb_swing:54:in `(root)'
        Charles Oliver Nutter added a comment -

        Can you try JRuby 1.7/master? We have fixed a number of encoding issues. I tried with a Chinese character and it was handled correctly as UTF-8 in my setup, but I'm not sure how to simulate your setup.

        Youhei Kondou added a comment - - edited

        I tried with the JRuby 1.7.0 preview 1 zip from GitHub.

        (1) jruby -S jirb (1.7.0 preview 1)

        irb(main):001:0> JRUBY_VERSION
        => "1.7.0.preview1"
        irb(main):002:0> RUBY_VERSION
        => "1.9.3"
        irb(main):003:0> 'a'.encoding
        => #<Encoding:Windows-31J>
        irb(main):004:0> 'あ'.encoding
        irb(main):005:0' '

        (2) jruby -S jirb_swing (1.7.0 preview 1)

        irb(main):001:0> JRUBY_VERSION
        "1.7.0.preview1"
        irb(main):002:0> RUBY_VERSION
        "1.9.3"
        irb(main):003:0> 'a'.encoding
        #<Encoding:Windows-31J>
        irb(main):004:0> 'あ'.encoding
        SyntaxError: (irb):4: invalid multibyte char (Windows-31J)
          from org/jruby/RubyKernel.java:1037:in `eval'
          from org/jruby/RubyKernel.java:1353:in `loop'
          from org/jruby/RubyKernel.java:1146:in `catch'
          from org/jruby/RubyKernel.java:1146:in `catch'
          from C:/temp/jruby-jruby-00c8c98/bin/jirb_swing:54:in `(root)'

        jirb_swing still fails, and jirb now fails again (for a different reason).

        Charles Oliver Nutter added a comment -

        Thank you for the update. We will try to solve this once and for all for 1.7.

        Jan Kotlar added a comment - - edited

        Hello,
        I have trouble with UTF-8 encoded script files. It works with a magic comment, but if I run the script without one I get a syntax error. I tried it with JRuby 1.6.7.2 and also with the new 1.7.0.preview1. I use the Windows 7 operating system.
        Here is my test script (the string is in Czech):

        a = "Přli luťoučk kůň"
        puts a
        puts a.encoding
        

        If I run this console command:

        jruby -Ku --1.9 test_encoding.rb
        

        I get a syntax error:

        SyntaxError: test_encoding.rb:1: invalid multibyte char (US-ASCII)
        

        I think the -Ku parameter should be exactly for setting the file encoding, but it doesn't work.
        JRuby and OS versions I tried:

        jruby 1.7.0.preview1 (ruby-1.9.3-p203) (2012-05-19 00c8c98) (Java HotSpot(TM) Client VM 1.6.0_31) [Windows 7-x86-java]
        and
        jruby 1.6.7.2 (ruby-1.9.2-p312) (2012-05-01 26e08ba) (Java HotSpot(TM) Client VM 1.6.0_31) [Windows 7-x86-java]
        

        Thanks for help.
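
A minimal sketch of the magic-comment behaviour mentioned above, assuming Ruby 2.1 or later for `Tempfile.create`. It uses Windows-31J rather than UTF-8 so that the effect of the comment is visible whatever the default source encoding is. The global name `$magic_comment_sample` is hypothetical, and this does not exercise the `-Ku` flag.

```ruby
require "tempfile"

# Build a tiny script as raw bytes: a magic comment, then a string literal
# holding U+3042 encoded as Windows-31J (bytes 0x82 0xA0).
source = "# encoding: Windows-31J\n".b
source << "$magic_comment_sample = '".b
source << [0x82, 0xA0].pack("C*")
source << "'\n".b

Tempfile.create(["magic_comment_sample", ".rb"]) do |file|
  file.binmode
  file.write(source)
  file.flush
  load file.path # the parser reads the magic comment and tags the literal
end

puts $magic_comment_sample.encoding
puts $magic_comment_sample.valid_encoding?
```

The loaded literal carries the encoding named in the comment, which is why a script with a correct magic comment parses even when the runtime default would reject its bytes.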

        Youhei Kondou added a comment - - edited In jirb and jirb_swing on 1.7.0 RC1 and RC2, the problem described in http://jira.codehaus.org/browse/JRUBY-5156?focusedCommentId=299233&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-299233 still occurs.
        Thomas E Enebo added a comment -

        Can you run this and tell me your result:

        jruby -e 'require "java"; puts java.lang.System.getProperty("file.encoding")'
        

        Also does your input charset match this value? Does your input charset match default_external in Ruby?

        I am assuming that your default_external is not matching file.encoding system property. I even have a comment about this being a problem in readline source code.

        Youhei Kondou added a comment -

        Yes, these two are different.

        > jruby -e "require 'java'; puts java.lang.System.getProperty('file.encoding')"
        MS932
        > jruby -e "puts Encoding.default_external"
        Windows-31J

        However, these two names refer to the same code points.
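
A sketch of one way to reconcile the two names, assuming Ruby 1.9 or later. The table `JAVA_TO_RUBY_ENCODING` and the helper `ruby_encoding_for` are hypothetical and hand-written for illustration; only the pair reported in this comment is listed, and this is not how JRuby itself maps charsets.

```ruby
# Hypothetical table of JVM charset names mapped to Ruby encoding names.
JAVA_TO_RUBY_ENCODING = {
  "MS932" => "Windows-31J",
}.freeze

# Return the Ruby Encoding for a JVM charset name, or nil if Ruby does not
# know the (possibly translated) name.
def ruby_encoding_for(java_charset_name)
  ruby_name = JAVA_TO_RUBY_ENCODING.fetch(java_charset_name, java_charset_name)
  Encoding.find(ruby_name)
rescue ArgumentError
  nil
end

puts ruby_encoding_for("MS932")
puts ruby_encoding_for("UTF-8")
```

Comparing `ruby_encoding_for(file_encoding)` with `Encoding.default_external` then checks for equivalence instead of comparing raw name strings.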


          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            Youhei Kondou
          • Votes:
            0
            Watchers:
            4
