JRuby (please use github issues at http://bugs.jruby.org)
JRUBY-5346

jruby 1.6.0.RC1 doesn't recognize multibyte strings in 1.9 branch

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.6RC1
    • Fix Version/s: JRuby 1.6RC2
    • Component/s: Core Classes/Modules
    • Labels:
      None
    • Environment:
      MacOS X 10.6.6
    • Number of attachments :
      0

      Description

      Printing German umlauts (ae, oe, ue), which can't be pasted into the description field:

      jruby --1.9 -ve "puts ''"

      jruby 1.6.0.RC1 (ruby 1.9.2 trunk 136) (2011-01-10 769f847) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_22) [darwin-x86_64-java]
      : -e:1: invalid multibyte char (US-ASCII) (SyntaxError)

      jruby 1.5.6 (ruby 1.9.2dev trunk 24787) (2010-12-03 9cf97c3) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_22) [x86_64-java]

      ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.3.0]

        Activity

        Charles Oliver Nutter added a comment -

        Ok, I think I figured this out. When we are parsing inline scripts in 1.9 mode, we should be defaulting to the locale's encoding, which appears to usually be UTF-8. Fixing.
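
To see which encodings are in play for a given script, the relevant values can be printed directly. This is a minimal sketch, not code from the JRuby fix; `encoding_report` is a hypothetical helper name introduced here for illustration. Run as an inline script (`-e`) in 1.9 mode, the `source` entry is the value this issue is about.

```ruby
# Hypothetical helper: collects the encodings that matter when a script
# containing multibyte characters is parsed.
def encoding_report
  {
    :source           => __ENCODING__,              # encoding the parser used for this script
    :locale           => Encoding.find("locale"),   # encoding derived from the environment
    :default_external => Encoding.default_external  # default for data read from outside
  }
end

encoding_report.each do |name, encoding|
  puts "#{name}: #{encoding.name}"
end
```

If `source` reports US-ASCII while `locale` reports UTF-8, a non-ASCII character in the script cannot be parsed as a valid character.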

        Charles Oliver Nutter added a comment -

        commit 5f8bfc2ce908349fee11f6c14b2535fff1e3e968
        Author: Charles Oliver Nutter <headius@headius.com>
        Date: Thu Jan 27 01:29:52 2011 -0600

        Fix JRUBY-5346: jruby 1.6.0.RC1 doesn't recognize multibyte strings in 1.9 branch

        • inline scripts should assume locale's encoding in 1.9 mode.
        Hans-Georg Hhne added a comment -

        ruby --1.9 -ve "p 'äöüß'"
        jruby 1.6.0.RC1 (ruby 1.9.2 patchlevel 136) (2011-01-27 5f8bfc2) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_22) [darwin-x86_64-java]
        "\xC3\xA4\xC3\xB6\xC3\xBC\xC3\x9F"

        jruby -ve "p 'äöüß'"
        jruby 1.6.0.RC1 (ruby 1.8.7 patchlevel 330) (2011-01-27 5f8bfc2) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_22) [darwin-x86_64-java]
        "\303\244\303\266\303\274\303\237"

        Hans-Georg Hhne added a comment -

        ruby --1.9 -ve "puts 'äöüß'"
        jruby 1.6.0.RC1 (ruby 1.9.2 patchlevel 136) (2011-01-27 5f8bfc2) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_22) [darwin-x86_64-java]

        jruby --1.9 -ve "p 'äöüß'"
        jruby 1.6.0.RC1 (ruby 1.9.2 patchlevel 136) (2011-01-27 5f8bfc2) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_22) [darwin-x86_64-java]
        "\xC3\xA4\xC3\xB6\xC3\xBC\xC3\x9F"

        ruby -ve "p 'äöüß'"
        ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.3.0]
        "äöüß"

        Thomas E Enebo added a comment -

        The difference you are seeing is not this issue; it comes from String#inspect displaying the string differently. I am pretty sure we have an open issue for this (and we definitely have failing specs for it).

        There is still a problem with the patch: the locale is probably not the correct value here, or at least it is not always the right value. If I supply -Ks to the supplied test case, then the string's encoding should be 'Windows-31J'. I will open a new issue for this, though.
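
The point about inspect can be checked independently of JRuby. The sketch below is an illustration only (the variable names are made up here): it shows that the hex-escaped and octal-escaped outputs quoted in the comments above are the same eight bytes, and that the encoding attached to those bytes decides whether they count as four characters or as eight raw bytes that inspect escapes.

```ruby
# The two escaped forms reported above, copied from the comments.
hex_escaped   = "\xC3\xA4\xC3\xB6\xC3\xBC\xC3\x9F"   # 1.9-mode output
octal_escaped = "\303\244\303\266\303\274\303\237"   # 1.8-mode output

# Both notations describe the same byte sequence.
same_bytes = hex_escaped.bytes.to_a == octal_escaped.bytes.to_a

# Tag the same bytes with two different encodings.
as_utf8   = hex_escaped.dup.force_encoding("UTF-8")
as_binary = hex_escaped.dup.force_encoding("ASCII-8BIT")

puts same_bytes               # true
puts as_utf8.valid_encoding?  # true: the bytes are well-formed UTF-8
puts as_utf8.length           # 4 characters
puts as_binary.length         # 8 bytes
puts as_binary.inspect        # every non-ASCII byte is shown as a \xNN escape
```

How `as_utf8.inspect` is rendered depends on the default external encoding of the process, so it is deliberately not printed here.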

        Thomas E Enebo added a comment -

        My intuition about default_external is wrong, but there is still an issue. Followup issue is JRUBY-5432.


          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            Hans-Georg Hhne