Details

    • Number of attachments :
      1

      Description

      String#slice on strings with multibyte characters raises Java::JavaLang::ArrayIndexOutOfBoundsException for certain parameters.

      > 'å'.encoding.to_s
      => "UTF-8"

      > 'å'.slice(0,16)
      => "å", whereas

      > 'å'.slice(0,17)
      => Java::JavaLang::ArrayIndexOutOfBoundsException

      Moreover,

      > '1234567890å'.slice(0,17)
      => "1234567890å"

      > '1234567890å'.slice(1,17)
      => "234567890å\x00\x00\x00\x00"

      > '1234567890å'.slice(9,17)
      => Java::JavaLang::ArrayIndexOutOfBoundsException

      The examples above work in MRI Ruby 1.9.3.
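The attached slice_bug.rb is not reproduced in this report. The following is a minimal sketch of a script in the same spirit, assuming only the five calls listed in the description. The names `CASES`, `result_or_error` and `run_cases` are hypothetical and not taken from the attachment.

```ruby
# encoding: utf-8

# Hypothetical helper: evaluates the block and returns either its result or
# the class of the exception it raised.
# Assumption: `rescue Exception` also catches the Java exception on the JRuby
# version under test; if it does not, rescue Java::JavaLang::Exception instead.
def result_or_error
  yield
rescue Exception => e
  e.class
end

# The [string, start, length] triples from the description.
CASES = [
  ['å', 0, 16],
  ['å', 0, 17],
  ['1234567890å', 0, 17],
  ['1234567890å', 1, 17],
  ['1234567890å', 9, 17]
]

# Runs String#slice for every case and pairs each case with its outcome.
def run_cases(cases)
  cases.map do |str, start, len|
    [str, start, len, result_or_error { str.slice(start, len) }]
  end
end

run_cases(CASES).each do |str, start, len, result|
  puts "> #{str.inspect}.slice(#{start},#{len})"
  puts "=> #{result.inspect}"
end
```

On MRI every case should return a plain string with no trailing NUL bytes. On an affected JRuby build, some cases are expected to report an exception class instead.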

      1. slice_bug.rb
        0.7 kB
        Joel Karlsson

        Activity

        Hiro Asari added a comment -

        Could you attach the test case as a file, rather than as inline code?

        Also, could you test it on master, since there have been a lot of encoding-related fixes there?

        Thank you.

        Joel Karlsson added a comment -

        The described behaviour is still present on master. Moreover, evaluating the code from the file in IRB does not give quite the same results (see comments in code).

        Charles Oliver Nutter added a comment -

        Behavior comparison between JRuby master and Ruby 1.9.3:

        system ~/projects/jruby $ jruby ~/Downloads/slice_bug.rb 
        > ''.slice(0,16)
        ""
        > ''.slice(0,17)
        ""
        > ''.slice(1,16)
        ""
        > ''.slice(1,17)
        Java::JavaLang::ArrayIndexOutOfBoundsException
        > '1234567890'.slice(0,17)
        "1234567890"
        > '1234567890'.slice(1,17)
        Java::JavaLang::ArrayIndexOutOfBoundsException
        > '1234567890'.slice(9,17)
        Java::JavaLang::ArrayIndexOutOfBoundsException
        
        system ~/projects/jruby $ ruby-1.9.3 ~/Downloads/slice_bug.rb 
        > ''.slice(0,16)
        ""
        > ''.slice(0,17)
        ""
        > ''.slice(1,16)
        ""
        > ''.slice(1,17)
        ""
        > '1234567890'.slice(0,17)
        "1234567890"
        > '1234567890'.slice(1,17)
        "234567890"
        > '1234567890'.slice(9,17)
        "0"
        
        Charles Oliver Nutter added a comment -

        I have a fix. On the path that leads to multibyte slicing, String#slice had no guard to truncate lengths that extend past the end of the string. Adding a simple check fixes the cases provided.

        The behavioral difference in IRB is likely due to strings parsed in IRB having different sizes for their backing store. Where the JRuby compiler (used on scripts run directly) will create strings of exactly the length in code, the JRuby interpreter and IRB parser will often have some room around the edges (since they'll be slicing off chunks of a larger line or the entire file in most cases).
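The missing length guard can be illustrated in Ruby. This is only a sketch of the idea, not the actual change, which was made in src/org/jruby/RubyString.java and is not reproduced here. The method name `guarded_slice` is hypothetical.

```ruby
# encoding: utf-8

# Illustrative only: a character-wise slice that clamps the requested length
# to what is actually left in the string before reading any characters.
# Returns nil for out-of-range arguments, as String#slice(start, length) does.
def guarded_slice(str, start, len)
  size = str.length                     # length in characters, not bytes
  return nil if len < 0 || start > size || start < -size

  start += size if start < 0            # negative start counts from the end
  remaining = size - start
  len = remaining if len > remaining    # the guard: never read past the end

  str.chars.to_a[start, len].join.force_encoding(str.encoding)
end

p guarded_slice('1234567890å', 9, 17)
p guarded_slice('å', 0, 17)
```

Without the clamp on `len`, code that walks the backing store character by character has nothing to stop it from reading beyond the string's real content, which is consistent with both the trailing `\x00` bytes and the ArrayIndexOutOfBoundsException reported above.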

        Charles Oliver Nutter added a comment -
        commit 91a60f38e45b540a5bc8ba8afe5ec1d8f7babdc1
        Author: Charles Oliver Nutter <headius@headius.com>
        Date:   Mon Aug 27 13:51:59 2012 -0500
        
            Fix JRUBY-6860
            
            String#slice on strings with multibyte chars fails
            
            String#slice was missing range checks on the length that would
            prevent walking off the end in unguarded methods it called. Added
            those checks.
        
        :000000 100644 0000000... 4b2702e... A	spec/regression/JRUBY-6860_slice_needs_range_check_spec.rb
        :100644 100644 8397e34... 548d3fc... M	src/org/jruby/RubyString.java
        

          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            Joel Karlsson
          • Votes:
            0
            Watchers:
            4
