Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: JRuby 1.1.4
    • Fix Version/s: None
    • Component/s: Core Classes/Modules
    • Labels:
      None
    • Environment:
      using ubuntu 8.04
    • Number of attachments :
      2

      Description

      ###############################################################
      #a simple rails controller , rails 2.0.2 and jruby 1.1.4
      ###############################################################

      class TesterController < ApplicationController
        require 'jcode'
        def index(len=8)
          chars = ("a".."z").to_a + ("A".."Z").to_a + ("0".."9").to_a
          guid = ""
          1.upto(len) { |i| guid << chars[rand(chars.size-1)] }
          render :text => guid
        end
      end

      ###############################################################
      #I am using jruby 1.1.4
      #and get this error when I add require 'jcode' on line 2;
      #when I remove it, everything works fine:
      #too short multibyte code string: #/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]$/
      #/home/khelll/projects/jruby/lib/ruby/1.8/jcode.rb:66:in `end_regexp'
      #/home/khelll/projects/jruby/lib/ruby/1.8/jcode.rb:79:in `succ!'
      #/home/khelll/projects/jruby/lib/ruby/1.8/jcode.rb:94:in `succ'
      #app/controllers/tester_controller.rb:4:in `each'
      #app/controllers/tester_controller.rb:4:in `index'
      #:1:in `initialize'
      ###############################################################
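
Separately from the jcode error: the trace above shows the failure coming from `succ`, which is reached while the string ranges on line 4 are expanded, and `rand(chars.size-1)` can never return the last index, so "9" is never picked. Below is a minimal sketch of the same generator that builds the alphabet from integer ranges and indexes the full array. `ALPHABET` and `random_guid` are hypothetical names, and whether this sidesteps the error on JRuby 1.1.4 is an untested assumption.

```ruby
# Hypothetical helper, not part of the reported controller.
# Integer ranges never call String#succ while being expanded.
ALPHABET = [(97..122), (65..90), (48..57)].map { |r| r.map { |c| c.chr } }.flatten

def random_guid(len = 8)
  # rand(n) returns an integer in 0...n, so every index is reachable
  Array.new(len) { ALPHABET[rand(ALPHABET.size)] }.join
end

puts random_guid
```

In a controller this could replace the body of `index`, e.g. `render :text => random_guid(len)`.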

      1. jcode.rb.patch
        0.7 kB
        lunlumo
      2. pastie-263899.rb
        1 kB
        khaled al habache

        Activity

        Jens-Christian Fischer added a comment -

        We ran into this bug too... Some experimenting shows:

        original problem:
        /cygdrive/c/Programme/jruby-1.3.1/lib/ruby/gems/1.8/cache
        $ jruby -Ku -e "/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/"
        -e:1: too short multibyte code string: /[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/ (SyntaxError)

        Replacing / / with Regexp.new (as done a few lines below the definition of the multibyte string in jcode.rb):

        /cygdrive/c/Programme/jruby-1.3.1/lib/ruby/gems/1.8/cache
        $ jruby -Ku -e "Regexp.new('[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]', 0, 'n')"

        works!!

        Now using the same parameters ("on") as jcode.rb for the regexp generation fails:

        /cygdrive/c/Programme/jruby-1.3.1/lib/ruby/gems/1.8/cache
        $ jruby -Ku -e "Regexp.new('^[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]$', 0, 'on')"
        -e:1: too short multibyte code string: /^[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]$/ (RegexpError)

        Dropping the "o" parameter from the regexp works:

        /cygdrive/c/Programme/jruby-1.3.1/lib/ruby/gems/1.8/cache
        $ jruby -Ku -e "/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/n"

        The temporary fix for us was to patch line 66 of jcode.rb to drop the o from the regexp generation. If I understand the option correctly, it's not needed here anyway.
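
For context on the o option: it tells Ruby to interpolate `#{...}` in a regexp literal only once, the first time the literal is evaluated, and to reuse the resulting Regexp afterwards. A small sketch of that behaviour in plain Ruby follows. `once` and `every_time` are hypothetical names, and this is not the jcode.rb code.

```ruby
def once(fragment)
  /#{fragment}/o   # interpolated on the first call only, then cached
end

def every_time(fragment)
  /#{fragment}/    # interpolated on every call
end

p once("a").source        # first call builds /a/
p once("b").source        # still the cached /a/
p every_time("b").source  # rebuilt, so /b/
```

If the interpolated value never changes between calls, the option only saves recompilation, which would be consistent with the observation that dropping it is harmless.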

        lunlumo added a comment -

        The third parameter of Regexp.new is the character set that determines the matching context, so the code below works fine.

        $ jruby -Ku -e "/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/on"

        But when you execute the program below, only the last statement raises an exception.

        test.rb
        $KCODE='u'
        
        puts 'aａ' =~ /[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/n #=> 1

        puts 'aａ' =~ /[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/on #=> 1

        pattern = '[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]'
        puts 'aａ' =~ /#{pattern}/on #=> SyntaxError

        Therefore, I suspect that when both the o and n options are given, JRuby first expands the expression to a string and only afterwards creates the regexp object.
        The original problem in jcode.rb has the same cause as the code above. (An ad-hoc patch for jcode.rb is attached.)
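
For readers unfamiliar with the pattern in these snippets: it describes a two-byte UTF-8 sequence (lead byte 0xc0-0xdf plus one continuation byte 0x80-0xbf) or a three-byte one (lead byte 0xe0-0xef plus two continuation bytes). A regexp-free sketch of the same check, using the hypothetical helpers `continuation?` and `first_multibyte_offset`, shows why the first two statements in test.rb report offset 1 for the letter a followed by a fullwidth a (U+FF41, bytes EF BD 81).

```ruby
# Hypothetical helpers for illustration; not part of jcode.rb.
def continuation?(byte)
  !byte.nil? && (0x80..0xbf).include?(byte)
end

# Returns the byte offset of the first 2- or 3-byte UTF-8 sequence, or nil.
def first_multibyte_offset(str)
  bytes = str.unpack('C*')
  bytes.each_index do |i|
    b0, b1, b2 = bytes[i], bytes[i + 1], bytes[i + 2]
    two   = (0xc0..0xdf).include?(b0) && continuation?(b1)
    three = (0xe0..0xef).include?(b0) && continuation?(b1) && continuation?(b2)
    return i if two || three
  end
  nil
end

p first_multibyte_offset("a\xEF\xBD\x81")
```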

        phoenix added a comment -

        jruby 1.4.0 (ruby 1.8.7 patchlevel 174) (2009-11-02 69fbfa3) (Java HotSpot(TM) Client VM 1.6.0_14) [x86-java]

        prompt $ jruby -Ku -e "/\265\332 \306\332/"
        -e:1: too short multibyte code string: /\265\332 \306\332/ (SyntaxError)

        Ivo Wever added a comment (edited) -

        Seems fixed in 1.5.6.

        jruby -v -Ku -e "/\265\332 \306\332/"
        jruby 1.5.6 (ruby 1.8.7 patchlevel 249) (2010-12-03 9cf97c3) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_20) [amd64-java]
        -e:1 warning: Useless use of a literal in void context.
        

        but no other output.

        Other snippets from this issue, including

         jruby -Ku -e "/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]$/"
         jruby -Ku -e "/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]$/"
         jruby -Ku -e "Regexp.new('^[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]$', 0, 'on')"
         jruby -Ku -e "/[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/on"
        

        give the same result

        Charles Oliver Nutter added a comment -

        jcode is gone in 1.9.3, so I'm marking this Won't Fix.


          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            khaled al habache
          • Votes:
            4
            Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved: