Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.6.4, JRuby 1.6.5
    • Fix Version/s: JRuby 1.6.6, JRuby 1.7.0.pre1
    • Component/s: None
    • Labels:
      None
    • Environment:
      Nokogiri 1.5.0
      export JRUBY_OPTS="--1.9"
    • Testcase included:
      yes
    • Number of attachments:
      1

      Description

      In 1.9 mode, when an XML string contains a non-ASCII UTF-8 character, Nokogiri runs a regexp match that never returns. The thread dump shows it stuck inside Joni:

      "main" prio=5 tid=7fcbdc801000 nid=0x10971c000 runnable [10971a000]
      java.lang.Thread.State: RUNNABLE
      at org.joni.Matcher.matchCheck(Matcher.java:293)
      at org.joni.Matcher.search(Matcher.java:461)
      at org.jruby.RubyRegexp.search(RubyRegexp.java:1489)
      at org.jruby.RubyRegexp.op_match(RubyRegexp.java:1406)
      at org.jruby.ast.Match3Node.interpret(Match3Node.java:101)
      at org.jruby.ast.OrNode.interpret(OrNode.java:98)
      at org.jruby.ast.IfNode.interpret(IfNode.java:111)
      at org.jruby.ast.LocalAsgnNode.interpret(LocalAsgnNode.java:123)
      at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
      at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
      at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:75)
      at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:190)
      at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:179)
      at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:312)
      at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:169)
      at regexp_killer._file_(regexp_killer.rb:4)
      at regexp_killer.load(regexp_killer.rb)
      at org.jruby.Ruby.runScript(Ruby.java:693)
      at org.jruby.Ruby.runScript(Ruby.java:686)
      at org.jruby.Ruby.runNormally(Ruby.java:593)
      at org.jruby.Ruby.runFromMain(Ruby.java:442)
      at org.jruby.Main.doRunFromMain(Main.java:321)
      at org.jruby.Main.internalRun(Main.java:241)
      at org.jruby.Main.run(Main.java:207)
      at org.jruby.Main.run(Main.java:191)
      at org.jruby.Main.main(Main.java:171)

      The following script works in 1.8 mode but hangs in 1.9 mode on JRuby 1.6.4, 1.6.5, and HEAD:

      # encoding: utf-8
      require 'nokogiri'
      xml = %q{<?xml version="1.0" encoding="UTF-8"?><hörna/>}

      parsed_xml = Nokogiri.parse(xml)
      puts "done!"
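
Because the hang only shows up once the string contains a non-ASCII character, it can help to confirm what the runtime sees before Nokogiri is involved. The following is a minimal sketch, not part of the original report: `describe_string` is a hypothetical helper name, and it relies only on core `String` methods available in 1.9 mode.

```ruby
# encoding: utf-8

# Hypothetical helper (not from the report): summarizes how the runtime
# sees a string, using only core String methods.
def describe_string(str)
  {
    :encoding  => str.encoding.name,
    :valid     => str.valid_encoding?,
    :chars     => str.length,
    :bytes     => str.bytesize,
    # Characters that are not plain ASCII.
    :non_ascii => str.each_char.reject { |c| c.ascii_only? }
  }
end

xml = %q{<?xml version="1.0" encoding="UTF-8"?><hörna/>}
info = describe_string(xml)
puts info.inspect
```

If the reported encoding is not UTF-8, or the byte count does not exceed the character count by exactly one (for the two-byte `ö`), the string was already altered before reaching the parser.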

        Activity

        Yoko Harada added a comment -

        I'm aware of this. Many Nokogiri tests still fail in 1.9 mode. Somewhere, string conversion between Java and Ruby is going wrong.
        So I don't think this is a Joni problem.

        Charles Oliver Nutter added a comment -

        It does appear to be spinning in Joni, but I don't know yet whether it's actually a Joni problem or if the nokogiri code is doing something odd:

        "main" prio=5 tid=103800800 nid=0x100601000 runnable [1005ff000]
           java.lang.Thread.State: RUNNABLE
        	at org.joni.Matcher.matchCheck(Matcher.java:293)
        	at org.joni.Matcher.search(Matcher.java:461)
        	at org.jruby.RubyRegexp.search(RubyRegexp.java:1541)
        	at org.jruby.RubyRegexp.op_match(RubyRegexp.java:1458)
        	at org.jruby.javasupport.util.RuntimeHelpers.match3(RuntimeHelpers.java:1588)
        	at Users.headius.projects.jruby.lib.ruby.gems.$1_dot_8.gems.nokogiri_minus_1_dot_5_dot_0_minus_java.lib.nokogiri.method__2$RUBY$parse(/Users/headius/projects/jruby/lib/ruby/gems/1.8/gems/nokogiri-1.5.0-java/lib/nokogiri.rb:67)
        	at Users$headius$projects$jruby$lib$ruby$gems$$1_dot_8$gems$nokogiri_minus_1_dot_5_dot_0_minus_java$lib$nokogiri$method__2$RUBY$parse.call(Users$headius$projects$jruby$lib$ruby$gems$$1_dot_8$gems$nokogiri_minus_1_dot_5_dot_0_minus_java$lib$nokogiri$method__2$RUBY$parse:65535)
        	at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:216)
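
One way to tell a Joni problem from a Nokogiri one is to run a regexp match against the same string outside Nokogiri and check whether it returns. The following is a minimal sketch under stated assumptions: `finishes_within?` is a hypothetical helper, and the pattern is only a stand-in, since the thread does not quote the expression at nokogiri.rb:67. Whether `Thread#kill` can stop a match that is spinning inside Joni on the affected versions is not established here.

```ruby
# encoding: utf-8

# Hypothetical helper: runs the block on a worker thread and reports
# whether it finished within the given number of seconds.
def finishes_within?(seconds)
  worker = Thread.new { yield }
  # Thread#join returns nil when the limit expires before the thread ends.
  finished = !worker.join(seconds).nil?
  worker.kill unless finished
  finished
end

xml = %q{<?xml version="1.0" encoding="UTF-8"?><hörna/>}

# Stand-in pattern (an assumption, not quoted in this issue).
pattern = /^\s*<[^Hh>]*html/i

puts finishes_within?(5) { xml =~ pattern }
```

If the match completes here but `Nokogiri.parse` still hangs, that would point toward the string handed to the regexp, rather than the regexp engine itself.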
        
        Hiro Asari added a comment - edited

        This is fixed on master and the 1.6 branch (tested with Nokogiri 1.5.0 java).


          People

          • Assignee:
            Yoko Harada
            Reporter:
            Anders Bengtsson
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: