JRuby (please use github issues at http://bugs.jruby.org)
JRUBY-6200

[1.9] Loading some Unicode characters results in non-printable characters on Windows

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.6.5
    • Fix Version/s: JRuby 1.6.6
    • Component/s: Ruby 1.9.3, Windows
    • Labels: None
    • Environment: Windows 7 64-bit, JRuby 1.6.5 in 1.9.2 mode
    • Number of attachments: 0

      Description

      So I'm trying to run Cucumber on JRuby in my Rails 3.1.1 application.

      Here's the trace:

      D:\jruby-1.6.5\bin\jruby.exe --1.9 -e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) C:/Users/Patrick/RubymineProjects/sparkbank/script/cucumber C:/Users/Patrick/RubymineProjects/sparkbank/features --format Teamcity::Cucumber::Formatter --expand --color -r features
      Testing started at 1:59 AM ...
      LoadError: load error: gherkin/i18n -- org.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/gherkin-2.6.2-java/lib/gherkin/lexer/i18n_lexer.rb:1
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/gherkin-2.6.2-java/lib/gherkin/lexer/i18n_lexer.rb:2
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/cucumber-1.1.2/lib/cucumber/ast/table.rb:3
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/cucumber-1.1.2/lib/cucumber/ast/step_invocation.rb:7
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/cucumber-1.1.2/lib/cucumber/ast.rb:2
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/cucumber-1.1.2/lib/cucumber/parser.rb:6
        require at org/jruby/RubyKernel.java:1047
         (root) at D:/jruby-1.6.5/lib/ruby/gems/1.9/gems/cucumber-1.1.2/lib/cucumber.rb:8
           load at org/jruby/RubyKernel.java:1073
         (root) at -e:1
      
      Process finished with exit code 1
      

      Here's my Gemfile.lock

      GEM
        remote: http://rubygems.org/
        specs:
          actionmailer (3.1.1)
            actionpack (= 3.1.1)
            mail (~> 2.3.0)
          actionpack (3.1.1)
            activemodel (= 3.1.1)
            activesupport (= 3.1.1)
            builder (~> 3.0.0)
            erubis (~> 2.7.0)
            i18n (~> 0.6)
            rack (~> 1.3.2)
            rack-cache (~> 1.1)
            rack-mount (~> 0.8.2)
            rack-test (~> 0.6.1)
            sprockets (~> 2.0.2)
          activemodel (3.1.1)
            activesupport (= 3.1.1)
            builder (~> 3.0.0)
            i18n (~> 0.6)
          activerecord (3.1.1)
            activemodel (= 3.1.1)
            activesupport (= 3.1.1)
            arel (~> 2.2.1)
            tzinfo (~> 0.3.29)
          activerecord-jdbc-adapter (1.2.0)
          activerecord-jdbcsqlite3-adapter (1.2.0)
            activerecord-jdbc-adapter (~> 1.2.0)
            jdbc-sqlite3 (~> 3.7.2)
          activeresource (3.1.1)
            activemodel (= 3.1.1)
            activesupport (= 3.1.1)
          activesupport (3.1.1)
            multi_json (~> 1.0)
          addressable (2.2.6)
          arel (2.2.1)
          bcrypt-ruby (3.0.1)
          bcrypt-ruby (3.0.1-java)
          bcrypt-ruby (3.0.1-x86-mingw32)
          bouncy-castle-java (1.5.0146.1)
          bourbon (1.1.0)
            sass (>= 3.1)
          builder (3.0.0)
          capistrano (2.9.0)
            highline
            net-scp (>= 1.0.0)
            net-sftp (>= 2.0.0)
            net-ssh (>= 2.0.14)
            net-ssh-gateway (>= 1.1.0)
          capybara (1.1.1)
            mime-types (>= 1.16)
            nokogiri (>= 1.3.3)
            rack (>= 1.0.0)
            rack-test (>= 0.5.4)
            selenium-webdriver (~> 2.0)
            xpath (~> 0.1.4)
          childprocess (0.2.2)
            ffi (~> 1.0.6)
          coffee-rails (3.1.1)
            coffee-script (>= 2.2.0)
            railties (~> 3.1.0)
          coffee-script (2.2.0)
            coffee-script-source
            execjs
          coffee-script-source (1.1.3)
          cucumber (1.1.2)
            builder (>= 2.1.2)
            diff-lcs (>= 1.1.2)
            gherkin (~> 2.6.2)
            json (>= 1.4.6)
            term-ansicolor (>= 1.0.6)
          cucumber-rails (1.2.0)
            capybara (>= 1.1.1)
            cucumber (>= 1.1.1)
            nokogiri (>= 1.5.0)
          database_cleaner (0.6.7)
          diff-lcs (1.1.3)
          erubis (2.7.0)
          execjs (1.2.9)
            multi_json (~> 1.0)
          factory_girl (2.2.0)
            activesupport
          factory_girl_rails (1.3.0)
            factory_girl (~> 2.2.0)
            railties (>= 3.0.0)
          faraday (0.6.1)
            addressable (~> 2.2.4)
            multipart-post (~> 1.1.0)
            rack (>= 1.1.0, < 2)
          ffi (1.0.9)
          ffi (1.0.9-java)
          ffi (1.0.9-x86-mingw32)
          gherkin (2.6.2)
            json (>= 1.4.6)
          gherkin (2.6.2-java)
            json (>= 1.4.6)
          gherkin (2.6.2-x86-mingw32)
            json (>= 1.4.6)
          haml (3.1.3)
          haml-rails (0.3.4)
            actionpack (~> 3.0)
            activesupport (~> 3.0)
            haml (~> 3.0)
            railties (~> 3.0)
          highline (1.6.5)
          hike (1.2.1)
          i18n (0.6.0)
          jdbc-sqlite3 (3.7.2)
          jquery-rails (1.0.17)
            railties (~> 3.0)
            thor (~> 0.14)
          jruby-openssl (0.7.4)
            bouncy-castle-java
          json (1.6.1)
          json (1.6.1-java)
          json_pure (1.6.1)
          launchy (2.0.5)
            addressable (~> 2.2.6)
          libv8 (3.3.10.2)
          mail (2.3.0)
            i18n (>= 0.4.0)
            mime-types (~> 1.16)
            treetop (~> 1.4.8)
          maruku (0.6.0)
            syntax (>= 1.0.0)
          mime-types (1.17.2)
          multi_json (1.0.3)
          multipart-post (1.1.3)
          mysql (2.8.1)
          mysql (2.8.1-x86-mingw32)
          net-scp (1.0.4)
            net-ssh (>= 1.99.1)
          net-sftp (2.0.5)
            net-ssh (>= 2.0.9)
          net-ssh (2.2.1)
          net-ssh-gateway (1.1.0)
            net-ssh (>= 1.99.1)
          nokogiri (1.5.0)
          nokogiri (1.5.0-java)
          nokogiri (1.5.0-x86-mingw32)
          oauth (0.4.5)
          oauth2 (0.4.1)
            faraday (~> 0.6.1)
            multi_json (>= 0.0.5)
          polyamorous (0.5.0)
            activerecord (~> 3.0)
          polyglot (0.3.3)
          rack (1.3.5)
          rack-cache (1.1)
            rack (>= 0.4)
          rack-mount (0.8.3)
            rack (>= 1.0.0)
          rack-ssl (1.3.2)
            rack
          rack-test (0.6.1)
            rack (>= 1.0)
          rails (3.1.1)
            actionmailer (= 3.1.1)
            actionpack (= 3.1.1)
            activerecord (= 3.1.1)
            activeresource (= 3.1.1)
            activesupport (= 3.1.1)
            bundler (~> 1.0)
            railties (= 3.1.1)
          rails-footnotes (3.7.5)
            rails (>= 3.0.0)
          railties (3.1.1)
            actionpack (= 3.1.1)
            activesupport (= 3.1.1)
            rack-ssl (~> 1.3.2)
            rake (>= 0.8.7)
            rdoc (~> 3.4)
            thor (~> 0.14.6)
          rake (0.9.2.2)
          rdoc (3.11)
            json (~> 1.4)
          rspec (2.7.0)
            rspec-core (~> 2.7.0)
            rspec-expectations (~> 2.7.0)
            rspec-mocks (~> 2.7.0)
          rspec-core (2.7.1)
          rspec-expectations (2.7.0)
            diff-lcs (~> 1.1.2)
          rspec-mocks (2.7.0)
          rspec-rails (2.7.0)
            actionpack (~> 3.0)
            activesupport (~> 3.0)
            railties (~> 3.0)
            rspec (~> 2.7.0)
          rubyzip (0.9.4)
          sass (3.1.10)
          sass-rails (3.1.4)
            actionpack (~> 3.1.0)
            railties (~> 3.1.0)
            sass (>= 3.1.4)
            sprockets (~> 2.0.0)
            tilt (~> 1.3.2)
          selenium-webdriver (2.10.0)
            childprocess (>= 0.2.1)
            ffi (= 1.0.9)
            json_pure
            rubyzip
          shoulda (3.0.0.beta2)
            shoulda-context (~> 1.0.0.beta1)
            shoulda-matchers (~> 1.0.0.beta1)
          shoulda-context (1.0.0)
          shoulda-matchers (1.0.0)
          sorcery (0.7.4)
            bcrypt-ruby (~> 3.0.0)
            oauth (~> 0.4.4)
            oauth (~> 0.4.4)
            oauth2 (~> 0.4.1)
            oauth2 (~> 0.4.1)
          sprockets (2.0.3)
            hike (~> 1.2)
            rack (~> 1.0)
            tilt (~> 1.1, != 1.3.0)
          sqlite3 (1.3.4)
          squeel (0.9.3)
            activerecord (~> 3.0)
            activesupport (~> 3.0)
            polyamorous (~> 0.5.0)
          syntax (1.0.0)
          term-ansicolor (1.0.7)
          therubyracer (0.9.9)
            libv8 (~> 3.3.10)
          thor (0.14.6)
          tilt (1.3.3)
          treetop (1.4.10)
            polyglot
            polyglot (>= 0.3.1)
          tzinfo (0.3.31)
          uglifier (1.0.4)
            execjs (>= 0.3.0)
            multi_json (>= 1.0.2)
          xpath (0.1.4)
            nokogiri (~> 1.3)
      
      PLATFORMS
        java
        ruby
        x86-mingw32
      
      DEPENDENCIES
        activerecord-jdbcsqlite3-adapter
        bourbon
        capistrano
        capybara (>= 1.1.1)
        coffee-rails (~> 3.1.0)
        cucumber-rails
        database_cleaner
        factory_girl_rails (>= 1.2.0)
        haml (>= 3.1.2)
        haml-rails (>= 0.3.4)
        jquery-rails
        jruby-openssl
        launchy (>= 2.0.5)
        maruku
        mysql
        rails (= 3.1.1)
        rails-footnotes (>= 3.7)
        rspec-rails (>= 2.6.1)
        sass-rails (~> 3.1.0)
        shoulda (>= 3.0.0.beta2)
        sorcery
        sqlite3
        squeel
        therubyracer (>= 0.8.2)
        uglifier
      

          Activity

          Hiro Asari added a comment -

          Looks like your features file has some non-ASCII characters that JRuby is having a problem with. I'm pretty sure we have a previous ticket dealing with that problem.
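
One way to check the suspicion above is to scan the feature files for bytes outside the ASCII range. The following is only a rough sketch: `non_ascii_lines` is a hypothetical helper, not part of JRuby or Cucumber, and the `features/**/*.feature` glob is an assumption about the project layout.

```ruby
# Hypothetical helper (not part of JRuby or Cucumber): return the 1-based
# numbers of the lines in a file that contain bytes above 0x7F.
def non_ascii_lines(path)
  hits = []
  # Read raw bytes so the check does not depend on the default encoding.
  File.open(path, 'rb') do |f|
    f.each_line.with_index(1) do |line, lineno|
      hits << lineno if line.bytes.any? { |b| b > 0x7F }
    end
  end
  hits
end

if __FILE__ == $PROGRAM_NAME
  # Assumed layout: feature files live under ./features.
  Dir.glob('features/**/*.feature').each do |file|
    lines = non_ascii_lines(file)
    puts "#{file}: non-ASCII bytes on lines #{lines.join(', ')}" unless lines.empty?
  end
end
```

A file that reports no lines is plain ASCII and is unlikely to be the trigger by itself.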

          Patrick Ma added a comment -

          What would be the best way to debug this if the problem is on my application's side?

          Hiro Asari added a comment -

          Try running Cucumber tests one at a time, and isolate the problem(s) to a smaller set of input.

          Patrick Ma added a comment -

          Nope, I removed all my features and ran cucumber and it failed with the same error.

          John S added a comment -

          I have the same issue with JRuby 1.6.4 in Ruby 1.9 mode. Trying to run a blank set of features in Cucumber results in the error above.

          Charles Oliver Nutter added a comment -

          This is likely due to a gap in our YAML support...encodings are poorly supported because SnakeYAML only supports Java Strings (UTF-16). We transcode currently, a performance issue in itself, but it's likely that we're not transcoding properly for a case in Cucumber.

          I have sadly never run Cucumber. Can one of you post a short list of instructions for me to reproduce this? I do not think it will be difficult to fix, if SnakeYAML cooperates.
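
As an illustration of the encoding concern described above, here is a minimal sketch that reads a YAML file with an explicit UTF-8 external encoding before parsing it, so the result does not depend on the platform default. `load_utf8_yaml` is a hypothetical helper and the sample file contents are made up for the demonstration. Whether this sidesteps the failure on JRuby 1.6.5 is not confirmed anywhere in this ticket.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical helper: read the file as UTF-8 regardless of the platform's
# default external encoding, then hand the decoded string to the YAML parser.
def load_utf8_yaml(path)
  text = File.open(path, 'r:UTF-8') { |f| f.read }
  YAML.load(text)
end

# Self-contained demonstration with a throwaway file (sample data only).
Dir.mktmpdir do |dir|
  path = File.join(dir, 'i18n.yml')
  # \u escapes keep this script independent of its own source encoding.
  File.open(path, 'wb') { |f| f.write("\"zh-CN\":\n  feature: \u529F\u80FD\n") }
  value = load_utf8_yaml(path)['zh-CN']['feature']
  puts value.encoding
  puts value.codepoints.map { |c| format('U+%04X', c) }.join(' ')
end
```

Printing code points rather than the characters themselves avoids depending on what the Windows console can display.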

          John S added a comment -

          Easy to reproduce. Just install Cucumber in JRuby 1.6.5 (Ruby 1.9) on Windows, like so:
          c:/dev/jruby-1.6.4> jruby -S gem install cucumber
          c:/dev/jruby-1.6.4> cucumber --help

          c:/dev/jruby-1.6.4/lib/ruby/gems/1.8/gems/gherkin-2.6.7-java/lib/gherkin/i18n.rb:9 warning: already initialized constant FEATURE_ELEMENT_KEYS
          c:/dev/jruby-1.6.4/lib/ruby/gems/1.8/gems/gherkin-2.6.7-java/lib/gherkin/i18n.rb:10 warning: already initialized constant STEP_KEYWORD_KEYS
          c:/dev/jruby-1.6.4/lib/ruby/gems/1.8/gems/gherkin-2.6.7-java/lib/gherkin/i18n.rb:11 warning: already initialized constant KEYWORD_KEYS
          LoadError: load error: gherkin/i18n -- org.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
          require at org/jruby/RubyKernel.java:1047

          Rubn Pars-Selders added a comment -

          I'm also experiencing this error now.

          It seems to be a hard nut to crack, as it has already been filed here:

          http://jira.codehaus.org/browse/JRUBY-6223 (Charles, somebody there has extracted it so it can be tested outside Rails)

          and here:
          https://github.com/cucumber/cucumber/issues/98

          I hope you have some luck cracking that nut.

          C:\Programme\jruby-1.6.5\bin\jruby.exe --1.9 -e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) C:/Repositories/apollo/script/cucumber C:/Repositories/apollo/features/admin_login.feature --format Teamcity::Cucumber::Formatter --expand --color
          Testing started at 17:17 ...
          LoadError: load error: gherkin/i18n -- org.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/gherkin-2.6.9-java/lib/gherkin/lexer/i18n_lexer.rb:1
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/gherkin-2.6.9-java/lib/gherkin/lexer/i18n_lexer.rb:2
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/cucumber-1.0.6/lib/cucumber/ast/table.rb:3
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/cucumber-1.0.6/lib/cucumber/ast/step_invocation.rb:7
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/cucumber-1.0.6/lib/cucumber/ast.rb:2
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/cucumber-1.0.6/lib/cucumber/parser.rb:6
          require at org/jruby/RubyKernel.java:1047
          (root) at C:/Repositories/apollo/vendor/gems/jruby/1.9/gems/cucumber-1.0.6/lib/cucumber.rb:8

          Hiro Asari added a comment -

          jruby -S cucumber --help works for me on master on Windows. https://gist.github.com/1515431

          This is a fresh clone.

          Hiro Asari added a comment -

          Sorry. I forgot to run the previous test in 1.9 mode. In 1.9 mode, it does fail.

          Hiro Asari added a comment -

          We can safely remove Cucumber from the equation here. The problem is gherkin.

          C:\Users\asari\Development\src\jruby>bin\jruby --1.9 -rubygems -e "require 'gherkin'"
          LoadError: load error: gherkin/i18n -- org.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
            require at org/jruby/RubyKernel.java:970
            require at C:/Users/asari/Development/src/jruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36
             (root) at C:/Users/asari/Development/src/jruby/lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin/lexer/i18n_lexer.rb:1
            require at org/jruby/RubyKernel.java:970
            require at C:/Users/asari/Development/src/jruby/lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin/lexer/i18n_lexer.rb:36
             (root) at C:/Users/asari/Development/src/jruby/lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin.rb:1
            require at org/jruby/RubyKernel.java:970
            require at C:/Users/asari/Development/src/jruby/lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin.rb:59
             (root) at -e:1
          
          
          Hiro Asari added a comment -

          In fact, Aslak reduced the problem to:

          jruby --1.9 -e "require 'yaml'; puts YAML.load(open('lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin/i18n.yml'))['zh-CN']['feature']"
          

          It turns out that SnakeYAML gives a slightly more informative error message if you ask for e.toString() rather than e.getMessage(). In this case, you'd get:

          LoadError: load error: gherkin/i18n -- org.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
          unacceptable character '?' (0xFFFD) special characters are not allowed in "<reader>", position 896
          

          Furthermore, the 'read' is happening like this:

          org.jruby.runtime.load.LoadService$SearchState: library=ExternalScript: C:/Users/asari/Development/src/jruby/lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin/i18n.rb, loadName=C:/Users/asari/Development/src/jruby/lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin/i18n.rb, suffixType=Both, searchFile=gherkin/i18n
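
The 0xFFFD in the error message above is the Unicode replacement character, which decoders commonly substitute for bytes they cannot interpret. The sketch below shows how it can appear when UTF-8 bytes are decoded under a mismatched encoding. It assumes, without confirmation from this ticket, that something similar happens before the text reaches SnakeYAML. `first_replacement_index` is a hypothetical helper, the sample string is made up, and US-ASCII merely stands in for whatever mismatched encoding might be involved.

```ruby
REPLACEMENT = "\uFFFD"

# Hypothetical helper: decode raw bytes under an assumed encoding,
# substituting U+FFFD for anything that cannot be decoded, and return the
# character index of the first substitution (nil if decoding was clean).
def first_replacement_index(bytes, assumed_encoding)
  decoded = bytes.dup.force_encoding(assumed_encoding)
                 .encode('UTF-8', invalid: :replace, undef: :replace)
  decoded.index(REPLACEMENT)
end

# UTF-8 bytes for a made-up YAML line, tagged as binary data.
raw = "feature: \u529F\u80FD\n".b

p first_replacement_index(raw, 'UTF-8')    # decoded with the right encoding
p first_replacement_index(raw, 'US-ASCII') # decoded with a mismatched one
```

Note that a Windows code page may map the same bytes to other printable characters instead of U+FFFD, so this only illustrates the general mechanism.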
          
          Hiro Asari added a comment -

          The exception looks identical to the one described in JRUBY-5913.

          Aung Maw added a comment (edited) -

          I'm also having the same issue, but not with gherkin. I installed rails_admin on Windows 7 with JRuby 1.6.5 in Ruby 1.9.2 mode and got the error messages below. I have also included my Gemfile.lock. It works on an Ubuntu box, though.

          => Booting WEBrick
          => Rails 3.1.3 application starting in development on http://127.0.0.1:3000
          => Call with -d to detach
          => Ctrl-C to shutdown server
          Exiting
          LoadError: load error: C:/Users/nash/railsapp/config/environment -- org.yaml.snakeyaml.reader.ReaderException: special characters are not allowed
                     require at org/jruby/RubyKernel.java:1047
                     require at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/activesupport-3.1.3/lib/active_support/dependencies.rb:240
             load_dependency at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/activesupport-3.1.3/lib/active_support/dependencies.rb:223
            new_constants_in at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/activesupport-3.1.3/lib/active_support/dependencies.rb:640
            new_constants_in at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/activesupport-3.1.3/lib/active_support/dependencies.rb:639
             load_dependency at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/activesupport-3.1.3/lib/active_support/dependencies.rb:223
                     require at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/activesupport-3.1.3/lib/active_support/dependencies.rb:240
                  parse_file at C:/Users/nash/railsapp/config.ru:4
               instance_eval at org/jruby/RubyBasicObject.java:1720
                  initialize at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/rack-1.3.5/lib/rack/builder.rb:51
                  parse_file at C:/Users/nash/railsapp/config.ru:1
                        eval at org/jruby/RubyKernel.java:1093
                  parse_file at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/rack-1.3.5/lib/rack/builder.rb:40
                         app at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/rack-1.3.5/lib/rack/server.rb:200
                         app at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/railties-3.1.3/lib/rails/commands/server.rb:46
                 wrapped_app at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/rack-1.3.5/lib/rack/server.rb:301
                       start at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/rack-1.3.5/lib/rack/server.rb:252
                       start at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/railties-3.1.3/lib/rails/commands/server.rb:70
                      (root) at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/railties-3.1.3/lib/rails/commands.rb:54
                         tap at org/jruby/RubyKernel.java:1804
                      (root) at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/railties-3.1.3/lib/rails/commands.rb:49
                     require at org/jruby/RubyKernel.java:1047
                      (root) at C:/jruby-1.6.5/lib/ruby/gems/1.8/gems/railties-3.1.3/lib/rails/commands.rb:6
                        load at org/jruby/RubyKernel.java:1073
                      (root) at -e:1
          
          Process finished with exit code 1
          
          
          
          GIT
            remote: git://github.com/sferik/rails_admin.git
            revision: b2c73d54298a8507f9ba3c02cfc4b784fd3a5ede
            specs:
              rails_admin (0.0.1)
                bbenezech-nested_form (~> 0.0)
                bootstrap-sass (~> 1.4, >= 1.4.1)
                builder (~> 3.0)
                coffee-rails (~> 3.1)
                haml (~> 3.1)
                jquery-rails (>= 1.0)
                kaminari (~> 0.12)
                rack-pjax (~> 0.5)
                rails (~> 3.1)
                remotipart (~> 1.0)
          
          GEM
            remote: http://rubygems.org/
            specs:
              actionmailer (3.1.3)
                actionpack (= 3.1.3)
                mail (~> 2.3.0)
              actionpack (3.1.3)
                activemodel (= 3.1.3)
                activesupport (= 3.1.3)
                builder (~> 3.0.0)
                erubis (~> 2.7.0)
                i18n (~> 0.6)
                rack (~> 1.3.5)
                rack-cache (~> 1.1)
                rack-mount (~> 0.8.2)
                rack-test (~> 0.6.1)
                sprockets (~> 2.0.3)
              activemodel (3.1.3)
                activesupport (= 3.1.3)
                builder (~> 3.0.0)
                i18n (~> 0.6)
              activerecord (3.1.3)
                activemodel (= 3.1.3)
                activesupport (= 3.1.3)
                arel (~> 2.2.1)
                tzinfo (~> 0.3.29)
              activerecord-jdbc-adapter (1.2.1)
              activerecord-jdbcsqlite3-adapter (1.2.1)
                activerecord-jdbc-adapter (~> 1.2.1)
                jdbc-sqlite3 (~> 3.7.2)
              activeresource (3.1.3)
                activemodel (= 3.1.3)
                activesupport (= 3.1.3)
              activesupport (3.1.3)
                multi_json (~> 1.0)
              ansi (1.4.1)
              arel (2.2.1)
              bbenezech-nested_form (0.0.2)
              bcrypt-ruby (3.0.1-java)
              bootstrap-sass (1.4.3)
                sass-rails (~> 3.1)
              bouncy-castle-java (1.5.0146.1)
              builder (3.0.0)
              coffee-rails (3.1.1)
                coffee-script (>= 2.2.0)
                railties (~> 3.1.0)
              coffee-script (2.2.0)
                coffee-script-source
                execjs
              coffee-script-source (1.1.3)
              devise (1.5.3)
                bcrypt-ruby (~> 3.0)
                orm_adapter (~> 0.0.3)
                warden (~> 1.1)
              erubis (2.7.0)
              execjs (1.2.11)
                multi_json (~> 1.0)
              fastercsv (1.5.4)
              haml (3.1.4)
              hike (1.2.1)
              hpricot (0.8.5-java)
              i18n (0.6.0)
              jdbc-sqlite3 (3.7.2)
              jquery-rails (1.0.19)
                railties (~> 3.0)
                thor (~> 0.14)
              jruby-openssl (0.7.4)
                bouncy-castle-java
              json (1.6.3-java)
              kaminari (0.13.0)
                actionpack (>= 3.0.0)
                activesupport (>= 3.0.0)
                railties (>= 3.0.0)
              mail (2.3.0)
                i18n (>= 0.4.0)
                mime-types (~> 1.16)
                treetop (~> 1.4.8)
              mime-types (1.17.2)
              multi_json (1.0.4)
              orm_adapter (0.0.5)
              polyglot (0.3.3)
              rack (1.3.5)
              rack-cache (1.1)
                rack (>= 0.4)
              rack-mount (0.8.3)
                rack (>= 1.0.0)
              rack-pjax (0.5.5)
                hpricot (~> 0.8.4)
                rack (~> 1.3)
              rack-ssl (1.3.2)
                rack
              rack-test (0.6.1)
                rack (>= 1.0)
              rails (3.1.3)
                actionmailer (= 3.1.3)
                actionpack (= 3.1.3)
                activerecord (= 3.1.3)
                activeresource (= 3.1.3)
                activesupport (= 3.1.3)
                bundler (~> 1.0)
                railties (= 3.1.3)
              railties (3.1.3)
                actionpack (= 3.1.3)
                activesupport (= 3.1.3)
                rack-ssl (~> 1.3.2)
                rake (>= 0.8.7)
                rdoc (~> 3.4)
                thor (~> 0.14.6)
              rake (0.9.2.2)
              rdoc (3.11)
                json (~> 1.4)
              remotipart (1.0.1)
              sass (3.1.11)
              sass-rails (3.1.5)
                actionpack (~> 3.1.0)
                railties (~> 3.1.0)
                sass (~> 3.1.10)
                tilt (~> 1.3.2)
              sprockets (2.0.3)
                hike (~> 1.2)
                rack (~> 1.0)
                tilt (~> 1.1, != 1.3.0)
              therubyrhino (1.73.0)
              thor (0.14.6)
              tilt (1.3.3)
              treetop (1.4.10)
                polyglot
                polyglot (>= 0.3.1)
              turn (0.8.2)
                ansi (>= 1.2.2)
              tzinfo (0.3.31)
              uglifier (1.1.0)
                execjs (>= 0.3.0)
                multi_json (>= 1.0.2)
              warden (1.1.0)
                rack (>= 1.0)
          
          PLATFORMS
            java
          
          DEPENDENCIES
            activerecord-jdbcsqlite3-adapter
            coffee-rails (~> 3.1.1)
            devise
            fastercsv
            jquery-rails
            jruby-openssl
            rails (= 3.1.3)
            rails_admin!
            sass-rails (~> 3.1.5)
            therubyrhino
            turn (= 0.8.2)
            uglifier (>= 1.0.3)
          
          Thomas E Enebo added a comment -

          A workaround, and a pretty good hint at the problem, is to change your JVM's default encoding:

          jruby --1.9 -J-Dfile.encoding=UTF-8 -rubygems -e "require %Q{gherkin}"
          

          It appears Snakeyaml is constructed to use the default JVM encoding, which does not match what gherkin expects in its i18n.yml file (UTF-8). This also explains why this works on Linux (which defaults to UTF-8). I see a '# coding: UTF-8' in that yaml file... Is that something YAML parsers are supposed to honor?

          As an aside, if you are doing development on Windows and deployment on another OS, then you should consider setting your dev environment to the same encoding:

          JRUBY_OPTS="--1.9 -J-Dfile.encoding=UTF-8"
          

          ^--- if you use CMD then I am not sure this syntax is quite right but you get the idea...
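
To illustrate the mismatch described above, here is a minimal sketch in plain Ruby (nothing JRuby-specific). It assumes the platform default is Windows-1252, which is only one possible Windows default, and it uses Ruby's transcoder as a stand-in for the JVM's decoder, so treat it as an approximation of what the reader sees rather than a reproduction of the bug.

```ruby
# The UTF-8 bytes of U+201D (right double quotation mark) are E2 80 9D.
utf8_bytes = "\u201D".b

# Reinterpret those bytes as Windows-1252 (assumed platform default),
# substituting a replacement character for anything that has no mapping.
# Byte 0x9D is unassigned in Windows-1252, so it is replaced with U+FFFD.
misread = utf8_bytes.encode("UTF-8", "Windows-1252",
                            undef: :replace, invalid: :replace)

# Decoding the same bytes as UTF-8 round-trips cleanly.
correct = utf8_bytes.dup.force_encoding("UTF-8")

puts misread.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
puts correct.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
```

The first line printed contains U+FFFD, the character the YAML reader later rejects; the second contains only the original code point.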

          Hiro Asari added a comment -

          Aung, what's in C:/Users/nash/railsapp/config/environment.yml? I suspect there is a character in Unicode that snakeyaml can't handle, but only on Windows (obviously).

          The offending character here is \ufffd. This character causes a problem on non-Windows machines as well. On the Mac, for example:

          $ jruby --1.9 -ryaml -e 'YAML.load("\ufffd".to_yaml)'
          StreamReader.java:98:in `checkPrintable': unacceptable character '�' (0xFFFD) special characters are not allowed
          in "<reader>", position 4
          	from StreamReader.java:191:in `update'
          	from StreamReader.java:63:in `<init>'
          	from PsychParser.java:114:in `parse'
          	from PsychParser$INVOKER$i$1$0$parse.gen:65535:in `call'
          	from CachingCallSite.java:312:in `cacheAndCall'
          	from CachingCallSite.java:169:in `call'
          	from CallOneArgNode.java:57:in `interpret'
          	from NewlineNode.java:104:in `interpret'
          	from BlockNode.java:71:in `interpret'
          	from ASTInterpreter.java:75:in `INTERPRET_METHOD'
          	from InterpretedMethod.java:190:in `call'
          	from DefaultMethod.java:199:in `call'
          	from CachingCallSite.java:312:in `cacheAndCall'
          	from CachingCallSite.java:169:in `call'
          	from FCallOneArgNode.java:36:in `interpret'
          	from CallNoArgNode.java:63:in `interpret'
          	from LocalAsgnNode.java:123:in `interpret'
          	from NewlineNode.java:104:in `interpret'
          	from BlockNode.java:71:in `interpret'
          	from ASTInterpreter.java:75:in `INTERPRET_METHOD'
          	from InterpretedMethod.java:190:in `call'
          	from DefaultMethod.java:199:in `call'
          	from CachingCallSite.java:312:in `cacheAndCall'
          	from CachingCallSite.java:169:in `call'
          	from FCallOneArgNode.java:36:in `interpret'
          	from LocalAsgnNode.java:123:in `interpret'
          	from NewlineNode.java:104:in `interpret'
          	from BlockNode.java:71:in `interpret'
          	from ASTInterpreter.java:75:in `INTERPRET_METHOD'
          	from InterpretedMethod.java:190:in `call'
          	from DefaultMethod.java:199:in `call'
          	from CachingCallSite.java:312:in `cacheAndCall'
          	from CachingCallSite.java:169:in `call'
          	from -e:1:in `__file__'
          	from -e:-1:in `load'
          	from Ruby.java:731:in `runScript'
          	from Ruby.java:724:in `runScript'
          	from Ruby.java:631:in `runNormally'
          	from Ruby.java:480:in `runFromMain'
          	from Main.java:343:in `doRunFromMain'
          	from Main.java:255:in `internalRun'
          	from Main.java:221:in `run'
          	from Main.java:205:in `run'
          	from Main.java:185:in `main'
          

          This works fine on MRI (which may or may not be correct). According to snakeyaml source code (http://code.google.com/p/snakeyaml/source/browse/src/main/java/org/yaml/snakeyaml/reader/StreamReader.java#33)

          // NON_PRINTABLE changed from PyYAML: \uFFFD excluded because Java returns
          // it in case of data corruption

          Since we are not getting this error on non-Windows machines, it seems safe to assume that the YAML files themselves do not contain \ufffd. I don't know where it is coming from.

          I also note that MRI 'prints' non-printable Unicode characters escaped; e.g.,

          $ ruby2.0 -v -ryaml -e 'p YAML.load("\uffff".to_yaml)'
          ruby 2.0.0dev (2011-12-14 trunk 34032) [x86_64-darwin11.2.0]
          "\uFFFF"
          

          JRuby throws the exception under consideration.
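
The check that fails here can be sketched in Ruby as follows. The character class is the printable-character production from the YAML 1.1 specification with U+FFFD removed, to mirror the snakeyaml comment quoted above. The constant and method names are hypothetical, and this is an illustration of the idea, not snakeyaml's actual code.

```ruby
# Anything outside this set is treated as non-printable. U+FFFD is left out
# on purpose, following the snakeyaml source comment (assumption: the rest
# of the set matches the YAML 1.1 c-printable production).
NON_PRINTABLE =
  /[^\t\n\r\u0020-\u007E\u0085\u00A0-\uD7FF\uE000-\uFFFC\u{10000}-\u{10FFFF}]/

# Hypothetical helper: returns nil when every character is acceptable,
# otherwise the position and code point of the first offender.
def first_unprintable(text)
  index = text.index(NON_PRINTABLE)
  return nil unless index

  { position: index, codepoint: format("0x%04X", text[index].ord) }
end

p first_unprintable("feature: \u529F\u80FD")
p first_unprintable("a: \uFFFD")
p first_unprintable("a: \uFFFF")
```

Ordinary CJK text passes, while both \ufffd and \uffff are reported along with their position in the string.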

          Hiro Asari added a comment -

          Oh, I see. So, that's how Windows is throwing \ufffd at snakeyaml.

          I confirmed that

          set JRUBY_OPTS="--1.9 -J-Dfile.encoding=UTF-8"
          

          does indeed do the trick on CMD on Windows.

          In the case of gherkin, the file in question, i18n.yml has the magic comment:

          # encoding: UTF-8
          

          but we are not respecting it. (I doubt this is part of the YAML specification.)

          A few additional questions:

          1. Should we read this comment and set the encoding accordingly?
          2. Should we set the default encoding on Windows to UTF-8 (or some other reasonable value) in the 1.9 mode?
          3. Should we anticipate this error condition in snakeyaml and give a more intelligent error message (on all platforms)?
          Thomas E Enebo added a comment -

          I glanced at psych and it sort of looks like all YAML files should be treated as UTF-8. If you have an internal representation different than UTF-8 it should transcode to that encoding. So this might be as simple as letting Snakeyaml always parse as UTF-8? If that is true though then why isn't it already doing that?

          I also have a possible theory on the #coding line. vi and emacs will be able to use that for editing the file...

          Thomas E Enebo added a comment -

          Looks like the supported encodings are ANY, UTF-8, UTF-16BE, UTF-16LE. I am hoping ANY means one of these three.
          Our parser wraps an IO in an InputStream, which is wrapped by a Reader. The data itself comes from read calls which return an int. The IO is likely using the default internal encoding and is emitting characters in that encoding. Boom.

          Charles Oliver Nutter added a comment -

          I have a temporary fix for the IO case...basically forcing the InputStreamReader to assume UTF-8:

          system ~/projects/jruby $ jruby --1.9 -e "require 'yaml'; puts YAML.load(open('lib/ruby/gems/1.8/gems/gherkin-2.7.1-java/lib/gherkin/i18n.yml'))['zh-CN']['feature']"
          功能
          

          Trying to find a good case for the String version. 0xfffd does indeed fail, but is it a proper UTF-8 character?
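
At the Ruby level, the same idea (pin the reader to UTF-8 instead of trusting the platform default) can be sketched like this. The helper name and the sample YAML are made up for the example; the real fix described above is inside JRuby's parser, not in user code.

```ruby
require "tempfile"
require "yaml"

# Hypothetical helper: always decode the file as UTF-8, whatever the
# process-wide default external encoding happens to be.
def load_yaml_utf8(path)
  File.open(path, "r:UTF-8") { |io| YAML.load(io.read) }
end

# Stand-in for gherkin's i18n.yml; the content is invented for the demo.
sample = "zh-CN:\n  feature: \u529F\u80FD\n"

Tempfile.create(["i18n", ".yml"]) do |file|
  file.binmode
  file.write(sample.b) # write the raw UTF-8 bytes
  file.flush

  puts load_yaml_utf8(file.path)["zh-CN"]["feature"]
end
```

Application code can use the same pattern as a stopgap on affected versions, in addition to the -Dfile.encoding workaround mentioned earlier.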

          Hiro Asari added a comment -

          \uFFFD is most definitely a valid UTF-8 character. See http://www.fileformat.info/info/unicode/char/fffd/index.htm and http://stackoverflow.com/questions/3526965/unicode-issue-with-an-html-title-question-mark-65533

          Charles Oliver Nutter added a comment -

          Ok...I've pushed the IO fix to master and jruby-1_6. Will see about the String case.

          Aung Maw added a comment -

          Hi Hiro, This is what I have in environment.rb. No special characters. Thanks.

          # Load the rails application
          require File.expand_path('../application', __FILE__)
          $CLASSPATH << "file:///#{Rails.root}/lib/sqljdbc4.jar"

          # Initialize the rails application
          Globalidm3::Application.initialize!
          Hiro Asari added a comment -

          Thank you, Aung. I believe we have identified the root cause and are working towards a resolution now.

          Charles Oliver Nutter added a comment -

          I believe the String case may also be fixed right now. See the following, where I first use an IO and then use an already-read String:

          system ~/projects/jruby $ jruby --1.9 -e "require 'yaml'; puts YAML.load(open('lib/ruby/gems/shared/gems/gherkin-2.7.2-java/lib/gherkin/i18n.yml'))['zh-CN']['feature']"
          功能
          
          system ~/projects/jruby $ jruby --1.9 -e "require 'yaml'; puts YAML.load(File.read('lib/ruby/gems/shared/gems/gherkin-2.7.2-java/lib/gherkin/i18n.yml'))['zh-CN']['feature']"
          功能
          

          The additional case Hiro found with \ufffd may be a separate issue...in fact, it may be that YAML disallows this character in the YAML stream. I will open a separate bug for investigating that.

          Charles Oliver Nutter added a comment -

          Oh, FWIW, both of those did output the proper characters to my console...JIRA just mucked them up.

          Charles Oliver Nutter added a comment -

          Another note: In another bug I discovered that the Gherkin spec was actually failing because it wasn't getting set to the proper encoding by Zlib::GzipReader. As a result, we attempted to transcode it to a Java String (UTF-16BE) from ASCII-8BIT, which mangled the characters in Aslak's last name. Fixing that (773a155) cleaned up remaining String decoding issues. So that wasn't directly an issue with PsychParser, but it has been fixed as well.
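
The effect of transcoding untagged bytes can be shown with a small plain-Ruby sketch. The sample string is an assumed stand-in for the name mentioned above, and MRI raises an error where the comment describes JRuby mangling the characters, so this only demonstrates the underlying problem: ASCII-8BIT data carries no information about what its high bytes mean.

```ruby
# UTF-8 bytes with no encoding information attached (ASCII-8BIT), standing
# in for data returned by a reader that did not tag its output.
raw = "Helles\u00F8y".b

# Transcoding straight from ASCII-8BIT cannot work: the bytes above 0x7F
# have no defined meaning in that encoding.
begin
  raw.encode("UTF-16BE")
  outcome = :converted
rescue EncodingError
  outcome = :failed
end

# Tagging the bytes as UTF-8 first makes the conversion well defined.
tagged = raw.dup.force_encoding("UTF-8")
utf16 = tagged.encode("UTF-16BE")
round_trip = utf16.encode("UTF-8")

puts outcome
puts round_trip
```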


            People

            • Assignee:
              Charles Oliver Nutter
              Reporter:
              Patrick Ma
            • Votes:
              2
              Watchers:
              4
