JRuby (please use github issues at http://bugs.jruby.org)
  JRUBY-6548

REXML error when reading files containing ISO-8859-1 encoded data

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.6.7
    • Fix Version/s: JRuby 1.7.0.pre2
    • Component/s: Standard Library
    • Labels:
      None
    • Environment:
      Arch Linux
      OpenJDK 1.7
      JRuby 1.6.7
    • Testcase included:
      yes
    • Patch Submitted:
      Yes
    • Number of attachments:
      3

      Description

      The attached script parses the attached XML file, which is encoded in ISO-8859-1. This triggers the iconv-based decoding in rexml/encoding.rb:36.

      ICONV.rb is implemented as listed below. As far as I can tell this can never work: @encoding is never set in ICONVEncoder, so it is always nil, and passing nil to Iconv.conv causes the TypeError listed below.

      The version of ICONV.rb that is included in JRuby does not match the one from http://www.germane-software.com/software/rexml/. The only reference I could find to the difference between the two is https://gist.github.com/1420725. The change applied there is what introduced this error. Before that patch, the iconv decoding methods were defined in the REXML::Encoding module, which does have the @encoding field.

      I've attached a possible fix in the form of a patched version of ICONV.rb.

      ICONV.rb

      class ICONVEncoder
        def decode(str)
          Iconv.conv(UTF_8, @encoding, str)
        end
            
        def encode(content)
          Iconv.conv(@encoding, UTF_8, content)
        end
      end
      
      iconv = ICONVEncoder.new
      register("ICONV") do |obj|
        Iconv.conv(UTF_8, obj.encoding, nil)
        obj.encoder = iconv
      end
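
      The failure mode described above can be reproduced in isolation. The following is only a sketch: FakeIconv is a hypothetical stand-in for Iconv.conv (so the example runs without the iconv library), and BrokenEncoder is an illustrative name, not the class shipped with JRuby. It assumes, as the stack trace suggests, that the converter rejects a nil encoding name with a TypeError.

```ruby
# Hypothetical stand-in for Iconv.conv, so the sketch runs without iconv.
module FakeIconv
  def self.conv(to, from, str)
    # Assumption: like the real converter, require String encoding names.
    [to, from].each do |name|
      unless name.is_a?(String)
        raise TypeError, "can't convert #{name.class} into String"
      end
    end
    str.encode(to, from)
  end
end

# Mirrors the shape of the listed ICONVEncoder: @encoding is never assigned.
class BrokenEncoder
  def encode(content)
    # An unassigned instance variable evaluates to nil in Ruby.
    FakeIconv.conv(@encoding, "UTF-8", content)
  end
end

message = nil
begin
  BrokenEncoder.new.encode("caf\u00E9")
rescue TypeError => e
  message = e.message
end

puts message
```

      Because no code path ever assigns @encoding on the encoder object, the call fails regardless of which encoding the document declares.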
      

      TypeError that is triggered during decoding

      TypeError: can't convert NilClass into String
         initialize at org/jruby/RubyIconv.java:207
               conv at org/jruby/RubyIconv.java:391
             encode at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/encodings/ICONV.rb:12
          encoding= at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/source.rb:55
         initialize at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/source.rb:45
         initialize at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/source.rb:160
        create_from at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/source.rb:16
            stream= at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/parsers/baseparser.rb:121
         initialize at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/parsers/baseparser.rb:110
         initialize at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/parsers/treeparser.rb:9
              build at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/document.rb:227
         initialize at /opt/jruby-1.6.7/lib/ruby/1.8/rexml/document.rb:43
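
      One way to avoid the nil encoding is to give each encoder its own encoding when it is created. The sketch below is not the attached patch and not the fix that went into JRuby; EncoderWithState is a hypothetical name, and it uses String#encode (Ruby 1.9 and later) in place of Iconv so that it runs on a current Ruby.

```ruby
# Hypothetical encoder that carries its own encoding instead of relying on
# an instance variable that belongs to another object.
class EncoderWithState
  def initialize(encoding)
    @encoding = encoding
  end

  # Interpret the raw bytes as @encoding and transcode them to UTF-8.
  def decode(str)
    str.dup.force_encoding(@encoding).encode("UTF-8")
  end

  # Transcode UTF-8 content back to @encoding.
  def encode(content)
    content.encode(@encoding)
  end
end

encoder = EncoderWithState.new("ISO-8859-1")
raw     = "caf\xE9".b            # "café" as ISO-8859-1 bytes
decoded = encoder.decode(raw)    # UTF-8 string
encoded = encoder.encode(decoded)

puts decoded
puts encoded.bytes.inspect
```

      With this shape, the registration block would create one encoder per source object rather than sharing a single stateless instance.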
      
      1. ICONV.rb
        0.5 kB
        Pepijn Van Eeckhoudt
      2. test.rb
        0.1 kB
        Pepijn Van Eeckhoudt
      3. test.xml
        0.0 kB
        Pepijn Van Eeckhoudt

        Activity

        Bill DePhillips added a comment -

        I saw this same bug. As a workaround, I removed the encoding="ISO-8859" declaration from the XML prolog.

        Pepijn Van Eeckhoudt added a comment -

        I think this is resolved by the fix for JRUBY-6517

        Charles Oliver Nutter added a comment -

        Working on master.


          People

          • Assignee:
            Charles Oliver Nutter
            Reporter:
            Pepijn Van Eeckhoudt
          • Votes:
            0
            Watchers:
            4
