JRuby (please use GitHub issues at http://bugs.jruby.org)
  JRUBY-5555

Do a better job of determining "locale" encoding when default charset's name is not in our encoding tables

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: JRuby 1.6RC2
    • Fix Version/s: None
    • Component/s: Ruby 1.9.2
    • Labels:
      None
    • Number of attachments :
      0

      Description

      See JRUBY-5525. There, the original problem was that if the default Java Charset's name was not found in our encoding tables (EncodingDB), we would throw an NPE. My fix was to fall back to the 1.9-mode "locale" encoding, which ultimately just uses ASCII-8BIT when the lookup of the default charset's name fails.

      We should endeavor to do better than falling back to ASCII-8BIT when we can't match Java's default Charset to an Encoding.

        Activity

        Charles Oliver Nutter added a comment -

        For the MacRoman and MacCentralEurope cases, I'm not sure we have much recourse. The aliases they present are all still "Mac" charsets:

        ~/projects/jruby ➔ java -Dfile.encoding=MacCentralEurope -jar lib/jruby.jar -rjava -e "puts java.nio.charset.Charset.defaultCharset().aliases().to_a"
        MacCentralEurRoman
        x-mac-centraleurope
        MacCentralEuropean
        x-mac-centraleurroman
        x-mac-ce
        MacCentralEurope
        x-mac-centraleuropean
        
        ~/projects/jruby ➔ java -Dfile.encoding=MacRoman -jar lib/jruby.jar -rjava -e "puts java.nio.charset.Charset.defaultCharset().aliases().to_a"
        MacRoman
        mac
        csMacintosh
        x-MacRoman
        x-mac-roman
        

        I'd like to get some charsets from other locales and platforms to see if it's worth trying to use aliases to do a second search of encodings.
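
        A rough sketch of what such a second search might look like, written in plain Ruby rather than against JRuby's internals. The helper names find_encoding and locale_encoding_for are hypothetical, and Encoding.find stands in for the EncodingDB lookup; this only illustrates the idea of trying the charset's name first, then each alias, and falling back to ASCII-8BIT if nothing matches:

```ruby
# Hypothetical helper: returns the Encoding for a name, or nil when the
# name is unknown (Encoding.find raises ArgumentError in that case).
def find_encoding(name)
  Encoding.find(name)
rescue ArgumentError
  nil
end

# Hypothetical helper: tries the charset's canonical name first, then each
# alias in order, and only falls back to ASCII-8BIT if none of them match.
def locale_encoding_for(charset_name, aliases = [])
  [charset_name, *aliases].each do |candidate|
    encoding = find_encoding(candidate)
    return encoding if encoding
  end
  Encoding::ASCII_8BIT
end

# Under JRuby, the inputs could presumably come from the default charset:
#   require 'java'
#   charset = java.nio.charset.Charset.defaultCharset
#   locale_encoding_for(charset.name, charset.aliases.to_a)
```

        Whether this helps depends on the aliases. For the two Mac charsets above it only works if at least one of the listed alias names is known to the encoding tables.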


          People

          • Assignee:
            Thomas E Enebo
            Reporter:
            Charles Oliver Nutter
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated: