
Key: JRUBY-867
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Ola Bini
Reporter: Nick Plante
Votes: 0
Watchers: 1
JRuby

Iconv character set option //translit is not supported

Created: 23/Apr/07 12:13 AM   Updated: 23/Apr/08 10:02 AM
Component/s: Core Classes/Modules
Affects Version/s: JRuby 0.9.8
Fix Version/s: JRuby 1.1RC2

Time Tracking:
Not Specified

Environment: OS X, 10.4.9 Intel core 2 duo (macbook pro)


Description
When converting from 'ascii//ignore//translit' to some other character set, Iconv under JRuby doesn't seem to recognize the translit (allow transliteration) and ignore (ignore conversion errors) options, which are appended to the character set string as demonstrated. It raises java.nio.charset.IllegalCharsetNameException. See stack trace:

java.nio.charset.IllegalCharsetNameException: ascii//ignore//translit

Charset.java:285:in `java.nio.charset.Charset.checkName'
Charset.java:459:in `java.nio.charset.Charset.lookup2'
Charset.java:438:in `java.nio.charset.Charset.lookup'
Charset.java:497:in `java.nio.charset.Charset.forName'
Charset.java:285:in `java.nio.charset.Charset.checkName'
Charset.java:459:in `java.nio.charset.Charset.lookup2'
Charset.java:438:in `java.nio.charset.Charset.lookup'
Charset.java:497:in `java.nio.charset.Charset.forName'
Charset.java:285:in `java.nio.charset.Charset.checkName'
Charset.java:459:in `java.nio.charset.Charset.lookup2'
Charset.java:438:in `java.nio.charset.Charset.lookup'
Charset.java:497:in `java.nio.charset.Charset.forName'
Charset.java:285:in `java.nio.charset.Charset.checkName'
Charset.java:459:in `java.nio.charset.Charset.lookup2'
Charset.java:438:in `java.nio.charset.Charset.lookup'
Charset.java:497:in `java.nio.charset.Charset.forName'
Charset.java:285:in `java.nio.charset.Charset.checkName'
Charset.java:459:in `java.nio.charset.Charset.lookup2'
Charset.java:438:in `java.nio.charset.Charset.lookup'
Charset.java:497:in `java.nio.charset.Charset.forName'
app/models/question.rb:4:in `call'
app/controllers/questions_controller.rb:58:in `transaction'
app/controllers/questions_controller.rb:58:in `create'
app/controllers/questions_controller.rb:60:in `create'

The same Iconv method call works fine under the standard Ruby 1.8.5 VM.



Comments
Charles Oliver Nutter added a comment - 23/Apr/07 12:16 AM
The problem here is that we aren't handling this form of encoding string at all. This is how iconv specifies the settings for a conversion, much like Java's charset stuff does. We just need to find all the options, handle the parsing, and set up the charset appropriately.

Finish for 1.0...shouldn't be a huge deal now that we know about it, but any information on the various iconv settings would speed the process along.
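
The parsing step mentioned above could look roughly like the following. This is only a sketch, not the code that went into JRuby: the helper name `parse_iconv_charset` and the shape of its return value are assumptions made for illustration, and it only recognizes the two options named in this report.

```ruby
# Hypothetical helper (not part of JRuby or Iconv): split an iconv-style
# charset string into the charset name and its "//"-separated options.
def parse_iconv_charset(spec)
  name, *options = spec.split("//")
  # Drop empty segments and compare options case-insensitively.
  flags = options.reject(&:empty?).map(&:downcase)
  {
    name: name,
    ignore: flags.include?("ignore"),
    translit: flags.include?("translit")
  }
end

p parse_iconv_charset("ascii//ignore//translit")
```

Only the `name` part would then be handed to the charset lookup, which avoids the IllegalCharsetNameException shown in the stack trace.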


Thomas E Enebo added a comment - 05/May/07 06:32 PM
Bumping (to be done with other iconv issues)

Thomas E Enebo added a comment - 12/May/07 06:14 PM
soichiro ohba added support for ignore. Transliteration appears to be a tougher problem. Marking as 1.x feature since we will not get this done before final 1.0 release (unless someone knows an easy way to add transliteration)
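
For readers unfamiliar with the ignore option, its effect (dropping characters that cannot be converted) can be approximated in later Ruby versions with `String#encode`. This is an illustration of the behaviour only, assuming Ruby 1.9 or newer; it is not how the ignore support was added to JRuby's Iconv, and `convert_ignoring_errors` is a made-up name.

```ruby
# Approximate iconv's ignore option with String#encode: characters that
# are invalid or have no mapping in the target charset are dropped.
def convert_ignoring_errors(text, to)
  text.encode(to, invalid: :replace, undef: :replace, replace: "")
end

puts convert_ignoring_errors("caf\u00e9", "US-ASCII")  # prints "caf"
```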

Charles Oliver Nutter added a comment - 30/Sep/07 11:52 AM
Revisit for 1.1. Maybe not fixed, but we should give it a go.

Thomas E Enebo added a comment - 30/Sep/07 01:39 PM
Unless we can find an outside library which does transliteration, I don't think we can do this since Java does not support it.

Charles Oliver Nutter added a comment - 30/Sep/07 04:50 PM
Is any additional work needed to make the iconv string at least not blow up? Perhaps we could ignore the translit part or display a warning and continue.

Koichiro Ohba added a comment - 20/Dec/07 08:13 PM
How about the use of ICU4J?
http://www.icu-project.org/index.html

Charles Oliver Nutter added a comment - 20/Dec/07 08:26 PM
Very interesting Koichiro...I did not know this project existed. Would it provide the missing functionality we need? Is there any down side to using it instead of the Java Charset classes?

Koichiro Ohba added a comment - 21/Dec/07 07:33 AM
Yes, ICU4J would provide the translit functionality to iconv on JRuby.
(For further information, see http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/Transliterator.html )

But there is actually a down side: it's huge (icu4j.jar is 4.4MB, icu4j-charsets.jar is 2.4MB).
It might be overkill to include such a large library just for the translit functionality, which is arguably rarely used.
I think we can just avoid throwing fatal errors and leave translit unsupported.


Charles Oliver Nutter added a comment - 21/Dec/07 04:16 PM
I would be satisfied if we can fix it to just ignore the translit and perhaps warn. If anyone complains about translit missing we'll bother them to write us a gem-based Iconv extension with ICU4J.
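
A minimal sketch of the "ignore the translit and perhaps warn" approach, written in plain Ruby for illustration. The helper name `strip_translit` and the warning text are assumptions, not what was committed to JRuby.

```ruby
# Hypothetical helper: remove the unsupported translit option from an
# iconv-style charset string, warn about it, and return what is left.
def strip_translit(spec)
  name, *options = spec.split("//")
  present = options.reject(&:empty?)
  kept = present.reject { |opt| opt.casecmp("translit").zero? }
  if kept.length != present.length
    warn "Iconv: //translit is not supported and will be ignored"
  end
  ([name] + kept).join("//")
end

puts strip_translit("ascii//ignore//translit")  # prints "ascii//ignore"
```

The remaining string still carries the ignore option, so it would need the usual option parsing before any charset lookup.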

Ola Bini added a comment - 09/Feb/08 08:09 AM
Have added tests to trunk to make sure that we actually ignore these values.