Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: JRuby 1.6.5
- Fix Version/s: JRuby 1.7.0.pre2
- Component/s: Ruby 1.9.2, Ruby 1.9.3
- Labels: None
Description
As shown in the comments for JRUBY-6200, JRuby's "Psych" wrapper around SnakeYAML can't handle some Unicode characters. This may be a SnakeYAML bug, or something in the YAML spec that we don't know about:
system ~/projects/jruby $ jruby --1.9 -ryaml -e 'YAML.load("\ufffd".to_yaml)'
StreamReader.java:98:in `checkPrintable': unacceptable character '�' (0xFFFD) special characters are not allowed
in "<reader>", position 4
from StreamReader.java:191:in `update'
from StreamReader.java:63:in `<init>'
from PsychParser.java:115:in `parse'
from PsychParser$INVOKER$i$1$0$parse.gen:65535:in `call'
I'm filing this because the original cases in JRUBY-6200 are resolved, and this seems to be a separate issue that won't affect most people.
Issue Links
- relates to JRUBY-6200: [1.9] Loading some Unicode characters results in non-printable characters on Windows
My understanding is that this is by design in SnakeYAML. SnakeYAML follows PyYAML and raises an error when it sees "nonprintable" characters.
http://code.google.com/p/snakeyaml/source/browse/src/main/java/org/yaml/snakeyaml/reader/StreamReader.java#33
As I noted in JRUBY-6200, MRI dutifully prints any Unicode character we throw at it.
$ ruby2.0 -v -ryaml -e 'p YAML.load("\ufffe".to_yaml)'
ruby 2.0.0dev (2011-12-31 trunk 34165) [x86_64-darwin11.2.0]
"\uFFFE"
It seems to me that the best we can do is to catch org.yaml.snakeyaml.reader.ReaderException and inform the user of the problem (noting that \ufffd may be an indication that the JVM might have thrown an exception). To print these characters which SnakeYAML deems nonprintable, I think we would need to parse the YAML input ourselves before we pass it off to SnakeYAML; this defeats the purpose of using SnakeYAML in the first place.
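To illustrate the "inform the user" idea, here is a minimal Ruby sketch that scans a string for characters outside the printable set and reports where they are. It is not JRuby's or SnakeYAML's actual code: the helper names (yaml_printable?, non_printable_chars, check_yaml_input!) are hypothetical, and the ranges are taken from the c-printable production of the YAML spec, not from SnakeYAML's own pattern, which may differ. By that production U+FFFD is printable and U+FFFE is not, so the rejection of 0xFFFD in the trace above would be stricter than the spec's set.

```ruby
# Hypothetical helpers; not part of JRuby, Psych, or SnakeYAML.
# Ranges follow the YAML spec's c-printable production:
#   #x9 | #xA | #xD | [#x20-#x7E] | #x85 | [#xA0-#xD7FF]
#   | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
def yaml_printable?(codepoint)
  codepoint == 0x9 || codepoint == 0xA || codepoint == 0xD ||
    (0x20..0x7E).cover?(codepoint) ||
    codepoint == 0x85 ||
    (0xA0..0xD7FF).cover?(codepoint) ||
    (0xE000..0xFFFD).cover?(codepoint) ||
    (0x10000..0x10FFFF).cover?(codepoint)
end

# Returns a description of every character outside the printable set,
# with its character index in the input string.
def non_printable_chars(text)
  text.each_char.with_index
      .reject { |ch, _| yaml_printable?(ch.ord) }
      .map { |ch, idx| format("U+%04X at index %d", ch.ord, idx) }
end

# Raises a descriptive error before the text reaches the YAML parser;
# returns the text unchanged when every character is printable.
def check_yaml_input!(text)
  bad = non_printable_chars(text)
  return text if bad.empty?

  raise ArgumentError,
        "input contains characters outside the YAML printable set: #{bad.join(', ')}"
end

p non_printable_chars("a\ufffd") # => []
p non_printable_chars("a\ufffe") # => ["U+FFFE at index 1"]
```

A check like this only improves the error message. It does not make the parser accept the characters, so it does not replace the pre-parsing step discussed above.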