Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 2.0-beta-6
- Component/s: encoding
- Labels: None
Description
There are various encoding problems with InputStream and XML handling in different components.
- Property resource files are encoded in UTF-8, but Java reads the bundle as ISO 8859-1.
- In different components, a Reader is constructed with the default system encoding.
- MXParser ignores the encoding attribute in the XML declaration.
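The Reader problem in the list above can be illustrated with a minimal sketch. It assumes a stream that is known to contain UTF-8 bytes; the class name `ExplicitEncodingReader` and the helper `readAll` are hypothetical and do not come from any of the affected components.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class ExplicitEncodingReader {

    // Decodes the whole stream with the charset the caller names,
    // instead of relying on the platform default encoding.
    static String readAll(InputStream in, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Three Japanese characters, written as escapes so the source stays ASCII.
        byte[] utf8 = "\u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8);

        String decodedAsUtf8 = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String decodedAsLatin1 = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        // The wrong charset turns each UTF-8 byte into its own character.
        System.out.println(decodedAsUtf8.length());
        System.out.println(decodedAsLatin1.length());
    }
}
```

Decoding with the wrong charset does not throw an error; it silently yields a longer, garbled string, which is part of why these problems go unnoticed on systems whose default encoding happens to match.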
Issue Links
- depends upon
  - DOXIA-60: Use a external XML Pull parser instead of plexus one
- is related to
  - MSITE-123: Output encoding is UTF-8 despite outputEncoding is set to ISO-8859-1
  - DOXIA-119: How to deal with encoding and documentation
- relates to
  - MSITE-239: encoding declaration in site.xml is ignored
  - PLXUTILS-11: MXParser can't handle the encoding declaration in XML declaration
This issue currently appears for a Japanese translation, and possibly for other East Asian languages (CJK charsets).
The PropertyResourceBundle class uses Properties internally, so the ISO 8859-1 character encoding is used to load properties.
Have a look at the API:
http://java.sun.com/j2se/1.4.2/docs/api/java/util/PropertyResourceBundle.html
http://java.sun.com/j2se/1.4.2/docs/api/java/util/Properties.html
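The ISO 8859-1 behaviour can be reproduced with a short sketch. It assumes a properties file saved as UTF-8; the key `report.title` and the class `Utf8PropertiesDemo` are hypothetical. Note that `Properties.load(Reader)` was only added in Java 6, so it is not part of the 1.4.2 API linked above, and this is not the approach taken in the attached diffs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, for illustration only.
final class Utf8PropertiesDemo {

    // Properties.load(InputStream) always decodes the bytes as ISO 8859-1.
    static Properties loadFromStream(byte[] data) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(data));
        return props;
    }

    // Properties.load(Reader) (Java 6+) lets the caller choose the charset.
    static Properties loadAsUtf8(byte[] data) throws IOException {
        Properties props = new Properties();
        props.load(new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // A UTF-8 encoded bundle with one Japanese value (four katakana characters).
        String expected = "\u30ec\u30dd\u30fc\u30c8";
        byte[] bundle = ("report.title=" + expected + "\n").getBytes(StandardCharsets.UTF_8);

        String viaStream = loadFromStream(bundle).getProperty("report.title");
        String viaReader = loadAsUtf8(bundle).getProperty("report.title");

        System.out.println(expected.equals(viaStream)); // garbled by the ISO 8859-1 decoding
        System.out.println(expected.equals(viaReader)); // decoded with the declared charset
    }
}
```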
So I propose to fix plexus-i18n and use it instead of the ResourceBundle.getBundle() calls (specifically, I think, in the maven-project-info-reports-plugin subproject). See plexus-i18n.diff.
Another solution could be to run native2ascii on each bundle, but IMHO the result is not really human readable.
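To show what the native2ascii alternative looks like in practice, here is a rough sketch that escapes every non-ASCII character as a `\uXXXX` sequence, which is approximately the kind of output native2ascii produces. The class `AsciiEscapeDemo`, its methods, and the key `report.title` are hypothetical; this is not the tool's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, for illustration only.
final class AsciiEscapeDemo {

    // Replaces each non-ASCII char with a backslash-u escape of four hex digits.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Loads properties text through the ISO 8859-1 based stream API.
    static Properties load(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(propertiesText.getBytes(StandardCharsets.ISO_8859_1)));
        return props;
    }

    public static void main(String[] args) throws IOException {
        String value = "\u30ec\u30dd\u30fc\u30c8";
        String escaped = escapeNonAscii("report.title=" + value);

        // The escaped line is pure ASCII, so it is safe to print on any console.
        System.out.println(escaped);
        // Properties decodes the escapes back to the original characters.
        System.out.println(value.equals(load(escaped).getProperty("report.title")));
    }
}
```

The escaped file loads correctly everywhere, but a translator can no longer read or edit the Japanese text directly, which is the readability concern raised above.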
Have a look at plexus-utils.diff and plexus-site-renderer.diff.
Another issue could be in the toString() method of the Xpp3Dom class: we need to add a default encoding. See plexus-utils_2.diff.
http://svn.apache.org/repos/asf/ant/core/trunk/src/main/org/apache/tools/ant/filters/StringInputStream.java
Charset problems are hard to debug and depend on several factors.
Other ideas are welcome.