jira.codehaus.org

Maven 2 & 3
  • MNG-2932

Encoding chaos

Details

  • Type: Bug
  • Status: Closed
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 2.0.4, 2.0.5, 2.0.6
  • Fix Version/s: 2.0.8
  • Component/s: POM::Encoding
  • Labels:
    None
  • Environment:
    windows, linux
  • Complexity:
    Intermediate
  • Number of attachments:
    0

Description

I tried Maven on a project whose Javadocs, xdocs, and POM comments are written in a native language with many non-ASCII characters.
This revealed that Maven does not handle different encodings cleanly.

For instance, the xdocs are XML, and XML allows different encodings as long as they are properly declared in the XML declaration. However, it only works if I encode the XML as UTF-8. If I use ISO-8859-1, the produced HTML contains UTF-8 characters from the localized site messages (the resource bundles of the Maven plugins), and Maven dumps the ISO-8859-1 encoded characters into that output, ending up with mixed encodings in one HTML page.
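To illustrate the expected behaviour, here is a minimal sketch (not Maven or Doxia code) that uses the JDK's built-in DOM parser. The class name `XmlDeclaredEncoding` and the method `rootText` are hypothetical names chosen for this example. When the parser is handed raw bytes, it reads the encoding from the XML declaration itself, so an ISO-8859-1 document is decoded correctly without any external encoding setting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class, not part of Maven or Doxia.
class XmlDeclaredEncoding {

    // Parses raw bytes and returns the text of the root element.
    // The parser picks the encoding from the XML declaration in the bytes.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The document declares ISO-8859-1 and is encoded as ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><p>J\u00f6rg</p>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(latin1).equals("J\u00f6rg"));
    }
}
```

The point of the sketch is only that the declared encoding can be honoured when the bytes reach the parser untouched.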

Additionally, the Java files also cause trouble when I use an encoding other than UTF-8. I set the "encoding" parameter of the javadoc plugin to ISO-8859-1 and used Java files in that encoding. The resulting Javadoc HTML was written in ISO-8859-1, but the browser displayed it as UTF-8, and I had to switch explicitly to ISO-8859-1 in Firefox to have the special characters displayed properly.
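The display problem can be reproduced outside the browser. The following is a small sketch with a hypothetical class `MisdecodedBytes`: it encodes a string as ISO-8859-1 and then decodes the same bytes once with the correct charset and once as UTF-8, which is roughly what a browser does when it assumes the wrong encoding for a page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of Maven or the javadoc plugin.
class MisdecodedBytes {

    // Decodes bytes with the charset the reader assumes, right or wrong.
    static String decode(byte[] bytes, Charset assumed) {
        return new String(bytes, assumed);
    }

    public static void main(String[] args) {
        String original = "J\u00f6rg";
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);

        // Correct assumption: the text round-trips unchanged.
        System.out.println(decode(latin1, StandardCharsets.ISO_8859_1).equals(original));

        // Wrong assumption: the byte for U+00F6 is not valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        System.out.println(decode(latin1, StandardCharsets.UTF_8).indexOf('\uFFFD') >= 0);
    }
}
```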

Further, I run into trouble when I use special characters in pom.xml files that end up on the generated website. In the end I could not find a way to produce a site without problems, even when I encoded everything as UTF-8.

Maybe too few of the developers involved come from non-English-speaking countries and are used to thinking beyond US-ASCII.

Unfortunately I cannot tell where the problems come from: it may be XPP, Doxia, the site plugin, individual reports, or all of them together.
You need to distinguish properly between input and output encoding, be extremely careful with things like byte[],
and never parse XML from strings.
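As a rough sketch of what separating input and output encoding means in Java (the class `Transcode` and its method are hypothetical, not taken from Maven): every conversion between byte[] and String names its charset explicitly, so the platform default never leaks in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of Maven.
class Transcode {

    // Decodes with the input encoding, then encodes with the output encoding.
    // Neither step relies on the platform default charset.
    static byte[] transcode(byte[] input, Charset inputEncoding, Charset outputEncoding) {
        String text = new String(input, inputEncoding);
        return text.getBytes(outputEncoding);
    }

    public static void main(String[] args) {
        // U+00F6 is one byte in ISO-8859-1 and two bytes in UTF-8.
        byte[] latin1 = "\u00f6".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(latin1.length + " -> " + utf8.length);
    }
}
```

Calls such as `new String(bytes)` or `text.getBytes()` without a charset argument use the platform default, which is one way mixed encodings can end up in a single output file.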

Can you reproduce the problem or do you need dummy projects as test-cases?

Issue Links

depends upon

  • DOXIA-133 (Bug, Major, Closed): default XML encoding (UTF-8) or XML encoding set in XML files is ignored: inputEncoding is used instead

duplicates

  • MNG-2254 (Bug, Major, Closed): the encoding parameter in xml declaration of POM is ignored

relates to

  • MNG-2216 (Improvement, Major, Open): Add default encodings section to POM

People

  • Assignee:
    Herve Boutemy
  • Reporter:
    Jörg Hohwiller

Dates

  • Created:
    05/Apr/07 3:47 AM
  • Updated:
    04/Dec/07 1:25 PM
  • Resolved:
    17/Oct/07 4:13 PM