Maven 2 & 3
  1. Maven 2 & 3
  2. MNG-2254

the encoding parameter in xml declaration of POM is ignored

    Details

    • Type: Bug Bug
    • Status: Closed Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.8
    • Component/s: POM::Encoding
    • Labels:
      None
    • Complexity:
      Intermediate
    • Patch Submitted:
      Yes
    • Number of attachments :
      8

      Description

      DefaultMavenProjectBuilder reads POM in system default character encoding, and the encoding parameter in xml declartion is ignored.
      to fix this problem, We should

      • fix modello-plugin-xpp3 to use the xml parser which is able to handle the encoding parameter properly
      • regenerate maven-model using fixed modello-plugin-xpp3
      • fix DefaultMavenProjectBuilder to use regenerated maven-model properly.
      1. DefaultMavenProjectBuilder.diff
        3 kB
        Naoki Nose
      2. MNG-2254_artifact.diff
        2 kB
        Herve Boutemy
      3. MNG-2254_components.diff
        12 kB
        Herve Boutemy
      4. MNG-2254_components-new.diff
        13 kB
        Herve Boutemy
      5. MNG-2254.diff
        3 kB
        Herve Boutemy
      6. MNG-2254-2.diff
        10 kB
        Herve Boutemy
      7. mng-2254-PomEncoding.tgz
        2 kB
        Herve Boutemy
      8. modello-plugin-xpp3.diff
        8 kB
        Naoki Nose

        Issue Links

          Activity

          Hide
          Naoki Nose added a comment -

          Attached files are the patch to fix this issue.
          These files depend on the patch of MXParser which I attached to the following issue.

          http://jira.codehaus.org/browse/MNG-2148

          using a external standard xml pull parser may be better solution, as discussed in the following issue.

          http://jira.codehaus.org/browse/DOXIA-60

          Show
          Naoki Nose added a comment - Attached files are the patch to fix this issue. These files depend on the patch of MXParser which I attached to the following issue. http://jira.codehaus.org/browse/MNG-2148 using a external standard xml pull parser may be better solution, as discussed in the following issue. http://jira.codehaus.org/browse/DOXIA-60
          Hide
          Carlos Sanchez added a comment -

          it'd be better to swith to a external xml parser, see MNG-2255

          Show
          Carlos Sanchez added a comment - it'd be better to swith to a external xml parser, see MNG-2255
          Hide
          Stefan Hübner added a comment -

          As I just stated in MNG-2255 I think, that "...the parser used to read in maven's model isn't the problem really. It's more likely the way the POM is handed to the parser."

          See MNG-2255 for full discussion of the issue.

          Show
          Stefan Hübner added a comment - As I just stated in MNG-2255 I think, that "...the parser used to read in maven's model isn't the problem really. It's more likely the way the POM is handed to the parser." See MNG-2255 for full discussion of the issue.
          Hide
          Herve Boutemy added a comment -

          for problems about encodings, there are ways to write a Reader that automagically detects the encoding of a XML stream:
          see https://rome.dev.java.net/apidocs/0_5/com/sun/syndication/io/XmlReader.html

          this permits to use a reader as a source for an XML parser without loosing any information: the magic is done by the reader, not by the parser.

          Such a solution would be an easy fix: the actual MavenXpp3Reader signature could be sufficient (even if using an InputStream would be more classical)

          Show
          Herve Boutemy added a comment - for problems about encodings, there are ways to write a Reader that automagically detects the encoding of a XML stream: see https://rome.dev.java.net/apidocs/0_5/com/sun/syndication/io/XmlReader.html this permits to use a reader as a source for an XML parser without loosing any information: the magic is done by the reader, not by the parser. Such a solution would be an easy fix: the actual MavenXpp3Reader signature could be sufficient (even if using an InputStream would be more classical)
          Hide
          Jason van Zyl added a comment -

          Herve, if you want take a crack at this I would certainly incorporate this into Modello.

          Show
          Jason van Zyl added a comment - Herve, if you want take a crack at this I would certainly incorporate this into Modello.
          Hide
          Herve Boutemy added a comment -

          Here is a patch to use XmlReader from plexus-utils 1.4.3-SNAPSHOT from MPLEXUS-11 branch.
          Modello change is not absolutely necessary: adding read(InputStream) to read(Reader) is only an enhencement to make people prefer the first one, but using the second one with an XmlReader instance is strictly the same

          Show
          Herve Boutemy added a comment - Here is a patch to use XmlReader from plexus-utils 1.4.3-SNAPSHOT from MPLEXUS-11 branch. Modello change is not absolutely necessary: adding read(InputStream) to read(Reader) is only an enhencement to make people prefer the first one, but using the second one with an XmlReader instance is strictly the same
          Hide
          Herve Boutemy added a comment -

          Here is a new patch fixing encoding detection for pom.xml, settings.xml and profiles.xml

          Show
          Herve Boutemy added a comment - Here is a new patch fixing encoding detection for pom.xml, settings.xml and profiles.xml
          Hide
          Herve Boutemy added a comment -

          see http://docs.codehaus.org/display/MAVENUSER/XML+encoding for the roadmap to XML encoding support in Maven2

          Show
          Herve Boutemy added a comment - see http://docs.codehaus.org/display/MAVENUSER/XML+encoding for the roadmap to XML encoding support in Maven2
          Hide
          Jason van Zyl added a comment -

          I will take a look, I have walked through everything else this week: plexus and modello, so now I can push in the changes into Maven's core.

          Show
          Jason van Zyl added a comment - I will take a look, I have walked through everything else this week: plexus and modello, so now I can push in the changes into Maven's core.
          Hide
          Herve Boutemy added a comment -

          splitted the patch into 2 pieces since maven-artifact has been moved to artifact

          Show
          Herve Boutemy added a comment - splitted the patch into 2 pieces since maven-artifact has been moved to artifact
          Hide
          Andrew Williams added a comment -

          Can you please clarify which patches apply and which do not

          Show
          Andrew Williams added a comment - Can you please clarify which patches apply and which do not
          Hide
          Herve Boutemy added a comment - - edited

          the 2 latest only are to be applied:

          • MNG-2254_artifact.diff for maven/artifact/trunk and maven/components/branches/maven-2.0.x/maven-artifact
          • MNG-2254_components.diff for maven/components/trunk and maven/components/branches/maven-2.0.x
          Show
          Herve Boutemy added a comment - - edited the 2 latest only are to be applied: MNG-2254 _artifact.diff for maven/artifact/trunk and maven/components/branches/maven-2.0.x/maven-artifact MNG-2254 _components.diff for maven/components/trunk and maven/components/branches/maven-2.0.x
          Hide
          brianfox brianfox added a comment -

          Can someone make an IT for this? See the instructions here: http://docs.codehaus.org/display/MAVEN/Creating+a+Maven+Integration+Test

          Show
          brianfox brianfox added a comment - Can someone make an IT for this? See the instructions here: http://docs.codehaus.org/display/MAVEN/Creating+a+Maven+Integration+Test
          Hide
          Herve Boutemy added a comment -

          Hi Brian,

          Here is an IT to check that pom.xml encoding is properly detected: the sample pom.xml is in utf-16, then if encoding is not detected, the build simply fails (no complicated test to check that an accentuated character would have been correctly read)

          I checked that it fails without the patch, not that it works with the patch applied...

          Show
          Herve Boutemy added a comment - Hi Brian, Here is an IT to check that pom.xml encoding is properly detected: the sample pom.xml is in utf-16, then if encoding is not detected, the build simply fails (no complicated test to check that an accentuated character would have been correctly read) I checked that it fails without the patch, not that it works with the patch applied...
          Hide
          brianfox brianfox added a comment -

          Thanks Herve, the test looks good. I wasn't able to get the patches applied cleanly. Can you update them and then I'll apply and test? (i'm going to add the test to the core suite)

          Show
          brianfox brianfox added a comment - Thanks Herve, the test looks good. I wasn't able to get the patches applied cleanly. Can you update them and then I'll apply and test? (i'm going to add the test to the core suite)
          Hide
          Herve Boutemy added a comment -

          ok: here is a new patch

          Show
          Herve Boutemy added a comment - ok: here is a new patch
          Hide
          Herve Boutemy added a comment -

          work done:

          • r585265 for trunk
          • r585268 for 2.0.x branch
          Show
          Herve Boutemy added a comment - work done: r585265 for trunk r585268 for 2.0.x branch

            People

            • Assignee:
              Herve Boutemy
              Reporter:
              Naoki Nose
            • Votes:
              9 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: