Maven / MNG-2057

The Maven 2.0.2 XML parser fails to parse a UTF-8 POM that begins with the optional byte-order mark.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.2
    • Fix Version/s: None
    • Component/s: POM::Encoding
    • Labels:
      None
    • Environment:
      java version "1.5.0_06"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
      Java HotSpot(TM) Client VM (build 1.5.0_06-b05, mixed mode, sharing)

      Microsoft Windows XP [Version 5.1.2600]
    • Complexity:
      Intermediate
    • Number of attachments:
      1

      Description

      The byte-order mark (BOM) is optional, and discouraged, in the UTF-8 encoding, but the Unicode specification is clear that it is allowed: see the Unicode Standard v4, section 2.6 and Table 2.3, and section 15.9 and Table 15.3. A BOM is permitted at the start of a UTF-8 file.

      It so happens that this is how Windows Notepad saves files when you select UTF-8, and Maven will not parse the result. I'll attach a small POM saved this way, inside a ZIP file to hopefully preserve the encoding. Here is the Maven output:

      [INFO] Scanning for projects...
      [INFO] ----------------------------------------------------------------------------
      [ERROR] FATAL ERROR
      [INFO] ----------------------------------------------------------------------------
      [INFO] Error building POM (may not be this project's POM).

      Project ID: unknown
      POM Location: C:\Documents and Settings\Coco\Desktop\pom.xml

      Reason: Parse error reading POM. Reason: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1)

      [INFO] ----------------------------------------------------------------------------
      [INFO] Trace
      org.apache.maven.reactor.MavenExecutionException: Parse error reading POM. Reason: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1)
      at org.apache.maven.DefaultMaven.getProjects(DefaultMaven.java:365)
      at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:278)
      at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:115)
      at org.apache.maven.cli.MavenCli.main(MavenCli.java:249)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:585)
      at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
      at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
      at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
      at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
      Caused by: org.apache.maven.project.InvalidProjectModelException: Parse error reading POM. Reason: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1)
      at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1134)
      at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1094)
      at org.apache.maven.project.DefaultMavenProjectBuilder.buildFromSourceFile(DefaultMavenProjectBuilder.java:289)
      at org.apache.maven.project.DefaultMavenProjectBuilder.build(DefaultMavenProjectBuilder.java:274)
      at org.apache.maven.DefaultMaven.getProject(DefaultMaven.java:515)
      at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:447)
      at org.apache.maven.DefaultMaven.getProjects(DefaultMaven.java:351)
      ... 11 more
      Caused by: org.codehaus.plexus.util.xml.pull.XmlPullParserException: only whitespace content allowed before start tag and not \uef (position: START_DOCUMENT seen \uef... @1:1)
      at org.codehaus.plexus.util.xml.pull.MXParser.parseProlog(MXParser.java:1516)
      at org.codehaus.plexus.util.xml.pull.MXParser.nextImpl(MXParser.java:1392)
      at org.codehaus.plexus.util.xml.pull.MXParser.next(MXParser.java:1090)
      at org.apache.maven.model.io.xpp3.MavenXpp3Reader.read(MavenXpp3Reader.java:4545)
      at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1130)
      ... 17 more
      [INFO] ----------------------------------------------------------------------------
      [INFO] Total time: < 1 second
      [INFO] Finished at: Wed Feb 08 21:14:03 EST 2006
      [INFO] Final Memory: 1M/2M
      [INFO] ----------------------------------------------------------------------------
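
The `\uef` in the error is consistent with the parser seeing the first byte of the UTF-8 BOM (the bytes EF BB BF) as document content. As a minimal sketch of a caller-side workaround, not the fix that was applied in Maven, a stream can be wrapped so that a leading BOM is consumed before the bytes reach an XML parser. The class name `BomSkipper` and the method `skipUtf8Bom` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper for illustration; not part of Maven or plexus-utils.
public class BomSkipper {

    /** Returns a stream positioned after a leading UTF-8 BOM (EF BB BF), if one is present. */
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or end of stream.
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        // No BOM: push the bytes back so the caller sees the stream unchanged.
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        // A BOM followed by the start of an XML document.
        byte[] pom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'p'};
        InputStream s = skipUtf8Bom(new ByteArrayInputStream(pom));
        System.out.println((char) s.read()); // prints "<"
    }
}
```

The sketch only handles the UTF-8 BOM; it assumes the file is otherwise valid UTF-8 and does not attempt to detect UTF-16 or UTF-32 marks.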

            People

            • Assignee:
              Herve Boutemy
            • Reporter:
              Steven Coco
            • Votes:
              4
            • Watchers:
              3
