Maven Changes Plugin
  1. Maven Changes Plugin
  2. MCHANGES-66

The changes plugin scatters white space over its Changes report

    Details

    • Type: Bug Bug
    • Status: Closed Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0
    • Component/s: changes.xml
    • Labels:
      None
    • Number of attachments :
      6

      Description

      The changelog plugin reads the contents of the changes.xml file and scatters \n characters into it. The attached patch fixes this and also improves performance by using a string buffer.

      1. changes.xml
        6 kB
        Benjamin Bentmann
      2. changes4.patch
        2 kB
        Henning Schmiedehausen
      3. changes7.patch
        3 kB
        Henning Schmiedehausen
      4. changes-report.html
        9 kB
        Benjamin Bentmann
      5. sax-parsing.patch
        2 kB
        Benjamin Bentmann

        Activity

        Hide
        Henning Schmiedehausen added a comment -

        That patch is even a bit better, because it also passes XML tags that are part of the various fields through to the output. The old Maven-1 plugin allowed e.g. to have links or <code> fields inside the action texts. This patch now also restores this behaviour.

        Show
        Henning Schmiedehausen added a comment - That patch is even a bit better, because it also passes XML tags that are part of the various fields through to the output. The old Maven-1 plugin allowed e.g. to have links or <code> fields inside the action texts. This patch now also restores this behaviour.
        Hide
        Stéphane Nicoll added a comment -

        The patch breaks some of my announcement so I need to look into it to fix it.

        With a sample announcement, such as :

        <document>
          <properties>
            <title>Changes Tester Project</title>
            <author email="jruiz@exist.com">Johnny R. Ruiz III</author>
          </properties>
          <body>
          
            <release version="1.0" date="2005-01-01" description="First release">
              <action dev="jruiz" type="update">
                Uploaded documentation on how to use the plugin.
              </action>
               <action dev="jruiz" type="add">
                Added additional documentation on how to configure the plugin.
              </action>
              <action dev="aramirez" type="fix" issue="MPJIRA-11">
                Enable retrieving component-specific issues.
              </action>
              <action dev="jruiz" type="fix" due-to="Allan Ramirez" due-to-email="aramirez@exist.com">
                The element type " link " must be terminated by the matching end-tag.
                Deleted the erroneous code.
              </action>
            </release>
          </body>
        </document>
        

        I got this:

        The ejbs-team is pleased to announce the ejbs-1.0.ejb release!
        
        http://www.blah.com
        
        Changes in this version include:
        
        
        New Features:
        
        o Added additional documentation on how to configure the plugin. 
        
        
        Fixed Bugs:
        
        o Enable retrieving component-specific issues.  Issue: MPJIRA-11. 
        o The element type " link " must be terminated by the matching end-tag.         Deleted the erroneous code. Thanks to Allan Ramirez. 
        
        
        Changes:
        
        o </properties>                           Uploaded documentation on how to use the plugin. 
        
        
        Removed:
        
        
        
        For a manual installation, you can download the ejbs-1.0.ejb here:
        
        http://www.blah.com/ejbs-1.0.jar
        
        
        Have fun!
        -ejbs-team
        
        
        Show
        Stéphane Nicoll added a comment - The patch breaks some of my announcement so I need to look into it to fix it. With a sample announcement, such as : <document> <properties> <title> Changes Tester Project </title> <author email= "jruiz@exist.com" > Johnny R. Ruiz III </author> </properties> <body> <release version= "1.0" date= "2005-01-01" description= "First release" > <action dev= "jruiz" type= "update" > Uploaded documentation on how to use the plugin. </action> <action dev= "jruiz" type= "add" > Added additional documentation on how to configure the plugin. </action> <action dev= "aramirez" type= "fix" issue= "MPJIRA-11" > Enable retrieving component-specific issues. </action> <action dev= "jruiz" type= "fix" due-to= "Allan Ramirez" due-to-email= "aramirez@exist.com" > The element type " link " must be terminated by the matching end-tag. Deleted the erroneous code. </action> </release> </body> </document> I got this: The ejbs-team is pleased to announce the ejbs-1.0.ejb release! http://www.blah.com Changes in this version include: New Features: o Added additional documentation on how to configure the plugin. Fixed Bugs: o Enable retrieving component-specific issues. Issue: MPJIRA-11. o The element type " link " must be terminated by the matching end-tag. Deleted the erroneous code. Thanks to Allan Ramirez. Changes: o </properties> Uploaded documentation on how to use the plugin. Removed: For a manual installation, you can download the ejbs-1.0.ejb here: http://www.blah.com/ejbs-1.0.jar Have fun! -ejbs-team
        Hide
        Dennis Lundberg added a comment -

        Henning, can you supply a changes.xml file that highlights this problem?

        Also, I'm wondering where the scattered '\n' characters turn up. Is it in the announcement, like it is for Stephane, or is it somewhere else?

        Show
        Dennis Lundberg added a comment - Henning, can you supply a changes.xml file that highlights this problem? Also, I'm wondering where the scattered '\n' characters turn up. Is it in the announcement, like it is for Stephane, or is it somewhere else?
        Hide
        Stéphane Nicoll added a comment -

        Not sure my example is still reproductible with the latest SVN state

        Show
        Stéphane Nicoll added a comment - Not sure my example is still reproductible with the latest SVN state
        Hide
        Benjamin Bentmann added a comment -

        Dennis, here is the changes.xml you requested. To see the unwanted newlines, checkout for lines that are longer than the others in the generated changes report.

        Attached is another try for a patch. It's a variation of Henning's first patch. To my knowledge, a SAX parser may split the input strings at ANY position when reporting character chunks (see also Top Ten SAX2 Tips). For instance, the string "Word" may be reported as the two chunks "Wo" and "rd" (that's why inserting a newline after each chunk is a bad idea). Likewise, it might happen that the string "Two Words" gets reported as "Two", " " and "Words" (that's why trimming the chunks before concatenation is a bad idea because it would yield "TwoWords"). So whatever string transformations the plugin needs, it should be done at the end after the entire input string has been read and reconstructed.

        Show
        Benjamin Bentmann added a comment - Dennis, here is the changes.xml you requested. To see the unwanted newlines, checkout for lines that are longer than the others in the generated changes report. Attached is another try for a patch. It's a variation of Henning's first patch. To my knowledge, a SAX parser may split the input strings at ANY position when reporting character chunks (see also Top Ten SAX2 Tips ). For instance, the string "Word" may be reported as the two chunks "Wo" and "rd" (that's why inserting a newline after each chunk is a bad idea). Likewise, it might happen that the string "Two Words" gets reported as "Two", " " and "Words" (that's why trimming the chunks before concatenation is a bad idea because it would yield "TwoWords"). So whatever string transformations the plugin needs, it should be done at the end after the entire input string has been read and reconstructed.
        Hide
        Dennis Lundberg added a comment -

        Thanks for the changes-file Benjamin. I tried it using both 2.0-beta-3 and the latest 2.0-beta-4-SNAPSHOT and I don't see any scattered whitespace. All the lines in the generated html report are of equal length. Please try it yourself with the attached project.

        Show
        Dennis Lundberg added a comment - Thanks for the changes-file Benjamin. I tried it using both 2.0-beta-3 and the latest 2.0-beta-4-SNAPSHOT and I don't see any scattered whitespace. All the lines in the generated html report are of equal length. Please try it yourself with the attached project.
        Hide
        Benjamin Bentmann added a comment -

        Well, I tried 2.0-beta-3 as well as the current 2.0-beta-4-SNAPSHOT (build from source r611604), both embedded newlines as I expected. I attached the generated changes-report.html just that you can get a feel of it.

        I previously tried to point out that the plugin currently misuses the SAX API (i.e. makes invalid assumptions about the parser callbacks). Please, consider the implications of the following quotation from the already mentioned "Top Ten SAX2 Tips":

        It'd be legal (but annoying) for the parser to report one character per callback

        Also, see the final note in the paragraph "Handling events" from the SAX Quickstart or the related question in the SAX FAQ. The issue of unwanted newlines may be hard to reproduce but it is there.

        Just for the fun of it, I extended the changes.xml to contain a really long line (only alphanum's so the browser won't wrap). In the resulting HTML output, I get a line break every 2048 characters... if I only had some money, I would bet this is the size of the parser's character buffer Dennis, maybe you can also reproduce this issue by yourself if you try a changes.xml with at least 8 KB or 16 KB character data in a line.

        Apropos parsing: What happened to the efforcts in MCHANGES-47 to bring Modello into the game? If I understand Modello correctly, it frees the plugin developer from parsing, thereby reducing the chance for errors like this.

        Show
        Benjamin Bentmann added a comment - Well, I tried 2.0-beta-3 as well as the current 2.0-beta-4-SNAPSHOT (build from source r611604), both embedded newlines as I expected. I attached the generated changes-report.html just that you can get a feel of it. I previously tried to point out that the plugin currently misuses the SAX API (i.e. makes invalid assumptions about the parser callbacks). Please, consider the implications of the following quotation from the already mentioned "Top Ten SAX2 Tips": It'd be legal (but annoying) for the parser to report one character per callback Also, see the final note in the paragraph "Handling events" from the SAX Quickstart or the related question in the SAX FAQ . The issue of unwanted newlines may be hard to reproduce but it is there. Just for the fun of it, I extended the changes.xml to contain a really long line (only alphanum's so the browser won't wrap). In the resulting HTML output, I get a line break every 2048 characters... if I only had some money, I would bet this is the size of the parser's character buffer Dennis, maybe you can also reproduce this issue by yourself if you try a changes.xml with at least 8 KB or 16 KB character data in a line. Apropos parsing: What happened to the efforcts in MCHANGES-47 to bring Modello into the game? If I understand Modello correctly, it frees the plugin developer from parsing, thereby reducing the chance for errors like this.
        Hide
        Dennis Lundberg added a comment -

        Thanks for the tips Benjamin. I have now reproduced this. For me it occurs at around 8k of characters. Now that I have something to work with I'll try out your patch.

        I'd also like to try out MCHANGES-47, but it depends on a new release of Modello. So I'm pushing to get that release done first.

        Show
        Dennis Lundberg added a comment - Thanks for the tips Benjamin. I have now reproduced this. For me it occurs at around 8k of characters. Now that I have something to work with I'll try out your patch. I'd also like to try out MCHANGES-47 , but it depends on a new release of Modello. So I'm pushing to get that release done first.
        Hide
        Dennis Lundberg added a comment -

        Benjamin, your patch removes all line breaks for me, was that intended?
        I end up with all the contents of an element on one very long line, even though the sources.xml file has line breaks in it.

        Show
        Dennis Lundberg added a comment - Benjamin, your patch removes all line breaks for me, was that intended? I end up with all the contents of an element on one very long line, even though the sources.xml file has line breaks in it.
        Hide
        Benjamin Bentmann added a comment -

        The removal of line breaks is simply caused by

        replace( '\n', ' ' )
        

        which can be easily spotted in the patch and removed if required.

        I adopted this behavior from Henning's patch which I considered useful as some kind of white-space normalization since one (at least me and my IDE) usually uses white-space in XML only for formatting of the XML input itself, and not for formatting of the output generated from the XML. Or is there any spec about the changes.xml that dictates to preserve white-space in certain elements?

        If white-space normalization is OK, then however my patch should be further improved to deliver a proper normalization. I would define a proper normalization as

        • no leading/trailing white-space (trim() in place)
        • no consecutive white-space (missing)
        • only 0x20 as white-space character (missing)

        The last two aspects should be easily realizable via

        replaceAll("\\s+", " ")
        

        instead of the existing replace() calls.

        Show
        Benjamin Bentmann added a comment - The removal of line breaks is simply caused by replace( '\n', ' ' ) which can be easily spotted in the patch and removed if required. I adopted this behavior from Henning's patch which I considered useful as some kind of white-space normalization since one (at least me and my IDE) usually uses white-space in XML only for formatting of the XML input itself, and not for formatting of the output generated from the XML. Or is there any spec about the changes.xml that dictates to preserve white-space in certain elements? If white-space normalization is OK, then however my patch should be further improved to deliver a proper normalization. I would define a proper normalization as no leading/trailing white-space (trim() in place) no consecutive white-space (missing) only 0x20 as white-space character (missing) The last two aspects should be easily realizable via replaceAll( "\\s+" , " " ) instead of the existing replace() calls.
        Hide
        Dennis Lundberg added a comment -

        I have applied the patch, but I opted not to remove line breaks. Thanks!
        A new SNAPSHOT has been deployed. Please give a try.

        Show
        Dennis Lundberg added a comment - I have applied the patch, but I opted not to remove line breaks. Thanks! A new SNAPSHOT has been deployed. Please give a try.
        Hide
        Henning Schmiedehausen added a comment - - edited

        <offending comment deleted>.

        Apologies to Dennis, this venting of steam actually hit the wrong person (because he is one of the few that actually cares for the open issues against maven plugin).

        Show
        Henning Schmiedehausen added a comment - - edited <offending comment deleted>. Apologies to Dennis, this venting of steam actually hit the wrong person (because he is one of the few that actually cares for the open issues against maven plugin).
        Hide
        Dennis Lundberg added a comment -

        That's an utterly disrespectful thing to say.

        Show
        Dennis Lundberg added a comment - That's an utterly disrespectful thing to say.
        Hide
        Benjamin Bentmann added a comment -

        51 weeks and one day after the initial bug report, a new snapshot has been deployed.

        Henning, how many weeks have passed since Dennis asked you to provide a changes.xml to nail the issue down? I guess you had something better to do, just like the committers. This issue has just one vote and there are lots of other, higher voted issues waiting to be investigated and solved.

        Show
        Benjamin Bentmann added a comment - 51 weeks and one day after the initial bug report, a new snapshot has been deployed. Henning, how many weeks have passed since Dennis asked you to provide a changes.xml to nail the issue down? I guess you had something better to do, just like the committers. This issue has just one vote and there are lots of other, higher voted issues waiting to be investigated and solved.

          People

          • Assignee:
            Dennis Lundberg
            Reporter:
            Henning Schmiedehausen
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: