groovy

XmlNodePrinter outputs too much whitespace

Details

  • Type: Improvement Improvement
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 1.5.7, 1.6-rc-1
  • Fix Version/s: 1.6-rc-2, 1.7-beta-1
  • Component/s: XML Processing
  • Labels:
    None
  • Number of attachments :
    0

Description

XmlNodePrinter outputs too much whitespace. The printLineEnd() method is always called after begin tags, and after the text body. If the element test is short this leads to ugly XML, e.g.

<id>
         foo
      </id>

where

<id>foo</id>

is more readable and more like what you would write by hand. There could be a rule that all element text values that do not contain whitespace, or that are less than N characters long, would be output in this shorter form.

Activity

Hide
Paul King added a comment - - edited

I have added some improvements in trunk. Please review. You will need to create your own IndentPrinter:

def emptyString = ""
def addLines = false
new XmlNodePrinter(new IndentPrinter(printWriter, emptyString, addLines))

Without doing this, the defaults are all set for backwards compatibility.
See the updated tests for further examples.

If sufficient quick feedback is received I believe this is a candidate for RC2 or 1.6 final.

Show
Paul King added a comment - - edited I have added some improvements in trunk. Please review. You will need to create your own IndentPrinter:
def emptyString = ""
def addLines = false
new XmlNodePrinter(new IndentPrinter(printWriter, emptyString, addLines))
Without doing this, the defaults are all set for backwards compatibility. See the updated tests for further examples. If sufficient quick feedback is received I believe this is a candidate for RC2 or 1.6 final.
Hide
Dave Syer added a comment -

Where are the snapshot builds (I see nothing for RC2 at http://snapshots.repository.codehaus.org/org/codehaus/groovy/groovy-all/)?

Show
Dave Syer added a comment - Where are the snapshot builds (I see nothing for RC2 at http://snapshots.repository.codehaus.org/org/codehaus/groovy/groovy-all/)?
Hide
Paul King added a comment -

I normally get this from one of the CI build servers but I looked just now and none have the latest artifacts. The maven snapshot repo is done manually and I am away from the machine I have set up to do this. I'll see what I can do on short notice.

Show
Paul King added a comment - I normally get this from one of the CI build servers but I looked just now and none have the latest artifacts. The maven snapshot repo is done manually and I am away from the machine I have set up to do this. I'll see what I can do on short notice.
Hide
Phil Swenson added a comment -

I think it's a mistake to leave the default output as is. the default output creates invalid XML (the existing behavior is a bug). This is how kludegy libraries become kludegy.

JMO

Show
Phil Swenson added a comment - I think it's a mistake to leave the default output as is. the default output creates invalid XML (the existing behavior is a bug). This is how kludegy libraries become kludegy. JMO
Hide
Paul King added a comment -

I agree in hindsight it would have been better to have the alternative default but for backwards compatibility I think the current coding will be best.

Or to put another way, if we were to change this default we should also change the default of XmlParser to have trimWhitespace=false.

Perhaps it would be best to bite the bullet and do both of these as it would better align them with e.g. XmlSlurper but they are both breaking changes and could break quite a lot of existing code.

Show
Paul King added a comment - I agree in hindsight it would have been better to have the alternative default but for backwards compatibility I think the current coding will be best. Or to put another way, if we were to change this default we should also change the default of XmlParser to have trimWhitespace=false. Perhaps it would be best to bite the bullet and do both of these as it would better align them with e.g. XmlSlurper but they are both breaking changes and could break quite a lot of existing code.
Hide
Phil Swenson added a comment -

Not to start a big argument, but It seems like this behavior is recent. My code used to output valid xml and somewhere along the line, new versions of Groovy introduced the whitespace/linefeeds. If this is the case then backwards compatibility was already broken. (maybe I am wrong?)

Show
Phil Swenson added a comment - Not to start a big argument, but It seems like this behavior is recent. My code used to output valid xml and somewhere along the line, new versions of Groovy introduced the whitespace/linefeeds. If this is the case then backwards compatibility was already broken. (maybe I am wrong?)
Hide
Alexander Veit added a comment -

I think it's a mistake to leave the default output as is. the default output creates invalid XML (the existing behavior is a bug)

I fully agree.

"In editing XML documents, it is often convenient to use "white space"
(spaces, tabs, and blank lines) to set apart the markup for greater
readability. Such white space is typically not intended for inclusion in the
delivered version of the document. On the other hand, "significant" white
space that should be preserved in the delivered version is common, for
example in poetry and source code."
Extensible Markup Language (XML) 1.0 (Fifth Edition)

It couldn't be said better. With the current behaviour XmlNodePrinter is not really interoperable. And any consumer of the XML that is designed to restore the original data will parse different input as the same data.

Show
Alexander Veit added a comment -
I think it's a mistake to leave the default output as is. the default output creates invalid XML (the existing behavior is a bug)
I fully agree. "In editing XML documents, it is often convenient to use "white space" (spaces, tabs, and blank lines) to set apart the markup for greater readability. Such white space is typically not intended for inclusion in the delivered version of the document. On the other hand, "significant" white space that should be preserved in the delivered version is common, for example in poetry and source code." Extensible Markup Language (XML) 1.0 (Fifth Edition) It couldn't be said better. With the current behaviour XmlNodePrinter is not really interoperable. And any consumer of the XML that is designed to restore the original data will parse different input as the same data.
Hide
Paul King added a comment -

Just on getting the latest jars. Then are now available from the Bamboo server, e.g.:
http://bamboo.ci.codehaus.org/download/GROOVY-CORE/artifacts/build-4002/deliverables
(You might need to go to a later build once those artifacts are deleted)

Show
Paul King added a comment - Just on getting the latest jars. Then are now available from the Bamboo server, e.g.: http://bamboo.ci.codehaus.org/download/GROOVY-CORE/artifacts/build-4002/deliverables (You might need to go to a later build once those artifacts are deleted)
Hide
Paul King added a comment -

Regarding the B/C requirement. I will try to trace back when this changed and if it is recent, I will flip the default.

Show
Paul King added a comment - Regarding the B/C requirement. I will try to trace back when this changed and if it is recent, I will flip the default.
Hide
Dave Syer added a comment -

Paul: is there some way for you to publish the snapshots to a proper Maven repository (someone did it at least once with RC1)?

Show
Dave Syer added a comment - Paul: is there some way for you to publish the snapshots to a proper Maven repository (someone did it at least once with RC1)?
Hide
Guillaume Laforge added a comment -

Dave, I've uploaded snapshots on the Codehaus repository.
http://snapshots.repository.codehaus.org/org/codehaus/groovy/

Show
Guillaume Laforge added a comment - Dave, I've uploaded snapshots on the Codehaus repository. http://snapshots.repository.codehaus.org/org/codehaus/groovy/
Hide
Paul King added a comment -

Defaults are potentially opposite of desirable but other changes made it onto RC2. I'll close off this issue and we can raise an issue specifically about defaults - since we should really think about defaults for XmlParser and perhaps MarkupBuilder at the same time. I did a quick look at the code and it wasn't obvious when the change happened which added extra spaces.

Show
Paul King added a comment - Defaults are potentially opposite of desirable but other changes made it onto RC2. I'll close off this issue and we can raise an issue specifically about defaults - since we should really think about defaults for XmlParser and perhaps MarkupBuilder at the same time. I did a quick look at the code and it wasn't obvious when the change happened which added extra spaces.
Hide
Jochen Hinrichsen added a comment -

I just stumbled across the same problem, and IMHO the workaround described (new XmlNodePrinter(new IndentPrinter(printWriter, emptyString, addLines))) does not solve the problem.

If you set addLines() to false, everything is on one line, which is hard to read and cumbersome to process with some editors (complaining about long lines).

If you set addLines() to true, you still get broken text elements, such as

<names>
<name>
jndi.user
</name>
</names>

How can i achieve a pretty printed text node

<names>
<name>jndi.user</name>
</names>

By the way - i'm using 1.6.4.

Show
Jochen Hinrichsen added a comment - I just stumbled across the same problem, and IMHO the workaround described (new XmlNodePrinter(new IndentPrinter(printWriter, emptyString, addLines))) does not solve the problem. If you set addLines() to false, everything is on one line, which is hard to read and cumbersome to process with some editors (complaining about long lines). If you set addLines() to true, you still get broken text elements, such as <names> <name> jndi.user </name> </names> How can i achieve a pretty printed text node <names> <name>jndi.user</name> </names> By the way - i'm using 1.6.4.
Hide
Paul King added a comment -

You should leave addLines and just change the nodePrinter as follows:

nodePrinter.preserveWhitespace = true
Show
Paul King added a comment - You should leave addLines and just change the nodePrinter as follows:
nodePrinter.preserveWhitespace = true
Hide
Jochen Hinrichsen added a comment -

Runs like a charm - thanks!

Show
Jochen Hinrichsen added a comment - Runs like a charm - thanks!

People

Vote (4)
Watch (3)

Dates

  • Created:
    Updated:
    Resolved: