Woodstox

Implement DTD (or, in general, schema) based indentation/pretty-printing

Details

  • Type: New Feature
  • Status: Open
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels: None
  • Number of attachments: 0

Description

Since Woodstox-3.0 will have a flexible, pluggable writer-side validation framework (supporting at least DTD initially), it should be possible to implement indentation ("pretty-printing") quite easily, presumably because the schema tells the writer which elements have element-only content and can therefore take extra white space safely.
Although indentation is irrelevant from an interoperability perspective, it is nice to be able to produce output that is more pleasing to the human eye for debugging, especially if it can be turned on during development and then disabled in production.

So, it'd be great if this could be implemented for Woodstox-3.0. And if not, definitely for 4.0.

Activity

Oleg Rudenko added a comment -

It would be nice if EventWriter completely preserved the formatting of events read using EventReader.
Currently, white space between attributes is not preserved.
This feature is important for keeping manually created XML documents readable after processing.

Tatu Saloranta added a comment -

Regarding preserving white space between attributes in a start element, for the sake of preserving input formatting ("round-trippability"): unfortunately this would be rather difficult to implement, given the way Woodstox works. All such space is basically discarded, since it is semantically meaningless in XML (it is not included in the Infoset). I can see that it would be nice to be able to make minimal changes to the input (when replacing, say, just a single element), but I cannot see an easy way to achieve that. Applications also should not require it, and for readability purposes specific formatting could be regenerated by a custom writer (one wrapping the real XMLStreamWriter).
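
As a rough illustration of the "custom writer wrapping the real XMLStreamWriter" idea, here is a minimal sketch using only the standard StAX API and a dynamic proxy. The class name IndentingWriterSketch and the helper methods indenting, lineBreak and render are hypothetical names chosen for this example; they are not part of Woodstox. The sketch is not schema-aware: it indents around every nested element and would therefore alter mixed content, which is exactly the case a DTD-based implementation could detect and avoid.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class IndentingWriterSketch {
    private static final String INDENT = "  ";

    /** Wraps a real XMLStreamWriter; adds line breaks and indentation around nested elements. */
    static XMLStreamWriter indenting(final XMLStreamWriter target) {
        // One flag per open element: has it received a child element yet?
        final Deque<Boolean> hasChild = new ArrayDeque<>();

        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.equals("writeStartElement") || name.equals("writeEmptyElement")) {
                if (!hasChild.isEmpty()) {
                    // Nested element: start it on its own indented line
                    target.writeCharacters(lineBreak(hasChild.size()));
                    hasChild.pop();
                    hasChild.push(Boolean.TRUE);
                }
                // The root element gets no leading line break in this sketch
                if (name.equals("writeStartElement")) {
                    hasChild.push(Boolean.FALSE);
                }
            } else if (name.equals("writeEndElement")) {
                // Break before the end tag only if child elements were written
                if (!hasChild.isEmpty() && hasChild.pop()) {
                    target.writeCharacters(lineBreak(hasChild.size()));
                }
            }
            try {
                // Everything is ultimately delegated to the wrapped writer
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return (XMLStreamWriter) Proxy.newProxyInstance(
                IndentingWriterSketch.class.getClassLoader(),
                new Class<?>[] { XMLStreamWriter.class }, handler);
    }

    private static String lineBreak(int depth) {
        StringBuilder sb = new StringBuilder("\n");
        for (int i = 0; i < depth; i++) {
            sb.append(INDENT);
        }
        return sb.toString();
    }

    /** Writes a small document through the wrapper and returns the result. */
    static String render() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = indenting(
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            w.writeStartElement("root");
            w.writeStartElement("child");
            w.writeCharacters("text");
            w.writeEndElement();
            w.writeStartElement("other");
            w.writeCharacters("more");
            w.writeEndElement();
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

Running main prints the root element with each child on its own line, indented by two spaces. Because the wrapper only sees writer calls, it cannot tell element-only content from mixed content; that decision is where schema information would come in.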

It might be possible for the app to make use of the Location offsets, so that instead of reading and writing the full document, one would just replace sub-trees and elements in place. This might work for some cases.
Finally, although most XML processors cannot do this kind of in-place editing (or preserving of exact formatting), some can: for example, VTD-XML stores the whole input and has byte-accurate indexes, so it can in fact preserve the exact formatting of the input document. So if this is a hard requirement, you could investigate that possibility.
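
The in-place replacement idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class InPlaceReplaceSketch and its methods are hypothetical, and the start and end offsets are computed by hand here. In a real application they would have to be recorded from the parser (for example via Location.getCharacterOffset()), and whether a reported offset marks the start or the end of an event depends on the implementation, so that part needs checking against the parser actually in use.

```java
public class InPlaceReplaceSketch {

    /** Replaces characters [start, end) of the original text; everything else is untouched. */
    static String replaceRange(String original, int start, int end, String replacement) {
        if (start < 0 || end < start || end > original.length()) {
            throw new IllegalArgumentException("bad range " + start + ".." + end);
        }
        return original.substring(0, start) + replacement + original.substring(end);
    }

    static String demo() {
        // Note the irregular white space between attributes in the start tag
        String doc = "<root  a='1'   b='2'>\n  <old>x</old>\n</root>";
        // Assumption: in practice these offsets come from the parser's Location data;
        // they are located by string search here purely for illustration.
        int start = doc.indexOf("<old>");
        int end = doc.indexOf("</old>") + "</old>".length();
        return replaceRange(doc, start, end, "<new>y</new>");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Since only the replaced range is rewritten, the spacing between the attributes of the root element survives unchanged, which a full read-and-write cycle would not guarantee.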

