Woodstox / WSTX-94

\r is serialized as &#xd; instead of the \r character, making it impossible to generate Windows line endings

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None
    • Number of attachments: 0

      Description

      This is in 3.0.2 and 3.1.0. The behaviour differs from the RI.

        Activity

        Tatu Saloranta made changes -
        Assignee: (none) -> Tatu Saloranta [ cowtowncoder ]
        Tatu Saloranta added a comment -

        This is actually a feature, not a bug. Almost all XML serializers do the same (XSLT outputters used with DOM, JDOM, XOM). However, it may be something that should be configurable, so that this escaping behavior could be disabled.

        The main problem is this: as per the specification, XML parsers must convert \r (Mac) and \r\n (Windows) line endings into the canonical \n (Unix) when parsing. So one way to think about this is that if someone really wants to output the character \r, they want it preserved when the document is parsed. The only way to do that is to write \r as a character entity.
        However, it is also possible that the caller doesn't care about it getting normalized and just wants line endings to work better when the output is viewed (which I assume is what you would prefer).
        So it is pretty much impossible to handle both cases simultaneously.
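
To illustrate the normalization described above, here is a minimal sketch that uses only the StAX parser bundled with the JDK (not Woodstox itself). The class and method names (`CrNormalizationDemo`, `parseText`, `visible`) are made up for this example. A conforming parser should report a literal \r\n as a single \n, while a \r written as the character reference &#xd; should survive parsing.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CrNormalizationDemo {

    /** Parses the document and returns the concatenated text of all CHARACTERS events. */
    public static String parseText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (XMLStreamException e) {
            // Wrapped only to keep the demo's call sites short.
            throw new IllegalStateException(e);
        }
    }

    /** Makes CR and LF visible when printed. */
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Literal CR LF in the input: subject to line-end normalization.
        System.out.println("literal CRLF -> " + visible(parseText("<a>x\r\ny</a>")));
        // CR written as a character reference: not subject to normalization.
        System.out.println("&#xd; + LF   -> " + visible(parseText("<a>x&#xd;\ny</a>")));
    }
}
```

This is why a writer that wants \r to round-trip has to escape it, and why a writer that only cares about how the file looks in an editor would rather emit it raw.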

        If the alternative behavior (leaving \r chars as is, instead of writing character entities) is something you'd like to see, it can quite easily be added as a feature of the XMLStreamWriter implementation (to be configured via XMLOutputFactory).

        Assuming this would work, I can try to add this feature to 3.2.

        Brett Porter added a comment -

        Yes, we have often had requests to generate Windows line endings, so this is a feature we'd like to see.

        I'm going to open this against stax.codehaus.org as well, since it does this, but the original BEA RI (included in JDK6) maintains the \r.

        Is this something the spec needs to clarify in a future revision?

        Tatu Saloranta added a comment -

        It is something that could be addressed (as in dictating what the expected default behavior is). But perhaps even more importantly, a "de facto" property could be defined. The difference is that a Stax specification update does not seem to be happening (it requires a JSR expert group to be there, have time, etc.), whereas proposing a property to be implemented by various implementations just requires good ideas and active developers for those implementations.
        (Yes, even that is sometimes a tall order...)

        It just so happens that Aleksander Slominski (and to a lesser degree, myself) have created a wiki page to collect ideas for such de facto standard properties. At least Woodstox and the reference implementation are very likely to support such features, and maybe we could convince the Sun folks to seriously consider supporting them as well?

        As usual, thank you for taking the time to add the entry and follow up on this: this is the most important mechanism for me to figure out which issues developers care about and which affect their lives.

        Brett Porter added a comment -

        Where is the wiki page you refer to? I would be happy to look into contributing this in due course.

        (By the way, your Jira doesn't seem to be configured to notify the reporter when you update an issue, which is a strange setting. I'd be happy to update that if you'd like.)

        Tatu Saloranta added a comment -

        The Wiki page is (I think) at:

        http://stax.codehaus.org/Extensions

        And yes, if you could help configure Jira properly, that'd be awesome.

        Tatu Saloranta added a comment -

        OK, I implemented this feature. Setting the (Woodstox-specific) property WstxOutputProperties.P_OUTPUT_ESCAPE_CR to Boolean.FALSE will disable escaping of CRs in CHARACTERS events (plain text segments).

        It's worth noting that the same effect could be achieved by implementing the stax2 class 'EscapingWriterFactory' (and a matching writer); however, that is quite a bit of work for a potentially common task.

        This feature will be part of the Woodstox 3.2 release, to be released any day now.
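
As a rough sketch of how the property described above might be set: the example below goes through the standard XMLOutputFactory API and only sets the property if the factory reports support for it, so it also runs against other StAX implementations (where the property is simply skipped). The string value "com.ctc.wstx.outputEscapeCr" is my assumption of what the WstxOutputProperties.P_OUTPUT_ESCAPE_CR constant resolves to; with Woodstox on the classpath, reference the constant itself instead. The class and method names (`RawCrOutputDemo`, `write`) are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class RawCrOutputDemo {

    // Assumption: string value of WstxOutputProperties.P_OUTPUT_ESCAPE_CR.
    // Prefer the constant when compiling against Woodstox.
    static final String P_OUTPUT_ESCAPE_CR = "com.ctc.wstx.outputEscapeCr";

    /** Writes <a>text</a>, asking the factory not to escape CR if it supports that. */
    public static String write(String text) {
        XMLOutputFactory factory = XMLOutputFactory.newInstance();
        // Only Woodstox (3.2 and later, per this issue) is expected to know the property;
        // other implementations would reject it, hence the check.
        if (factory.isPropertySupported(P_OUTPUT_ESCAPE_CR)) {
            factory.setProperty(P_OUTPUT_ESCAPE_CR, Boolean.FALSE);
        }
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = factory.createXMLStreamWriter(out);
            writer.writeStartElement("a");
            writer.writeCharacters(text); // CHARACTERS content containing \r\n
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = write("x\r\ny");
        // Show control characters visibly so the result can be inspected.
        System.out.println(xml.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Whether the \r shows up raw or as a character entity in the printed result depends on which StAX implementation the factory lookup finds and whether it honors the property.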

        Tatu Saloranta made changes -
        Resolution: (none) -> Fixed [ 1 ]
        Status: Open [ 1 ] -> Resolved [ 5 ]
        Tatu Saloranta added a comment -

        Was released in 3.2.0.

        Tatu Saloranta made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]

          People

          • Assignee: Tatu Saloranta
          • Reporter: Brett Porter
          • Votes: 0
          • Watchers: 0
