GROOVY-5361

XmlParser and XmlSlurper should support XML Schema validation

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0-beta-2
    • Fix Version/s: 2.0.0
    • Component/s: XML Processing
    • Labels: None
    • Number of attachments: 0

      Description

      XmlSlurper and XmlParser both support a validation capability, but this only works for DTDs. It should work for XML Schemas as well.

        Activity

        Paul King added a comment -

        They both do support XML Schema validation. So for clarification, my understanding is that you want simpler configuration when turning on such validation rather than configuring your own SAXParser.

        Paul King added a comment - - edited

        So basically we currently have this for validating against internal XSDs:

        def factory = SAXParserFactory.newInstance()
        factory.validating = true
        factory.namespaceAware = true
        SAXParser sax = factory.newSAXParser()
        sax.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", XMLConstants.W3C_XML_SCHEMA_NS_URI)
        def parser = new XmlParser(sax)
        

        which could be shortened to something like:

        def parser = new XmlParser(XmlUtil.defaultSchemaValidatingParser())
        

        Incidentally, we still need an error handler in either case:

        parser.errorHandler = { e -> println e.message } as ErrorHandler
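
        The one-closure handler above treats warnings, errors and fatal errors alike. As an illustration only, here is a minimal sketch that uses Groovy's map-to-interface coercion to handle each severity separately. The inline schema, the sample document and the `problems` map are assumptions made for this example; the schema is attached through `factory.schema` so the snippet is self-contained. The `groovy.xml.XmlParser` import assumes Groovy 3 or later; on Groovy 2.x the class lives in `groovy.util` and the import can be dropped.

```groovy
import groovy.xml.XmlParser   // Groovy 3+; on Groovy 2.x use the default groovy.util.XmlParser instead
import javax.xml.XMLConstants
import javax.xml.parsers.SAXParserFactory
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXParseException

// Illustrative schema: a person with exactly a first and a last name
def xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="first" type="xs:NCName"/>
        <xs:element name="last" type="xs:NCName"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

def schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
def schema = schemaFactory.newSchema(new StreamSource(new StringReader(xsd)))

def factory = SAXParserFactory.newInstance()
factory.namespaceAware = true
factory.schema = schema

// Collect recoverable problems by severity; let fatal errors abort the parse
def problems = [warning: [], error: []]
def parser = new XmlParser(factory.newSAXParser())
parser.errorHandler = [
    warning   : { SAXParseException e -> problems.warning << e.message },
    error     : { SAXParseException e -> problems.error << e.message },
    fatalError: { SAXParseException e -> throw e }
] as ErrorHandler

// The unexpected <middle> element is reported as a (recoverable) validation error
def person = parser.parseText('<person><first>James</first><middle>T.</middle><last>Kirk</last></person>')
println "errors: ${problems.error.size()}, parsed: ${person.last.text()}, ${person.first.text()}"
```

        Because schema violations are recoverable errors, the parse still returns a node tree; the caller decides afterwards whether the collected messages are acceptable.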
        
        Paul King added a comment - - edited

        For validating against an external XSD we currently have:

        def schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
        def factory = SAXParserFactory.newInstance()
        factory.validating = false
        factory.namespaceAware = true
        factory.schema = schemaFactory.newSchema([new StreamSource("mySchema.xsd")] as Source[])
        def parser = new XmlParser(factory.newSAXParser())
        

        which might become something like this:

        def parser = new XmlParser(XmlUtil.schemaValidatingParserFromSources(new StreamSource("mySchema.xsd")))
        

        With DTD validation off, this doesn't need a special error handler - the default one when not validating just sends error messages to System.err.

        Paul King added a comment -

        Or we could provide a Map style constructor:

        def parser = new XmlParser(validating: true, schemaValidation: true, schemaSources:[new StreamSource("mySchema.xsd")])
        

        Or something like that - though it won't necessarily support easy IDE completion.

        Russel Winder added a comment -

        I did indeed mean that using an XML Schema should be easier. The above proposals seem to take us in the right direction. Without trying them out, it is difficult to say more.

        Why does there have to be explicit mention of an error handler?

        What about folks using Relax NG?

        Russel Winder added a comment -

        I didn't have to specify an error handler using the first segment, it just worked.

        Paul King added a comment -

        Re: "I didn't have to specify an error handler using the first segment, it just worked."

        If I have validating set to true I get the warning to stderr you mentioned on the mailing list:

        Warning: validation was turned on but an org.xml.sax.ErrorHandler was not
        set, which is probably not what is desired.  Parser will use a default
        ErrorHandler to print the first 10 errors.  Please call
        the 'setErrorHandler' method to fix this.
        

        If you are not seeing it perhaps it is JVM version specific.

        Paul King added a comment - - edited

        For relaxng, with Jing in my classpath I needed the following as the long-hand for an external schema source (couldn't get the internal one working in the time I had available):

        // Jing jar doesn't seem to have META-INF to be found automatically by JAXP
        // so manually set it here - pick one of compact or xml syntax variants
        System.setProperty(SchemaFactory.name + ":" + RELAXNG_NS_URI,
        //    "com.thaiopensource.relaxng.jaxp.CompactSyntaxSchemaFactory")
            "com.thaiopensource.relaxng.jaxp.XMLSyntaxSchemaFactory")
        
        SchemaFactory schemaFactory = SchemaFactory.newInstance(RELAXNG_NS_URI)
        def factory = SAXParserFactory.newInstance()
        factory.validating = false
        factory.namespaceAware = true
        factory.schema = schemaFactory.newSchema([new StreamSource("mySchema.rng")] as Source[])
        def parser = new XmlParser(factory.newSAXParser())
        

        so perhaps this could become:

        // I'd propose leaving the ugly System.setProperty here as it is Jing specific
        // and hopefully will go away in future versions of the library anyway
        
        def parser = new XmlParser(XmlUtil.schemaValidatingParserFromSources(RELAXNG_NS_URI, new StreamSource("mySchema.rng")))
        

        and if we don't default to XML Schema, then the earlier example for XML Schema might need to become:

        def parser = new XmlParser(XmlUtil.schemaValidatingParserFromSources(W3C_XML_SCHEMA_NS_URI, new StreamSource("mySchema.xsd")))
        
        Russel Winder added a comment -

        "If you are not seeing it perhaps it is JVM version specific."

        I see the error message when I set the validate parameter true in the XmlParser constructor call but I don't have a DTD. When I use the SAX parser constructor to XmlParser I do not see the error message. This would imply the error message comes from the default DTD validating SAX parser used by XmlParser. Since this is not used when specifying a given SAX parser the default error handler is clearly fine?

        Russel Winder added a comment - - edited

        I agree with the thinking about support for RelaxNG except that there should be a way of allowing the incoming document to specify the schema against which validation happens.

        Paul King added a comment -

        @Russel

        Re: "the default error handler is clearly fine". Yes, when creating your own SAX parser, if you don't tell it to also validate against DTDs - which you don't need to in your examples from what you have said, then the default is fine AFAIK.

        Re: "there should be a way of allowing the incoming document to specify the schema against which validation happens". Agreed! But I haven't found anything definitive on how to do this long hand in Java - hence providing a Groovy shorthand currently evades me.

        Paul King added a comment - - edited

        An example with imports etc. in case anyone is trying to actually run the snippets above:

        import javax.xml.transform.Source
        import javax.xml.transform.stream.StreamSource
        import javax.xml.validation.SchemaFactory
        import javax.xml.XMLConstants
        import javax.xml.parsers.SAXParserFactory
        
        def xml_g = '<person><first>James</first><last>Kirk</last></person>'
        def xml_b = '<person><first>James</first><middle>T.</middle><last>Kirk</last></person>'
        def xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
          <xs:element name="person">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="first" type="xs:NCName"/>
                <xs:element name="last" type="xs:NCName"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:schema>'''
        def schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
        def factory = SAXParserFactory.newInstance()
        factory.validating = false
        factory.namespaceAware = true
        factory.schema = schemaFactory.newSchema([new StreamSource(new StringReader(xsd))] as Source[])
        def parser = new XmlParser(factory.newSAXParser())
        [xml_g, xml_b].each {
            def p = parser.parseText(it)
            println "${p.last.text()}, ${p.first.text()}"
        }
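
        With the default handler, the invalid document above is still parsed and the problem is only reported on System.err. If a caller would rather have invalid input rejected, one option is an error handler that rethrows. The following is a minimal sketch under that assumption; the `accepted` and `rejected` variables are illustrative names, and the `groovy.xml.XmlParser` import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlParser   // Groovy 3+; on Groovy 2.x use the default groovy.util.XmlParser instead
import javax.xml.XMLConstants
import javax.xml.parsers.SAXParserFactory
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXException
import org.xml.sax.SAXParseException

def xml_g = '<person><first>James</first><last>Kirk</last></person>'
def xml_b = '<person><first>James</first><middle>T.</middle><last>Kirk</last></person>'
def xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="first" type="xs:NCName"/>
        <xs:element name="last" type="xs:NCName"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

def schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
def factory = SAXParserFactory.newInstance()
factory.namespaceAware = true
factory.schema = schemaFactory.newSchema(new StreamSource(new StringReader(xsd)))

def parser = new XmlParser(factory.newSAXParser())
// A single closure coerced to ErrorHandler serves warning, error and fatalError;
// rethrowing turns every reported problem into a parse failure.
parser.errorHandler = { SAXParseException e -> throw e } as ErrorHandler

// The valid document parses normally
def accepted = parser.parseText(xml_g).last.text()

// The invalid document now aborts the parse instead of just logging
def rejected = false
try {
    parser.parseText(xml_b)
} catch (SAXException e) {
    rejected = true
}
println "accepted: ${accepted}, invalid document rejected: ${rejected}"
```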
        
        Paul King added a comment - - edited

        Here is what I propose to add to XmlUtil:

        public static SAXParser schemaParser(String schemaLanguage, boolean namespaceAware, boolean validating, File schema)
        public static SAXParser schemaParser(String schemaLanguage, boolean namespaceAware, boolean validating, URL schema)
        public static SAXParser schemaParser(String schemaLanguage, boolean namespaceAware, boolean validating, Source... schemas)
        

        So, the above example becomes (assuming the xsd is in a local file called 'person.xsd'):

        import static javax.xml.XMLConstants.*
        import static groovy.xml.XmlUtil.*
        
        def xml_g = '<person><first>James</first><last>Kirk</last></person>'
        def xml_b = '<person><first>James</first><middle>T.</middle><last>Kirk</last></person>'
        def parser = new XmlParser(schemaParser(W3C_XML_SCHEMA_NS_URI, true, false, 'person.xsd' as File))
        [xml_g, xml_b].each {
            def p = parser.parseText(it)
            println "${p.last.text()}, ${p.first.text()}"
        }
        

        I guess we could also create variants with default values for the booleans - I'd go with validating=false and namespaceAware=true as per our defaults for XmlSlurper and XmlParser.

        We could also have a default schema language but (despite the poor support for RelaxNG on the JVM at present) I am inclined not to at this stage - it can always be added later.
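
        To make the proposal concrete, here is a rough sketch of what such a utility and its defaulted variant might look like, written as plain script methods. The name `schemaParserSketch` is hypothetical and this is not the XmlUtil implementation; it simply wraps the long-hand JAXP setup shown earlier, with the suggested defaults of namespaceAware=true and validating=false.

```groovy
import javax.xml.XMLConstants
import javax.xml.parsers.SAXParser
import javax.xml.parsers.SAXParserFactory
import javax.xml.transform.Source
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

// Hypothetical helper (illustration only): full variant with explicit flags
SAXParser schemaParserSketch(String schemaLanguage, boolean namespaceAware, boolean validating, Source... schemas) {
    def factory = SAXParserFactory.newInstance()
    factory.validating = validating            // controls DTD validation only
    factory.namespaceAware = namespaceAware
    factory.schema = SchemaFactory.newInstance(schemaLanguage).newSchema(schemas)
    factory.newSAXParser()
}

// Hypothetical helper (illustration only): variant using the suggested defaults
SAXParser schemaParserSketch(String schemaLanguage, Source... schemas) {
    schemaParserSketch(schemaLanguage, true, false, schemas)
}

// Illustrative one-element schema
def xsd = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"><xs:element name="person" type="xs:string"/></xs:schema>'
def sax = schemaParserSketch(XMLConstants.W3C_XML_SCHEMA_NS_URI, new StreamSource(new StringReader(xsd)))
```

        Keeping the schema language as a required first argument leaves room for RELAX NG or other languages without changing the method shape.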

        Paul King added a comment - - edited

        There is also the question of the best name for the utility methods. I have used "schemaParser" above but perhaps "newSAXParser" is better.

        And also the question of whether it is worth back-porting to 1.8.7? It doesn't alter any existing API, so I don't see the harm. It depends on how kind we want to be to stragglers in the Groovy ecosystem.

        Paul King added a comment -

        added as newSAXParser with and w/o boolean variants and backported to 1.8.7

        Russel Winder added a comment -

        On the one hand I am a huge fan of debug releases, i.e. changes in the z of the x.y.z, only having bug fixes. On the other hand this could be considered a bug fix – and Groovy does have a history of slipping in some new bits as bug fixes. So as long as there are no breaking changes 1.8.6 -> 1.8.7 I am content to see this stuff in.

        I wonder if "validatingParser" rather than "schemaParser" would be more descriptive.

        I guess we still need to work on a validating parser that reads the schema specification from the root node rather than being given it separately.

        Paul King added a comment -

        Re: the name: I ended up going with "newSAXParser" which is a little lower level than I would normally like but with a view to making it obvious that you are producing something which feeds into the SAXParser constructor of XmlParser/Slurper. I didn't go with "validatingParser" since you could create one with the (DTD) validating flag set to false and that might be counter-intuitive at first glance. And if the eventual goal is to have a higher-level API over the top, then all of this would be hidden anyway.

        Re: autodetecting the schema language: yes, that would be much nicer. It should really be fixed at the JVM-library level but if that looks unlikely to happen any time soon, Groovy could try to value add in the meantime.

        Paul King added a comment -

        Example of usage can be seen in XmlUtilTest#testSchemaValidationUtilityMethod:

        https://github.com/groovy/groovy-core/blob/master/src/test/groovy/xml/XmlUtilTest.groovy

        The test doesn't use a file which looks neater but on the other hand, the test is self-contained.
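
        For readers without the test source to hand, a self-contained sketch in the same spirit follows. It is not a copy of the test: the schema, documents and the `messages` list are illustrative, it uses the two-parameter form of `XmlUtil.newSAXParser` with an in-memory `StreamSource`, and the `groovy.xml.XmlParser` import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlParser   // Groovy 3+; on Groovy 2.x use the default groovy.util.XmlParser instead
import groovy.xml.XmlUtil
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import org.xml.sax.ErrorHandler

def xml_g = '<person><first>James</first><last>Kirk</last></person>'
def xml_b = '<person><first>James</first><middle>T.</middle><last>Kirk</last></person>'
def xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="first" type="xs:NCName"/>
        <xs:element name="last" type="xs:NCName"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

// Two-parameter form: schema language plus one schema source
def sax = XmlUtil.newSAXParser(XMLConstants.W3C_XML_SCHEMA_NS_URI, new StreamSource(new StringReader(xsd)))
def parser = new XmlParser(sax)

// Record every message the validator reports
def messages = []
parser.errorHandler = { messages << it.message } as ErrorHandler

parser.parseText(xml_g)
def afterGood = messages.size()   // messages reported for the valid document
parser.parseText(xml_b)
def afterBad = messages.size()    // grows when the invalid document is parsed
println "messages after valid doc: ${afterGood}, after invalid doc: ${afterBad}"
```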

        Russel Winder added a comment -

        Just a note that:

        XMLConstants

        but:

        XmlUtils

        the inconsistency really is annoying.

        Russel Winder added a comment -

        It seems that the default behaviour of XmlUtil.newSAXParser is not to validate even though it is required to provide a schema. Or have I missed something?

        Russel Winder added a comment - - edited

        XmlParser reports trimWhitespace as the property to set to control whitespace; it has the wrong default of true. XmlSlurper reports keepWhitespace as the property to control whitespace; it has the wrong default of false.

        Paul King added a comment -

        Re: XMLConstants vs XmlUtils: XMLConstants, SAXParser, SQLException etc. come from Java. XmlParser, XmlSlurper, XmlUtil, JsonBuilder, Sql, etc. come from Groovy. Not nice but there has been a history behind both - unfortunate that sometimes the Java apis leak through to the Groovy ones. I guess in theory we could always hide them but ...

        Paul King added a comment -

        Re: default behaviour is to not validate: yes, the defaults are the same as for XmlSlurper and XmlParser and that flag is related to DTD validation only. Schema validation will be on. Perhaps the doco needs a bit of finessing to make that clearer?
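
        One way to see the distinction is to inspect the parser that comes back. This is a small sketch with an illustrative one-element schema; the three accessors are standard `javax.xml.parsers.SAXParser` methods, and `getSchema()` is assumed to be supported by the JAXP implementation in use (it is in the JDK's bundled one).

```groovy
import groovy.xml.XmlUtil
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource

// Illustrative one-element schema
def xsd = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"><xs:element name="person" type="xs:string"/></xs:schema>'
def sax = XmlUtil.newSAXParser(XMLConstants.W3C_XML_SCHEMA_NS_URI, new StreamSource(new StringReader(xsd)))

// 'validating' refers to DTD validation; the attached schema is what drives schema validation
println "validating (DTD): ${sax.isValidating()}"
println "namespaceAware:   ${sax.isNamespaceAware()}"
println "schema attached:  ${sax.getSchema() != null}"
```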

        Paul King added a comment -

        Re: trimWhitespace vs keepWhitespace: it turns out the XmlSlurper's keepWhitespace and XmlParser's trimWhitespace are totally different beasts/concepts!

        Firstly, XmlParser's trimWhitespace: it causes ".trim()" to be called on element text values (but not attributes!). It really does have the wrong default value but would break legacy code if we changed it. I would nearly support changing it anyway as a breaking change in 2.0. But given that we want to put a new api over the top (and potentially rework some of the internals) do we break users' code once or twice? It is well documented and easy to turn off - so the current thinking is to leave it until we make the api changes - but feel free to debate further - though it isn't really part of this issue per se.
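
        The effect of the flag can be seen with a short sketch. The sample document is made up for illustration, the flag is set explicitly on both parsers so the snippet does not depend on whichever default a given Groovy version ships with, and the `groovy.xml.XmlParser` import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlParser   // Groovy 3+; on Groovy 2.x use the default groovy.util.XmlParser instead

// Padded element text and a padded attribute value
def xml = '<note lang="  en  ">  hello  </note>'

def trimming = new XmlParser()
trimming.trimWhitespace = true
def preserving = new XmlParser()
preserving.trimWhitespace = false

def trimmed = trimming.parseText(xml)
def untrimmed = preserving.parseText(xml)

// Element text is trimmed only when the flag is on; the attribute value is left alone either way
println "[${trimmed.text()}] [${untrimmed.text()}] [${trimmed.attribute('lang')}]"
```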

        Secondly, XmlSlurper's keepWhitespace: this is about preserving layout of XML documents. So if we have this XML fragment:

        ...
            <foo>content</foo>
            <bar>content</bar>
        ...
        

        when the flag is off, parsing this will return two elements, "foo" and "bar". When the flag is on, this will return a "foo" element, then a "text node" containing a newline and 4 spaces, then the "bar" element. If you want to process the document content, the "whitespace text nodes" are just unwanted noise. But if you want to preserve the original layout of the document after making some changes to the content, then keeping the flag on can be useful. From memory, XmlSlurper may not allow full round-tripping anyway as it may not keep everything intact, e.g. processing instructions, XML comments etc. (but I haven't checked the code just now so don't quote me on this - we have made it cover more of these things over time)

        Russel Winder added a comment -

        Re: XMLConstants vs XmlUtils: I think the Groovy naming is inconsistent with the policy for Groovy properties where getXMLThingy() goes to XMLThingy in order to preserve the upper-case form. No matter the history, it would be good for Groovy to have a consistent approach to acronym case.

        Russel Winder added a comment -

        Re validation. I found that using the two-parameter newSAXParser (note: not newSaxParser, see above), I did not get any XML Schema validation; I had to switch it on by using the four-parameter version.

        Russel Winder added a comment -

        Re XmlParser's trimWhitespace: I suggest that the default is wrong and that Groovy 2 should have the right default. Legacy code should expect breaking changes 1.8 -> 2.0. Perhaps take this back to the mailing list to get wider debate?

        Russel Winder added a comment -

        Re XmlSlurper keepWhitespace: It seems that the use case you are explaining has little to do with the one we started with – unless I am missing something. Is it that XmlSlurper is using the same flag for multiple disjoint purposes? If so then a new flag should be introduced to cover the second case.

        If XmlSlurper cannot round-trip then isn't it broken?

        Paul King added a comment -

        Re XmlSlurper keepWhitespace and XmlParser trimWhitespace: yes, they are totally different use cases with totally different flags that unfortunately overlap by 10 common characters. Perhaps they should have different names (breaking change): preserveWhitespace, autoTrimElementText

        Paul King added a comment -

        Re newSAXParser vs newSaxParser: SAXParser is a Java class that leaks through the Groovy API. When thinking of Groovy as a value-add to Java, I don't mind these differences too much. If you want to treat Groovy as a Java replacement, then such differences are annoying. I don't think getXMLThingy is relevant, since xmlThingy would be the property for XmlThingy?

        Paul King added a comment -

        Re flipping the trimWhitespace default: feel free to continue the debate on the user mailing list. If there is unanimous support, I will be happy to help implement the change. If there is a wide divergence of opinion, I don't have much time at present to spend persuading people; what little time I do have I'd prefer to invest in a future XML module with quite a few things fixed. But I agree with you that the current default runs against common practice.

        Paul King added a comment -

        Re two-param vs four-param newSAXParser: does XmlUtilTest#testSchemaValidationUtilityMethod work for you? It uses the two-param version and seems to work.
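
For readers without the test source to hand, the two-param call can be sketched roughly as follows. This is not the body of XmlUtilTest#testSchemaValidationUtilityMethod; the schema, the sample documents, and the `validate` helper are hypothetical. It assumes the `XmlUtil.newSAXParser(String schemaLanguage, Source... schemas)` overload and collects validation errors through an ErrorHandler, as suggested earlier in this thread.

```groovy
// Groovy 3+ location of XmlParser; on 2.x it is groovy.util.XmlParser (no import needed).
import groovy.xml.XmlParser
import groovy.xml.XmlUtil
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXParseException

// Hypothetical schema: a <person> element containing an integer <age>.
def xsd = '''<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="age" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

// Hypothetical helper: parses xml against xsd and returns the validation messages.
List<String> validate(String xsd, String xml) {
    // A StreamSource over a Reader can be consumed only once, so build it per call.
    def schema = new StreamSource(new StringReader(xsd))
    // Two-param form: schema language plus one or more schema sources.
    def sax = XmlUtil.newSAXParser(XMLConstants.W3C_XML_SCHEMA_NS_URI, schema)
    def parser = new XmlParser(sax)
    def problems = []
    // Without an error handler, validation errors are not surfaced to the caller.
    parser.errorHandler = { SAXParseException e -> problems << e.message } as ErrorHandler
    parser.parseText(xml)
    problems
}

def validProblems = validate(xsd, '<person><age>40</age></person>')
def invalidProblems = validate(xsd, '<person><age>forty</age></person>')

assert validProblems.isEmpty()
assert !invalidProblems.isEmpty()
```

The four-param overload additionally takes explicit namespaceAware and validating flags; the sketch does not exercise it.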

        Paul King added a comment -

        Russel, does XmlUtilTest#testSchemaValidationUtilityMethod not work for you? Or did you reopen this in relation to the keepWhitespace / trimWhitespace issue? I guess it is working for me with the test cases I have so I am looking for another example. Thanks.

        Paul King added a comment -

        Russel, I have lost track of what else you require for this issue to be closed? Can you remember?

        Paul King added a comment -

        I am going to mark this as closed as I believe the Schema Validation part has been adequately addressed. If we need to tweak other defaults, we should use a fresh issue to create more focus around any required changes/discussion.


          People

          • Assignee:
            Paul King
            Reporter:
            Russel Winder
          • Votes:
            0
            Watchers:
            4

            Dates

            • Created:
              Updated:
              Resolved: