GROOVY-2115

text() method on NodeChild with XmlSlurper returns composite. Need just local text.

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0-beta-2
    • Component/s: XML Processing
    • Labels:
      None
    • Environment:
      windows xp on jdk 1.4.2
    • Number of attachments :
      0

      Description

      I am trying to use XmlSlurper to process an XML document with nested elements that contain text. I need to get the text from just one level at a time, but the text() method returns the text of the element and all of its children, and I can't see anything that would bring back just the local text.

      Here is a sample from the console:

      groovy> def model = new XmlSlurper().parseText('<aModel><aParent name="bubba">text<aChild>child text</aChild></aParent></aModel>')

      groovy> model.aParent[0].text()

      Result: "textchild text"

        Activity

        Guillaume Laforge made changes -
        Field Original Value New Value
        Assignee Paul King [ paulk ]
        Paul King added a comment -

        John, I'm assigning this to you, if for nothing else than for comment. Given your proposed new method(s) for XmlSlurper, it makes sense to try to tackle this at the same time, I think. Just assign it back (or leave it unassigned) if you have no time (though any suggestions would be welcome), or add a comment if you think the feature request is not warranted and can be handled by other means.

        Paul King made changes -
        Assignee Paul King [ paulk ] John Wilson [ tug ]
        John Wilson added a comment -

        This is not a bug: text() does what it is designed to do, which is to return all the text in the element, including the text of nested elements.

        However, there is a need for an additional mechanism to handle mixed content, as in the example provided.

        I'm working on this.

        John Wilson made changes -
        Issue Type Bug [ 1 ] New Feature [ 2 ]
        Kevin C. Dorff added a comment -

        I went to write a parser today and was amazed to find I couldn't do this. Has there been any more thought or progress on this? I would also really like some way to get the actual contents of a NodeChild, something like

        def model = new XmlSlurper().parseText('<aModel><aParent name="bubba">text<aChild>child text</aChild></aParent></aModel>')

        assertEquals "text<aChild>child text</aChild>", model.aParent[0].contents()

        Randy Jones added a comment (edited) -

        I too did not expect this behavior. This is my workaround.

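        // Returns the parent's own text by stripping the children's text from the end
        // of parent.text(). Only correct when all of the parent's own text precedes
        // its child elements (e.g. "text<aChild>child text</aChild>").
        public static String groovyTwentyOneFifteen(def parent) {
            String all = parent.text()
            StringBuilder subtract = new StringBuilder()
            // Collect the text of every direct child (each includes its own descendants)
            parent.children().each {
                subtract.append(it.text())
            }
            return all.substring(0, all.size() - subtract.toString().size())
        }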
        
        Paul King made changes -
        Assignee John Wilson [ tug ]
        Jean-Louis Jouannic added a comment -

        Here is my workaround as a closure (based on the XPath text() function):

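        def localText = { parent ->
            // parent is expected to be a NodeChild; getAt(0) yields its underlying Node,
            // whose children list mixes text entries and nested Node elements
            def children = parent.getAt(0).children
            def result = [] as List
            for (child in children) {
                // Keep only the direct text entries, skipping nested elements
                if (!(child instanceof groovy.util.slurpersupport.Node)) {
                    result.add(child)
                }
            }
            return result
        }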
        

        Applied to the following XML:

        <root>aaa<sub-level>bbbb</sub-level>ccc</root>
        

        It gives the following String list:

        [aaa, ccc]
        
        Paul King added a comment -

        There is now a localText() method which returns the text from any text node(s) and ignores nested nodes.
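
A minimal sketch of how the new method compares with text(), reusing the two XML samples from this issue. It assumes Groovy 2.3.0 or later, where XmlSlurper is auto-imported from groovy.util, and it assumes localText() returns the direct text nodes as a list of strings rather than a single string. The variable names parent and root are only illustrative.

```groovy
// Assumes Groovy 2.3.0+ with XmlSlurper auto-imported from groovy.util.
// On Groovy 4+, add: import groovy.xml.XmlSlurper
def model = new XmlSlurper().parseText(
    '<aModel><aParent name="bubba">text<aChild>child text</aChild></aParent></aModel>')

def parent = model.aParent[0]

// text() concatenates the text of the element and all of its descendants
assert parent.text() == 'textchild text'

// localText() keeps only the element's direct text nodes (assumed to be a list)
assert parent.localText() == ['text']

// Mixed content with text on both sides of a nested element
def root = new XmlSlurper().parseText('<root>aaa<sub-level>bbbb</sub-level>ccc</root>')
assert root.localText() == ['aaa', 'ccc']

// Join the pieces when a single string is wanted
assert root.localText().join('') == 'aaaccc'
```

Joining the list is left to the caller, so the positions of the text segments around nested elements stay visible.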

        Paul King made changes -
        Resolution Fixed [ 1 ]
        Fix Version/s 2.3.0-beta-2 [ 20226 ]
        Assignee Paul King [ paulk ]
        Status Open [ 1 ] Resolved [ 5 ]
        Paul King made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Paul King
            Reporter:
            Ken Sayers
          • Votes:
            8
            Watchers:
            8

            Dates

            • Created:
              Updated:
              Resolved: