|
wow, awesome! i'll certainly take a look! if you made a zip of it as org.jaxen, then one could easily pull up both source trees in a diff program to view the changes.. we can then decide on where to house the changes ok here it is. This passes the tests 100%. Its still just about 37x faster on the performance test, but no faster on the functional tests - I think those are largely I/O bound. I've also put in my schematron impl (org.jaxen.schematron package). this has a performance test too, which compares the xsl/jaxen performance on a more complex set of patterns. While I can pass the jaxen tests 100%, some of these tests fail (see the commented out assertion test in the performance test). So I don't think this code is quite right yet. The schematron tests run about 20% faster than xalan with this code (nearly twice as fast as with old jaxen). Also, I found several more jaxen tests that violated the spec, due to * patterns matching more than the principal node type (sec 2.3). I've put the corrected tests in but commented them out, so tests.xml will pass both jaxen and my code. Have fun! this is a big change. i believe brian said that this direct approach may have problems. either way, the idea is a 2.0 idea This is a revised attack on a number of v2 issues. Largely rewritten from scratch with jdk1.5 work-a-like public interface, a javacc-generated parser, an interal class structure designed around the OPTMINCONTEXT algorithm. While I havent gone as far as that algorithm in this code, its still 3x faster on the perf test, and comparable on the navigator test. It should be possible with some renames and a bit of surgery on BaseXPath to bolt this into the existing Jaxen APIs. A bit more fiddling with this new build... I'm back to MUCH MUCH faster (jaxen runs the performance test in 31 seconds, the new code does it in 200 /milliseconds/), but still passing all the navigator tests; touch wood that it won't all blow up later on. Performance on the navigator tests is largely the same as jaxen's because it calls result.size() to test for a pass/fail - forcing my code to evaluate all nodes. The differences from the previous upload were mainly in making RelativeLocationPath use an iterator instead of a loop, so it's as lazy as it can be; and relying on an optimised navigator.getNodeType() instead of navigator.isElement() and friends. I've profiled the code and don't see any more low-hanging fruit, so I'm going to work on building this as a (hefty) patch to the main code now. how about working on this in a branch in CVS? A branch in CVS will be a good idea, but not quite yet. I'm doing too much messing around with packages still, cvs would be painful. Also, when I upgraded MacOS I realized that I no longer remembered the passphrase for my ssh key (which keychain had been managing for me). I am such a loser Anyhoo, I am slowly getting to enough stability to go for a branch (2.0-alpha?). I'm working on backwards compatibility at the moment, the patch doesn't need to be as radical as that attachment. Sometime this week. So, are there any plans to incorporate the optimizations to Jaxen? Accorging to benchmarks, Jaxen is the slowest XPath implementation among the competition: Thanks for posting this. For the moment, I'm more concerned about the three conformance failures they say they've found. I'll have to check those out. Correctness trumps speed in my book. That said, if someone felt like implementing this idea without changing the public API, and could demonstrate that this did noticeably improve performance, it would certainly go in. However if the optimization requires changes to the public API, especially backwards incompatible changes, we'd have to look more closely, and possibly release it as 2.0. Possible, but it would certainly take longer if we're going to break existing clients. Do current unit tests have enough coverage to demonstrate API-wise backwards compatibility? I could give a shot at refreshing Brian's stuff. FYI, the tests you cite were run with Jaxen 1.0. This is years out of date. Since then, a lot of work has gone into both conformance and optimization so we likely do better now. I'd be curious to see updated results. That said, the optimization suggested here may still be a good idea. As far as API compatibility, just make sure you don't change any existing public or protected class or method signatures, and you should be good. And of course the unti tests should all pass too. OK, I've dug out some old posts on jaxen-user made by Brian in Nov 2004: http://markmail.org/message/fhl327jznaxw3fbp They sort of lay out the proposed course of action WRT optimizing Jaxen. Do you agree with assertions presented there? As I understand, none of this has happened and 1.1.1 and 1.1.2 releases were made in the mean time. I've contacted Brian and he has prepared a cleaned up version of the patch with minimized differences (now it's only 30 kb). I'll be working on it, but just in case I'm attaching it here. The patch shouldn't break API compatibility (modulo some issues left to iron out) and Brian has just recently fixed some issues with it:
Additional notes from Brian: "In one case I just added an ArrayList wrapper which is a bit wasteful, to make the sort work I implemented a load more methods in LazyList the tests that are failing are: testUnboundNamespaceUsedInXPathExpression - this is likely to be because the test no longer triggers. There is code in BaseXpath now that unwraps some JaxenRuntimeExceptions, which was actually supposed to catch this test. Not sure why its not firing now. However, this test is only failing on illegal xpaths anyway - not a high priority testPrecedingAxisInDocumentOrder - this one should have been fixed by the sort() mentioned above. testIdentitySetUsageInDefaultNameStep - seems to be some issue with the way I'm uniquing the nodes after finding them. (there is a uniqueness check, but obviously its not working) testAncestorAxis, testPrecedingSiblingAxisIsInDocumentOrder,testPrecedingAxisIsInDocumentOrder,testAncestorOrSelfAxis - all to do with the sort order of nodes. I think the rest of the 21 are all dupes of these The kind of thing that is supposed to be much faster with this, btw, is doing things like boolean(//x) - which tests for the existence of a node in the entire tree. Since this just involves forward axes, the lazy-list version will stop when it hits the first matching node, rather than continuing to accumulate node-sets. There's some overhead, reverse-axis tests may be slower, count() and the like which require collection of all nodes are likely to be unaffected." Aleksander commented that there were still a few hotspots in this version, checking isElement() etc. I had a look over the code for any obvious optimisations. I think the tests we do there are unavoidable, and the number of calls ought to match up with released version of jaxen. However, I did spot a few further improvements. First thing: it turns out that removing the new ArrayList() from "Collections.reverse(new ArrayList(contextNodeSet));" in DefaultLocationPath took my failed tests down to 6 (really 2, duplicated because IDEA runs the suites and the tests). Its unclear to me why this would have any effect at all (which is worrying). Next, optimisations: there were a few things that jumped out. Any time we call context.size() we force the LazyLists introduced in the previous patch to fill up. The code quite nastily uses 'list.size() == 0' to test for emptiness, rather than using 'list.isEmpty()'. That's just bad, there's no need to know the actual size to test if a list is empty, and the LazyLists are optimised for that case; so I replaced those. There was also an avoidable use of size() testing 'size() ==1' prior to a list.get(0); I fixed this to use the iterator. DefaultStep was not using iterators, so non-name-steps were evaluating the size. Fixing all of this did not change the test situation at all, but should have a good impact on perf. There's two users of size() which are more complicated: PredicateSet and Context. Context is mutable, which is a big mistake; worse, the size, position, and nodeset are mutable separately. In order for some of the logic to work, if the nodeset changes, the position may need reset to zero; but only if the position is not explicitly set before it is used. Anyway, I've updated that code slightly so that the right thing happens, but where possible we avoid calling nodeset.size(). I've not updated the PredicateSet code - which is more complicated - so it remains the case that uses of predicates will trigger slow paths. patch attached. Looking good. I would like to get the failed tests down to zero though. We could probably make Context immutable as part of a larger revision. Brian, you're a committer right? (A bit off topic here) yes, or more accurately I was a committer. I hadn't pushed anything for a while, and then an ssh key vulnerability was announced, which led to all vulnerable ssh keys - including mine - being disabled everywhere. I never submitted a new one, so effectively I'm out of the loop. I'm still monitoring the mailing list and following you on twitter tho! Also, I wouldn't expect to see this land until all the tests pass; its not quite there yet, though its not far off (2 real failures now). So pushing all of this isn't an option for now, though as I see you've just commented elsewhere, landing bits of it, like isEmpty(), might be useful. This version of the patch passes all tests (w00t!), and reduces the number of places that force the lazy list to be filled. I've refactored the lazy list itself a bit, so that a separate method, fill(), is called when filling is forced for the first time. This makes it easy to find the culprits in the debugger. Debugging XOMPerformance.java, there is only one culprit - the call to Collections.sort() in DefaultLocationPath. Sorting should always be unnecessary - since we deal with nodes that are either already sorted in a forward or reverse direction, we at most need to perform a reverse and a merge. BTW, I am seeing this version as being /slower/ for that test than svn head - 5.9s versus 4.9s. Seems like a lot of overhead wrapping iterators in lists in iterators... Aleksander, how does this perform in your real world case? Ok, I have gone one step further and turned off the sort whenever the location path only uses the parent, self, or child axes (a very common case). The theory here is that given a list of nodes X in document order, then X/parent::, X/self::, and X/child::* are also in document order; that a whole location path consisting of these axes is also ordered follows by induction. Other axes do not support this assertion, and removing the sort for them would trigger All tests still pass. The performance gains are back towards the title of this bug - XOMPerformance goes from 4.9s to 97ms on my machine. Its a shame I can't get this simple trick to work for descendant-or-self but this is quite a big win as it stands. BTW, I've already started to adapt jBPM-BPEL to take advantage of Jaxen LazyLists: |
|||||||||||||||||||||||||||||||||||||||||||||||||||||
BTW if bob or peter take a look at this... as this touches all over the code, I was planning to submit as a zip of the entire tree, not patches, with the code in some distinct package (org.jaxen2 currently, but I'll probably rename that to org.jaxen.v2 to stay under the right root).
I don't touch the saxpath stuff, and wanted to leave the navigators alone, but unfortunately Navigator has a method to return an XPath so it was simpler to copy everything. I see that method as a cyclic dependency, I think it should be dropped. If I fix this there's less duplicate code to submit, but it makes a non-backward compatible change in jaxen. Your preference.
If you have the IDEA xml file for the haus coding conventions, it would be great if it could be posted somewhere? I've got something approximately the same I'm using to format my code but it would be handy to be able to format all this the way you want. Of course, you can do that once you see the code...
My tests share their data with the jaxen tests. This means that when jaxen's gone deliberately off-spec (for substring()) one or other must break. I'm not keen on checking in broken tests but thats already happened for the 'preceding' axis tests? I'd rather comment out those tests for both, so gump works.
Any comments, better ideas appreciated.