Issue Details (XML | Word | Printable)

Key: JAXEN-31
Type: Improvement Improvement
Status: Open Open
Priority: Major Major
Assignee: Unassigned
Reporter: Brian Ewins
Votes: 3
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
jaxen

if jaxen stopped using lists it would be 37x faster

Created: 11/Aug/04 06:05 AM   Updated: 23/Oct/09 05:41 PM
Return to search
Component/s: None
Affects Version/s: 1.0
Fix Version/s: 2.0

Time Tracking:
Original Estimate: 1 week
Original Estimate - 1 week
Remaining Estimate: 1 week
Remaining Estimate - 1 week
Time Spent: Not Specified
Time Spent - Not Specified

File Attachments: 1. Text File Brian_Ewins_JAXEN-31_perf.patch (32 kB)
2. Zip Archive jaxen.zip (1.36 MB)
3. Text File perf.patch (43 kB)
4. Text File perf.patch (42 kB)
5. Text File perf.patch (41 kB)
6. File stuff.tgz (148 kB)



 Description  « Hide

The short story - I've rewritten jaxen to use a nodeset abstraction instead of a list to accumulate results. With the unit tests at nearly 100%, the performance test on dom4j runs in ~200ms on my 1GHz powerbook, compared to ~7400ms for the nightly jaxen build. The failing unit tests are for paths not covered in the performance test.

The long story - I wrote an implementation of the draft ISO Schematron spec on top of Jaxen, instead of the more usual XSLT. The schematron spec is designed so this can be done, mainly for performance reasons. Once I had enough working I was able to compare jaxen with xalan on the 'screamatron' torture test, which is using schematron to validate the schematron 1.5 spec. I was suprised to find that jaxen was about 2x /slower/ than xalan, and marginally slower with a jdom navigator than a dom4j navigator. Looking at why it was so slow, it became clear that the problem was that throughout jaxen it's building lists of intermediate results to filter down, instead of generating the 'next' result on demand. There's comments about this in several places in the code, but the list thing is so ingrained it needs more than a little refactoring to fix.

I decided to have a go. This change produces apis incompatible with previous versions of jaxen, so this will be a lump of code not a wee patch. I've basically cloned jaxen to 'org.jaxen2' to work on it. I'm nearly done with the changes, failing tests for the non-xpath extension functions which I havent implemented yet, patterns (not touched at all yet), and a couple of minor bugs.

I'll post the code on jira once I'm done, but here's the gist of the changes I made:

  • Introduce a Nodeset abstraction instead of a list...
    public interface extends Iterator {
    Object current();
    int position();
    // use of last() replaces the internal iterator with an iterator over a list caching results.
    // it's basically a fallback on what jaxen used to do.
    int last();
    }
    the reason for this is obvious, eg you can just test 'hasNext()' to check the boolean value of a nodeset.
  • change the signature of Expr to allow the selection of a fast code path
    public interface Expr {
    // these require context because of variable/function refs only. see later.
    boolean isBoolean(Context ctx);
    boolean isNumber(Context ctx);
    boolean isString(Context ctx);
    boolean isNodeset(Context ctx);
    boolean booleanValue(Context ctx);
    double numberValue(Context ctx);
    String stringValue(Context ctx);
    Nodeset nodesetValue(Context ctx);
    // etc
    }
    this is really for my own sanity as there's so much code to maintain. Throughout jaxen, everything is untyped, so its fairly easy to (eg) pass a nodeSet when you meant to pass a node; there's instanceof checks, == null checks, and casts everywhere to deal with this. Once you put everything behind an interface like the one above, making objects responsible for their own casts, a lot of the code is simplified, eg addition is just expr1.numberValue(ctx) + expr2.numberValue(ctx). All but a couple of classes inherit from abstract base classes that implement all but one of those methods (eg StringExprBase leaves stringValue() abstract.)

Variable/Function refs don't have a type without context in my version, and like Jaxen I look up functions, variables every time. This isn't smart - in xpath we can resolve all variable, function refs beforehand, and get faster code as a result. I'll probably change things so I have 'bound' and 'unbound' exprs, add 'BoundExpr bind(VariableContext,FunctionContext)'. I'll still need to pass these as members of Context to support the 'evaluate' extension function though.

Another approach to the same concern that I didn't try is to pass in a typed context, eg 'StringContext', 'NodesetContext', etc (think of this as allowing things like perl's 'wantarray()'. That approach would make it easier to expand the number of supported types (eg for xpath2) in future, I'll try it once the tests pass.

  • moved all the functions to a single utility class
    public class Functions {
    String substring(String s, double start, double end);
    // etc
    }
    this makes it easier to unit test the xpath functions, since they are mostly context-independent. Functions that need a navigator get it passed as the first arg, functions that optionally use the context node instead of an argument have that function added as a method of Context. Incidentally I changed some of the unit tests to get mine to pass - in particular I changed the substring() tests back to match the spec (they don't currently).

No function in my version returns null. (compare to eg the name, namespaceuri functions of Navigator). The xpath spec uses "" or an empty nodeset throughout, so do I, this gets rid of a lot of == null checks.

Some of the bugs against the current version of jaxen are fixed by this code - eg the use of double throughout allows gt/lt comparisons of longs (bug jaxen-28), + both of the bugs I filed previously.

I'd estimate I can finish this in the next week.



Brian Ewins added a comment - 11/Aug/04 07:15 AM

BTW if bob or peter take a look at this... as this touches all over the code, I was planning to submit as a zip of the entire tree, not patches, with the code in some distinct package (org.jaxen2 currently, but I'll probably rename that to org.jaxen.v2 to stay under the right root).

I don't touch the saxpath stuff, and wanted to leave the navigators alone, but unfortunately Navigator has a method to return an XPath so it was simpler to copy everything. I see that method as a cyclic dependency, I think it should be dropped. If I fix this there's less duplicate code to submit, but it makes a non-backward compatible change in jaxen. Your preference.

If you have the IDEA xml file for the haus coding conventions, it would be great if it could be posted somewhere? I've got something approximately the same I'm using to format my code but it would be handy to be able to format all this the way you want. Of course, you can do that once you see the code...

My tests share their data with the jaxen tests. This means that when jaxen's gone deliberately off-spec (for substring()) one or other must break. I'm not keen on checking in broken tests but thats already happened for the 'preceding' axis tests? I'd rather comment out those tests for both, so gump works.

Any comments, better ideas appreciated.


peter royal added a comment - 11/Aug/04 11:12 AM

wow, awesome! i'll certainly take a look!

if you made a zip of it as org.jaxen, then one could easily pull up both source trees in a diff program to view the changes.. we can then decide on where to house the changes


Brian Ewins added a comment - 14/Aug/04 05:46 PM

ok here it is. This passes the tests 100%. Its still just about 37x faster on the performance test, but no faster on the functional tests - I think those are largely I/O bound. I've also put in my schematron impl (org.jaxen.schematron package). this has a performance test too, which compares the xsl/jaxen performance on a more complex set of patterns. While I can pass the jaxen tests 100%, some of these tests fail (see the commented out assertion test in the performance test). So I don't think this code is quite right yet. The schematron tests run about 20% faster than xalan with this code (nearly twice as fast as with old jaxen). Also, I found several more jaxen tests that violated the spec, due to * patterns matching more than the principal node type (sec 2.3). I've put the corrected tests in but commented them out, so tests.xml will pass both jaxen and my code. Have fun!


peter royal added a comment - 20/Feb/05 11:30 AM

this is a big change. i believe brian said that this direct approach may have problems.

either way, the idea is a 2.0 idea


Brian Ewins added a comment - 15/May/05 07:34 PM

This is a revised attack on a number of v2 issues. Largely rewritten from scratch with jdk1.5 work-a-like public interface, a javacc-generated parser, an interal class structure designed around the OPTMINCONTEXT algorithm. While I havent gone as far as that algorithm in this code, its still 3x faster on the perf test, and comparable on the navigator test. It should be possible with some renames and a bit of surgery on BaseXPath to bolt this into the existing Jaxen APIs.


Brian Ewins added a comment - 19/May/05 04:14 AM

A bit more fiddling with this new build... I'm back to MUCH MUCH faster (jaxen runs the performance test in 31 seconds, the new code does it in 200 /milliseconds/), but still passing all the navigator tests; touch wood that it won't all blow up later on. Performance on the navigator tests is largely the same as jaxen's because it calls result.size() to test for a pass/fail - forcing my code to evaluate all nodes.

The differences from the previous upload were mainly in making RelativeLocationPath use an iterator instead of a loop, so it's as lazy as it can be; and relying on an optimised navigator.getNodeType() instead of navigator.isElement() and friends. I've profiled the code and don't see any more low-hanging fruit, so I'm going to work on building this as a (hefty) patch to the main code now.


peter royal added a comment - 19/May/05 06:54 AM

how about working on this in a branch in CVS?


Brian Ewins added a comment - 25/May/05 08:45 AM

A branch in CVS will be a good idea, but not quite yet. I'm doing too much messing around with packages still, cvs would be painful. Also, when I upgraded MacOS I realized that I no longer remembered the passphrase for my ssh key (which keychain had been managing for me). I am such a loser . I'll sort this out soon.

Anyhoo, I am slowly getting to enough stability to go for a branch (2.0-alpha?). I'm working on backwards compatibility at the moment, the patch doesn't need to be as radical as that attachment. Sometime this week.


Aleksander Adamowski added a comment - 12/Oct/09 05:24 AM

So, are there any plans to incorporate the optimizations to Jaxen?

Accorging to benchmarks, Jaxen is the slowest XPath implementation among the competition:
http://www.asciiarmor.com/post/33736620/java-xpath-1-0-engine-comparison-performance


Elliotte Rusty Harold added a comment - 12/Oct/09 06:32 AM

Thanks for posting this. For the moment, I'm more concerned about the three conformance failures they say they've found. I'll have to check those out. Correctness trumps speed in my book.

That said, if someone felt like implementing this idea without changing the public API, and could demonstrate that this did noticeably improve performance, it would certainly go in. However if the optimization requires changes to the public API, especially backwards incompatible changes, we'd have to look more closely, and possibly release it as 2.0. Possible, but it would certainly take longer if we're going to break existing clients.


Aleksander Adamowski added a comment - 12/Oct/09 06:41 AM

Do current unit tests have enough coverage to demonstrate API-wise backwards compatibility? I could give a shot at refreshing Brian's stuff.


Elliotte Rusty Harold added a comment - 12/Oct/09 07:04 AM

FYI, the tests you cite were run with Jaxen 1.0. This is years out of date. Since then, a lot of work has gone into both conformance and optimization so we likely do better now. I'd be curious to see updated results.

That said, the optimization suggested here may still be a good idea. As far as API compatibility, just make sure you don't change any existing public or protected class or method signatures, and you should be good. And of course the unti tests should all pass too.


Aleksander Adamowski added a comment - 12/Oct/09 08:15 AM

OK, I've dug out some old posts on jaxen-user made by Brian in Nov 2004:

http://markmail.org/message/fhl327jznaxw3fbp
http://markmail.org/message/c226vid7yaoofdfm

They sort of lay out the proposed course of action WRT optimizing Jaxen.

Do you agree with assertions presented there?

As I understand, none of this has happened and 1.1.1 and 1.1.2 releases were made in the mean time.
Do those plans need to be revised, then? Or are they more or less still correct?


Aleksander Adamowski added a comment - 16/Oct/09 12:16 PM

I've contacted Brian and he has prepared a cleaned up version of the patch with minimized differences (now it's only 30 kb).

I'll be working on it, but just in case I'm attaching it here.

The patch shouldn't break API compatibility (modulo some issues left to iron out) and Brian has just recently fixed some issues with it:

  • off-by-one error in PredicateSet
  • sort and reverse support in DefaultLocationPath

Additional notes from Brian:

"In one case I just added an ArrayList wrapper which is a bit wasteful, to make the sort work I implemented a load more methods in LazyList

the tests that are failing are: testUnboundNamespaceUsedInXPathExpression - this is likely to be because the test no longer triggers. There is code in BaseXpath now that unwraps some JaxenRuntimeExceptions, which was actually supposed to catch this test. Not sure why its not firing now. However, this test is only failing on illegal xpaths anyway - not a high priority

testPrecedingAxisInDocumentOrder - this one should have been fixed by the sort() mentioned above.

testIdentitySetUsageInDefaultNameStep - seems to be some issue with the way I'm uniquing the nodes after finding them. (there is a uniqueness check, but obviously its not working)

testAncestorAxis, testPrecedingSiblingAxisIsInDocumentOrder,testPrecedingAxisIsInDocumentOrder,testAncestorOrSelfAxis - all to do with the sort order of nodes.

I think the rest of the 21 are all dupes of these

The kind of thing that is supposed to be much faster with this, btw, is doing things like boolean(//x) - which tests for the existence of a node in the entire tree. Since this just involves forward axes, the lazy-list version will stop when it hits the first matching node, rather than continuing to accumulate node-sets. There's some overhead, reverse-axis tests may be slower, count() and the like which require collection of all nodes are likely to be unaffected."


Brian Ewins added a comment - 19/Oct/09 07:05 AM

Aleksander commented that there were still a few hotspots in this version, checking isElement() etc. I had a look over the code for any obvious optimisations. I think the tests we do there are unavoidable, and the number of calls ought to match up with released version of jaxen. However, I did spot a few further improvements.

First thing: it turns out that removing the new ArrayList() from "Collections.reverse(new ArrayList(contextNodeSet));" in DefaultLocationPath took my failed tests down to 6 (really 2, duplicated because IDEA runs the suites and the tests). Its unclear to me why this would have any effect at all (which is worrying).

Next, optimisations: there were a few things that jumped out. Any time we call context.size() we force the LazyLists introduced in the previous patch to fill up. The code quite nastily uses 'list.size() == 0' to test for emptiness, rather than using 'list.isEmpty()'. That's just bad, there's no need to know the actual size to test if a list is empty, and the LazyLists are optimised for that case; so I replaced those. There was also an avoidable use of size() testing 'size() ==1' prior to a list.get(0); I fixed this to use the iterator. DefaultStep was not using iterators, so non-name-steps were evaluating the size. Fixing all of this did not change the test situation at all, but should have a good impact on perf.

There's two users of size() which are more complicated: PredicateSet and Context. Context is mutable, which is a big mistake; worse, the size, position, and nodeset are mutable separately. In order for some of the logic to work, if the nodeset changes, the position may need reset to zero; but only if the position is not explicitly set before it is used. Anyway, I've updated that code slightly so that the right thing happens, but where possible we avoid calling nodeset.size(). I've not updated the PredicateSet code - which is more complicated - so it remains the case that uses of predicates will trigger slow paths.

patch attached.


Elliotte Rusty Harold added a comment - 19/Oct/09 07:51 AM

Looking good. I would like to get the failed tests down to zero though.

We could probably make Context immutable as part of a larger revision.

Brian, you're a committer right?


Brian Ewins added a comment - 19/Oct/09 08:13 AM

(A bit off topic here) yes, or more accurately I was a committer. I hadn't pushed anything for a while, and then an ssh key vulnerability was announced, which led to all vulnerable ssh keys - including mine - being disabled everywhere. I never submitted a new one, so effectively I'm out of the loop. I'm still monitoring the mailing list and following you on twitter tho!

Also, I wouldn't expect to see this land until all the tests pass; its not quite there yet, though its not far off (2 real failures now). So pushing all of this isn't an option for now, though as I see you've just commented elsewhere, landing bits of it, like isEmpty(), might be useful.


Brian Ewins added a comment - 19/Oct/09 01:11 PM

This version of the patch passes all tests (w00t!), and reduces the number of places that force the lazy list to be filled. I've refactored the lazy list itself a bit, so that a separate method, fill(), is called when filling is forced for the first time. This makes it easy to find the culprits in the debugger.

Debugging XOMPerformance.java, there is only one culprit - the call to Collections.sort() in DefaultLocationPath. Sorting should always be unnecessary - since we deal with nodes that are either already sorted in a forward or reverse direction, we at most need to perform a reverse and a merge.

BTW, I am seeing this version as being /slower/ for that test than svn head - 5.9s versus 4.9s. Seems like a lot of overhead wrapping iterators in lists in iterators... Aleksander, how does this perform in your real world case?


Brian Ewins added a comment - 20/Oct/09 04:30 AM

Ok, I have gone one step further and turned off the sort whenever the location path only uses the parent, self, or child axes (a very common case). The theory here is that given a list of nodes X in document order, then X/parent::, X/self::, and X/child::* are also in document order; that a whole location path consisting of these axes is also ordered follows by induction. Other axes do not support this assertion, and removing the sort for them would trigger JAXEN-55. Do you agree with this analysis Elliotte?

All tests still pass. The performance gains are back towards the title of this bug - XOMPerformance goes from 4.9s to 97ms on my machine. Its a shame I can't get this simple trick to work for descendant-or-self but this is quite a big win as it stands.


Aleksander Adamowski added a comment - 23/Oct/09 05:41 PM

BTW, I've already started to adapt jBPM-BPEL to take advantage of Jaxen LazyLists:

https://jira.jboss.org/jira/browse/BPEL-313