Primarily for backward compatibility, I suppose. The existing XPath rules in PMD, and those written by PMD users, already take the form //SomeType. Also, the GUI designer for crafting XPath queries is not aware of this optimization and would fail to apply the modified query correctly (not that it couldn't be changed).
I think we'd prefer to keep it an internal PMD behavior if possible, instead of making it part of the external API. Ideally, a Rule designer doesn't need to know anything about this, and can still think in terms of an XPath evaluated from the Root of the Java AST. This also leaves PMD with the most options going forward. Perhaps when Jaxen 1.2 or 2.0 comes out and it performs much better than 1.0, we can switch tactics, transparently to the users.
Also, we're not assuming every XPath can be used with the Visitor approach. While those built into PMD can, who knows what custom queries are out there. If the parsed Expr of an XPath does not match the exact structure we know we can handle, then we will fall back to using the slow, but fully functional, original query from the Root of the Java AST. This doesn't speak to the need to modify the query, but to the need to examine the Jaxen AST produced.
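To make the fallback idea concrete, here is a minimal sketch. It does not use Jaxen's real expression classes: Step is a hypothetical stand-in for one parsed location step, and the names visitableNodeType and plan are mine, not PMD's. The only XPath fact it relies on is that //SomeType is shorthand for /descendant-or-self::node()/child::SomeType. It also ignores predicates and whether the path is absolute, which a real check would have to look at.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class XPathShapeCheck {
    /** Hypothetical stand-in for one parsed location step (axis plus node test). */
    static final class Step {
        final String axis;
        final String nodeTest;

        Step(String axis, String nodeTest) {
            this.axis = axis;
            this.nodeTest = nodeTest;
        }
    }

    /**
     * Returns the AST node type to visit if the steps begin with the expansion
     * of "//SomeType"; otherwise returns empty, meaning the query shape is not
     * one we recognize and the caller should fall back.
     */
    static Optional<String> visitableNodeType(List<Step> steps) {
        if (steps.size() < 2) {
            return Optional.empty();
        }
        Step first = steps.get(0);
        Step second = steps.get(1);
        boolean leadingDescendant = first.axis.equals("descendant-or-self")
                && first.nodeTest.equals("node()");
        // Require a plain element name: no wildcard, no node-type test.
        boolean namedChild = second.axis.equals("child")
                && !second.nodeTest.equals("*")
                && !second.nodeTest.endsWith("()");
        return leadingDescendant && namedChild
                ? Optional.of(second.nodeTest)
                : Optional.empty();
    }

    /** Describes which evaluation strategy would be chosen for the steps. */
    static String plan(List<Step> steps) {
        return visitableNodeType(steps)
                .map(type -> "visit " + type + " nodes")
                .orElse("evaluate from root");
    }

    public static void main(String[] args) {
        List<Step> recognized = Arrays.asList(
                new Step("descendant-or-self", "node()"),
                new Step("child", "MethodDeclaration"));
        List<Step> unrecognized = Arrays.asList(
                new Step("child", "CompilationUnit"));
        System.out.println(plan(recognized));
        System.out.println(plan(unrecognized));
    }
}
```

The point of returning an Optional is that "not recognized" is the normal, safe outcome: anything the check does not understand goes down the original path unchanged.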
This is unrelated to your interest in the above, but is instead part of the logical progression of PMD's usage of XPath. I'm currently working on taking this one step further: generating a Java Rule from the Jaxen AST. The goal is to get this working at compile time, and perhaps at runtime (via ASM/BCEL) once the solution is well understood. This is a rather fun problem. My initial brain-dead approach, generating looping node-set filters in series from the XPath steps/predicates, results in rules which are a fair bit faster than even the truncated XPath queries. I'm now looking into optimization scenarios where I can combine adjacent filters into a single looping construct. I found the Gottlob, Koch, Pichler paper via Google, and am looking into that now. Part of the paper talks about scenarios in which only certain parts of the context are needed, which is consistent with the ideas I had (reassuring, that). I'm hoping to create a full node-set only when absolutely necessary. I'm also looking to short-circuit node-set creation when the result will ultimately be used as a particular data type (e.g. a node-set being converted to a boolean can produce the boolean result as soon as the first node is found). I need to read the paper more closely to see if they mention this as a viable approach too. I'm also considering some count() related optimizations, among other things. Essentially I can choose which parts of the XPath spec I want to handle, cut various corners, and optimize in ways that Jaxen likely cannot/should not.
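A rough illustration of the difference between filters in series and a fused loop with early exit, in plain Java. This is not the generated code itself: the class and method names are made up for the example, and plain Predicate objects stand in for compiled XPath steps/predicates. It also assumes none of the filters depend on position() or last(); those need the size of the intermediate node-set, so they cannot be fused this naively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class NodeSetFilters {
    /** Naive form: every filter builds a complete intermediate list. */
    static <T> List<T> filterInSeries(List<T> nodes, List<Predicate<T>> filters) {
        List<T> current = nodes;
        for (Predicate<T> filter : filters) {
            List<T> next = new ArrayList<>();
            for (T node : current) {
                if (filter.test(node)) {
                    next.add(node);
                }
            }
            current = next;
        }
        return current;
    }

    /**
     * Fused form for a boolean result: one loop over the nodes, all filters
     * applied per node, returning as soon as one node passes every filter.
     * No intermediate node-set is created.
     */
    static <T> boolean anyMatchFused(List<T> nodes, List<Predicate<T>> filters) {
        for (T node : nodes) {
            boolean passesAll = true;
            for (Predicate<T> filter : filters) {
                if (!filter.test(node)) {
                    passesAll = false;
                    break;
                }
            }
            if (passesAll) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ab", "abc", "abcd", "x");
        List<Predicate<String>> filters = new ArrayList<>();
        filters.add(s -> s.startsWith("a"));
        filters.add(s -> s.length() > 2);
        System.out.println(filterInSeries(names, filters));
        System.out.println(anyMatchFused(names, filters));
    }
}
```

Both forms agree on whether any node survives; the fused one simply stops looking once the answer is known, which is the boolean short-circuit described above.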
Perhaps the final result of this in PMD will be a single XPath Rule writing interface, but with 3 possible tiers of performance, depending on the XPath query:
1) Pure XPath on the entire AST. (slow)
2) Modified Jaxen XPath on branches of the AST. (average)
3) Compiled XPath on branches of the AST. (fast)
All of this would not be possible, without significant extra work, if Jaxen did not expose the underlying AST as it does. Further, I like the design of the interfaces in Jaxen. They really made it easier for me to come away with a deeper understanding of the XPath spec, and to get my head around the problems I was trying to solve.
Kudos to the Jaxen team!
Sorry, I ramble a lot I know.