As you say, this is to do with 3.7 (the lexical structure rules in the XPath spec). Short answer: we had this working in the javacc version of the parser ages ago, and that implementation is at the bottom of this comment. Saxpath should have been binned years ago, etc etc.
Long answer: that tokenizer was written to be stateful, using the previous token to decide what kind of token the current one should be emitted as. The same fix would probably work here. A direct translation doesn't correspond to a change in 'star()', but to a second switch statement in 'nextToken()'.
An implementation would be: instead of storing 'previousToken' in XPathLexer, store 'boolean expectOperator = false;'. Remove the switch statement from identifierOrOperatorName(), place it at the end of nextToken(), and use it to set 'expectOperator'. In the first switch statement in nextToken(), the entry for '*' should read:
    case '*':
    {
        if (expectOperator) {
            token = star();
        } else {
            // or something less ambiguous; I haven't checked the knock-on effects of this
            token = identifier();
        }
        break;
    }
and the entry for default:
    default:
    {
        if ( isIdentifierStartChar( LA(1) ) )
        {
            if (expectOperator) {
                token = operatorName();
            } else {
                token = identifier();
            }
        }
    }
This boils down to much the same change as in the first comment, but there should be a bit less duplication of the previousToken logic.
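To make the idea concrete, here is a rough, self-contained sketch of a lexer driven by an 'expectOperator' flag. It is not the real XPathLexer: the class name, the string token kinds, the helper lists and the simplified name rule are all made up for illustration, and an unknown name in operator position is simply passed through as an identifier rather than rejected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, cut-down lexer for illustration only; not the real XPathLexer.
final class StatefulLexerSketch {
    private static final List<String> TWO_CHAR =
            Arrays.asList("::", "//", "..", "!=", "<=", ">=");
    // Symbols that end an operand, so an operator is expected next.
    private static final List<String> OPERAND_END = Arrays.asList(")", "]", ".", "..");
    private static final List<String> OPERATOR_NAMES =
            Arrays.asList("and", "or", "mod", "div");

    private final String src;
    private int pos = 0;
    // Set at the end of nextToken(); decides how the following '*' or name is read.
    private boolean expectOperator = false;

    StatefulLexerSketch(String src) { this.src = src; }

    // Tokenizes the whole input into "KIND:text" strings.
    static List<String> tokenize(String xpath) {
        StatefulLexerSketch lexer = new StatefulLexerSketch(xpath);
        List<String> out = new ArrayList<>();
        String t;
        while ((t = lexer.nextToken()) != null) {
            out.add(t);
        }
        return out;
    }

    String nextToken() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        if (pos >= src.length()) return null;

        char c = src.charAt(pos);
        String kind;
        String text;
        if (c == '*') {
            // First switch: '*' is an operator only when one is expected.
            pos++;
            text = "*";
            kind = expectOperator ? "MULTIPLY" : "NAME_TEST";
        } else if (Character.isLetter(c) || c == '_') {
            int start = pos;
            while (pos < src.length() && isNameChar(src.charAt(pos))) pos++;
            text = src.substring(start, pos);
            // Simplification: unknown names in operator position stay identifiers.
            kind = (expectOperator && OPERATOR_NAMES.contains(text)) ? "OPERATOR" : "IDENTIFIER";
        } else if (Character.isDigit(c)) {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            text = src.substring(start, pos);
            kind = "NUMBER";
        } else {
            String two = pos + 1 < src.length() ? src.substring(pos, pos + 2) : "";
            text = TWO_CHAR.contains(two) ? two : String.valueOf(c);
            pos += text.length();
            kind = "SYMBOL";
        }

        // Second switch: record what the *next* token should be read as.
        switch (kind) {
            case "IDENTIFIER":
            case "NAME_TEST":
            case "NUMBER":
                expectOperator = true;
                break;
            case "SYMBOL":
                expectOperator = OPERAND_END.contains(text);
                break;
            default: // MULTIPLY, OPERATOR
                expectOperator = false;
        }
        return kind + ":" + text;
    }

    // Simplified NCName character test (assumption, not the full XML rule).
    private static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-';
    }

    public static void main(String[] args) {
        System.out.println(tokenize("* * *"));
        System.out.println(tokenize("div div div"));
    }
}
```

With this sketch, "* * *" comes out as [NAME_TEST:*, MULTIPLY:*, NAME_TEST:*] and "div div div" as [IDENTIFIER:div, OPERATOR:div, IDENTIFIER:div], which is the behaviour the 3.7 rule asks for. The point of the design is that the only state carried between calls is a single boolean, rather than the whole previous token.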
For completeness, here's how it looked in javacc:
/*
If there is a preceding token and the preceding token is not one of @, ::,
(, [, , or an Operator, then a * must be recognized as a MultiplyOperator
and an NCName must be recognized as an OperatorName.
*/
<OP> TOKEN: {
    <MultiplyOperator: "*"> : DEFAULT
  | <AND: "and"> : DEFAULT
  | <OR: "or"> : DEFAULT
  | <MOD: "mod"> : DEFAULT
  | <DIV: "div"> : DEFAULT
}
<*> TOKEN: {
    <LEFT_PAREN: "("> : DEFAULT
  | <RIGHT_PAREN: ")"> : OP
  | <LEFT_SQUARE: "["> : DEFAULT
  | <RIGHT_SQUARE: "]"> : OP
  | <PERIODS: ".."> : OP
  | <Number: <Digits> ("." (<Digits>)?)? | "." <Digits>>
  | <PERIOD: "."> : OP
  | <AT: "@"> : DEFAULT
  | <COMMA: ","> : DEFAULT
  | <COLONS: "::"> : DEFAULT
  | <SLASHES: "//"> : DEFAULT
  | <SLASH: "/"> : DEFAULT
  | <PIPE: "|"> : DEFAULT
  | <PLUS: "+"> : DEFAULT
  | <MINUS: "-"> : DEFAULT
  | <EQ: "="> : DEFAULT
  | <NE: "!="> : DEFAULT
  | <LTE: "<="> : DEFAULT
  | <LT: "<"> : DEFAULT
  | <GTE: ">="> : DEFAULT
  | <GT: ">"> : DEFAULT
  | <Literal: "\"" (~["\""])* "\"" | "'" (~["'"])* "'"> : OP
  | <VariableReference: "$" <QName>> : OP
}
I can confirm this one. I've added a test case for it.