GROOVY-2318

Lexer fails on forward slash used in division

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1-rc-2
    • Fix Version/s: None
    • Component/s: lexer
    • Labels:
      None
    • Environment:
      OS X Tiger 10.4.11
    • Testcase included:
      yes
    • Number of attachments :
      0

      Description

      This simple test is failing (I have groovy-all on classpath):

      import groovyjarjarantlr.*;
      import java.io.*;
      import org.codehaus.groovy.antlr.parser.GroovyLexer;
      public class Main {
          public static void main(String[] args) throws TokenStreamException {
              String exp = "println 4 / 2 + 3";
              InputStream inputStream = new ByteArrayInputStream(exp.getBytes());
              GroovyLexer lexer = new GroovyLexer(inputStream);
              while (true) {
                  Token token = lexer.nextToken();
                  if (token.getType() == Token.EOF_TYPE) return;
                  System.out.println("token = " + token);
              }
          }
      }
      

      with error:

      token = ["println",<84>,line=1,col=1]
      token = ["4",<194>,line=1,col=9]
      Exception in thread "main" line 1:18: unexpected char: 0xFFFF
              at org.codehaus.groovy.antlr.parser.GroovyLexer.nextToken(GroovyLexer.java:687)
              at Main.main(Main.java:14)
      Java Result: 1
      

      If I replace the '/' character with '*' or something else, it works fine.

        Activity

        Martin Adamek added a comment -

        I guess this will become invalid, as I found at http://groovy.codehaus.org/Migration+From+Classic+to+JSR+syntax that '\' was used in the old syntax for division and that one should now use the intdiv() function. Is that correct? If so, why does it compile when one uses '/' for int division?

        Paul King added a comment -

        Add code tags

        Paul King made changes -
        Field Original Value New Value
        Description (the Description text above, without markup) (the same text, with the code and the error output each wrapped in {code} tags)
        Roshan Dawrani added a comment -

        Just wanted to add that it fails because, in "println 4 / 2 + 3", the "/" is seen as the start of a regular expression, and since the lexer does not find the matching closing "/" for that regular expression, it fails.

        So, "println 4 / 2 + 3 /" yields the tokens as:

        token = ["println",<84>,line=1,col=1,lineLast=1,colLast=8]
        token = ["4",<194>,line=1,col=9,lineLast=1,colLast=10]
        token = [" 2 + 3 ",<85>,line=1,col=11,lineLast=1,colLast=20] // "2 + 3" seen as the regex value
        
        Roshan Dawrani added a comment - - edited

        I think I now know why the lexer fails to process "/" when your code tries to tokenize "println 4 / 2 + 3".

        The lexer fails to see "/" as the division operator, and instead sees it as the "/" that starts a regex, because it needs to know the type of the last token it processed, and your code does not set that.

        I have made a small change to your code so that it records the last token type:

        package org.codehaus.groovy.antlr.parser;
        
        import groovyjarjarantlr.*;
        import java.io.*;
        import org.codehaus.groovy.antlr.parser.GroovyLexer;
        public class Main {
            public static void main(String[] args) throws TokenStreamException {
                String exp = "println 4 / 2 + 3";
                InputStream inputStream = new ByteArrayInputStream(exp.getBytes());
                GroovyLexer lexer = new GroovyLexer(inputStream);
                while (true) {
                    Token token = lexer.nextToken();
                    if (token.getType() == Token.EOF_TYPE) return;
                    // Roshan: make a note that last token processed is "4", an int, so "/" is to be taken as division operator
                    lexer.lastSigTokenType = token.getType();
                    System.out.println("token = " + token);
                }
            }
        }
        

        and now it yields the tokens correctly as:

        token = ["println",<84>,line=1,col=1,lineLast=1,colLast=8]
        token = ["4",<194>,line=1,col=9,lineLast=1,colLast=10]
        token = ["/",<186>,line=1,col=11,lineLast=1,colLast=12]
        token = ["2",<194>,line=1,col=13,lineLast=1,colLast=14]
        token = ["+",<144>,line=1,col=15,lineLast=1,colLast=16]
        token = ["3",<194>,line=1,col=17,lineLast=1,colLast=18]
        

        Hope it helps.
        Roshan
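
The behaviour described in this comment can be illustrated with a toy tokenizer. This is only a sketch under stated assumptions: the class SlashDemo, the Kind enum, and the rule "a '/' after a number or identifier is division" are all hypothetical simplifications, not GroovyLexer's actual code or rule. It shows why a lexer that is never told the previous token type reads "/" as a regex opener and then fails to find the closing "/".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer; NOT the real GroovyLexer.
class SlashDemo {
    enum Kind { IDENT, NUMBER, OP, REGEX }

    // Tokenizes src. When trackLast is false the lexer never records the
    // previous token kind, mimicking a caller that does not set it.
    public static List<String> tokenize(String src, boolean trackLast) {
        List<String> out = new ArrayList<>();
        Kind last = null; // kind of the last significant token, if known
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            Kind kind;
            if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                kind = Kind.NUMBER;
            } else if (Character.isLetter(c)) {
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                kind = Kind.IDENT;
            } else if (c == '/' && last != Kind.NUMBER && last != Kind.IDENT) {
                // Assumed rule: with no operand known to precede it, '/' opens a regex.
                int close = src.indexOf('/', i + 1);
                if (close < 0) {
                    throw new IllegalStateException(
                        "unterminated regex starting at column " + (i + 1));
                }
                i = close + 1;
                kind = Kind.REGEX;
            } else {
                i++; // any other single character is treated as an operator
                kind = Kind.OP;
            }
            out.add(kind + ":" + src.substring(start, i));
            if (trackLast) last = kind;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("println 4 / 2 + 3", true));
        System.out.println(tokenize("println 4 / 2 + 3 /", false));
        try {
            tokenize("println 4 / 2 + 3", false);
        } catch (IllegalStateException e) {
            System.out.println("without tracking: " + e.getMessage());
        }
    }
}
```

With tracking enabled, the "/" comes out as an operator token. Without it, the same input fails at column 11, and appending a trailing "/" turns "/ 2 + 3 /" into a single regex token, which parallels the two outputs reported above.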

        blackdrag made changes -
        Status Open [ 1 ] Closed [ 6 ]
        Assignee Jochen Theodorou [ blackdrag ] Roshan Dawrani [ roshandawrani ]
        Resolution Fixed [ 1 ]

          People

          • Assignee:
            Roshan Dawrani
            Reporter:
            Martin Adamek
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: