The C# grammar in the file sharing section at http://www.antlr.org/ shows a way to handle Unicode identifiers that follows the C# standard more closely, at the cost of about 1000 more lines in the lexer.
It even allows escaped Unicode characters in identifiers, such as \u0000 or \U00000000. The license is BSD.
protected
ID_LETTER : ('_' | 'a'..'z' | 'A'..'Z' | {System.Char.IsLetter(LA(1))}? '\u0080'..'\uFFFE');
Your Boo scripts may need to be saved in a UTF encoding (for example UTF-8).
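To make it easier to see which characters the ID_LETTER rule above lets through, here is a rough Python model of it. This is only a sketch: the function name is_id_letter is made up for illustration, and str.isalpha() is assumed to be a close stand-in for System.Char.IsLetter (both are based on the Unicode letter categories, but the two runtimes may ship different Unicode versions).

```python
def is_id_letter(ch: str) -> bool:
    """Rough model of the ID_LETTER rule for a single character (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # First three alternatives of the rule: '_' | 'a'..'z' | 'A'..'Z'
    if ch == "_" or "a" <= ch <= "z" or "A" <= ch <= "Z":
        return True
    # Last alternative: the range '\u0080'..'\uFFFE', guarded by the letter predicate.
    # str.isalpha() is assumed here to approximate System.Char.IsLetter.
    return "\u0080" <= ch <= "\uFFFE" and ch.isalpha()
```

Under these assumptions, is_id_letter("é") and is_id_letter("π") are accepted, while a digit such as "1" or a symbol such as "×" is rejected. Characters above \uFFFE are rejected by the range check even when they are letters, which is one way the rule is looser than the C# standard's identifier definition.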