ANTLR4 - How to define a token which is all those characters in set A, except those in sub-set B?
In RFC 2616 (HTTP/1.1), the definition of 'token' in section 2.2, 'Basic Rules', is given as:
token = 1*<any CHAR except CTLs or separators>
From that section, I've got the following fragments, and I want to use them to define 'token':
lexer grammar AcceptEncoding;

TOKEN: /* (CHAR excluding (CTRL | SEPARATORS)) */

fragment CHAR: [\u0000-\u007f];
fragment CTRL: [\u0000-\u001f] | '\u007f';
fragment SEPARATORS: [()<>@,;:\\;"/\[\]?={|}] | SP | HT;
fragment SP: ' ';
fragment HT: '\t';
How can I approximate the hypothetical 'excluding' operator in the definition of token?
There is no set/range math in ANTLR. You can only combine several sets/ranges via the OR operator. A typical rule for a number of disjoint ranges looks like this:
fragment LETTER_WHEN_UNQUOTED: '0'..'9' | 'a'..'z' | '$' | '_' | '\u0080'..'\uffff';
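
Applied to the question, the same approach means working out by hand which characters are left once the CTLs and separators are removed from \u0000-\u007f, and writing those out as disjoint ranges. The following is only a sketch under a few assumptions: the fragment name TOKEN_CHAR is made up for illustration, and the separator list is taken from RFC 2616 section 2.2, where '|' is not a separator (the character set in the question does include it, so drop the '|' alternative if that is what you want).

```antlr
lexer grammar AcceptEncoding;

// token = 1*<any CHAR except CTLs or separators>
// One or more of the characters that survive the exclusion.
TOKEN: TOKEN_CHAR+;

// Hypothetical helper fragment: the printable US-ASCII characters
// \u0021-\u007e minus the RFC 2616 separators, as explicit ranges.
fragment TOKEN_CHAR
    : '!'
    | '\u0023'..'\u0027'    // # $ % & '
    | '*' | '+'
    | '-' | '.'
    | '0'..'9'
    | 'A'..'Z'
    | '\u005e'..'\u007a'    // ^ _ ` and a-z
    | '|'
    | '~'
    ;
```

The ranges skip exactly the separators that sit between them: '"' after '!', '(' and ')' before '*', ',' before '-', '/' before the digits, ':' through '@' before the letters, '[', '\' and ']' before '^', and '{' and '}' on either side of '|'. The control characters, space and tab fall outside every range, so they are excluded without needing their own fragments.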