ANTLR4 - How to define a token which is all those characters in set A, except those in sub-set B?


In RFC 2616 (HTTP/1.1), the definition of 'token' in section '2.2 Basic Rules' is given as:

token          = 1*<any CHAR except CTLs or separators>

From that section, I've got the following fragments, and I want to define 'TOKEN':

lexer grammar AcceptEncoding;

TOKEN: /* (CHAR excluding (CTRL | SEPARATORS)) */

fragment CHAR: [\u0000-\u007f];
fragment CTRL: [\u0000-\u001f] | '\u007f';
fragment SEPARATORS: [()<>@,;:\\"/[\]?={}] | SP | HT;
fragment SP: ' ';
fragment HT: '\t';

How can I approximate the hypothetical 'excluding' operator in the definition of TOKEN?

There is no set/range math in ANTLR. You can only combine several sets/ranges via the OR operator. A typical rule for a number of disjoint ranges looks like this:

fragment LETTER_WHEN_UNQUOTED:
    '0'..'9'
    | 'a'..'z'
    | '$'
    | '_'
    | '\u0080'..'\uffff'
;
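
Applied to the question, this means doing the subtraction by hand and listing whatever is left over. The following is only a sketch, assuming the separator list exactly as printed in RFC 2616 section 2.2 (which does not contain '|'). The fragment name TCHAR is not from the original post; it is borrowed here purely for illustration.

```antlr
// token = 1*<any CHAR except CTLs or separators>
TOKEN: TCHAR+;

// Printable US-ASCII (0x21-0x7E) minus the RFC 2616 separators
//   ( ) < > @ , ; : \ " / [ ] ? = { }
// written out as disjoint ranges and single characters.
// CTLs, SP and HT fall outside 0x21-0x7E, so they are excluded already.
fragment TCHAR
    : '0'..'9'
    | 'a'..'z'
    | 'A'..'Z'
    | '!' | '#' | '$' | '%' | '&' | '\'' | '*' | '+' | '-' | '.'
    | '^' | '_' | '`' | '|' | '~'
    ;
```

With the allowed characters enumerated like this, the CHAR, CTRL and SEPARATORS fragments are no longer needed for TOKEN itself. RFC 7230 section 3.2.6 later spelled out the same set explicitly under the name tchar, which makes a convenient cross-check for the list above.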
