Tokens
In principle, the token specification of Coco/R could be translated automatically into the regular expressions of the TextTransformer. This is not done here, because it would take considerable effort, in particular because character sets are defined in a different manner. Furthermore, the token definitions are only a small part of a translation project. Instead, the token definitions of the Coco/R compiler description are translated here directly, by hand.
In Coco/R, the character sets are defined first and then used in an EBNF definition of the tokens. This two-step procedure could also be applied in the TextTransformer, but normally tokens are defined together with their character sets. Several predefined character sets that do not exist in Coco/R can be used for this.
The corresponding lines from Cr_17.atg are:
CHARACTERS
  letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_" .
  digit    = "0123456789" .
  cntl     = CHR(0)..CHR(31) .
  tab      = CHR(9) .
  eol      = CHR(13) .
  lf       = CHR(10) .
  back     = CHR(92) .
  noQuote1 = ANY - '"' - cntl - back .
  noQuote2 = ANY - "'" - cntl - back .
  graphic  = ANY - cntl .

TOKENS
  ident     = letter {letter | digit} .
  str       = '"' {noQuote1 | back graphic } '"'
            | "'" {noQuote2 | back graphic } "'" .
  badstring = '"' {noQuote1 | back graphic } ( eol | lf )
            | "'" {noQuote2 | back graphic } ( eol | lf ) .
  number    = digit {digit} .
In the TextTransformer these character sets can be expressed as follows:
  noQuote1 = [^\"[:cntrl:]\\]
  noQuote2 = [^'[:cntrl:]\\]
  graphic  = [^[:cntrl:]]
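The correspondence between Coco/R's set subtraction and a negated character class can be checked with a small sketch. It is written in Python rather than in the TextTransformer, and it rests on two assumptions that are not part of the original project: ANY is taken to be the 8-bit range CHR(0)..CHR(255), and because Python's re module has no [:cntrl:] class, the range \x00-\x1f stands in for Coco/R's cntl. Whether [:cntrl:] in a given regex engine also covers code 127 depends on the engine and locale, so this illustrates the idea and does not prove exact equivalence.

```python
import re

# Assumption: ANY is the 8-bit character range CHR(0)..CHR(255).
ANY = {chr(c) for c in range(256)}
cntl = {chr(c) for c in range(32)}   # CHR(0)..CHR(31)
back = {chr(92)}                     # the backslash

# Coco/R style: character sets built by subtraction.
no_quote1 = ANY - {'"'} - cntl - back
no_quote2 = ANY - {"'"} - cntl - back
graphic = ANY - cntl

# Regex style: negated character classes.
# \x00-\x1f is a stand-in for [:cntrl:], which Python's re does not support.
NO_QUOTE1_RE = re.compile(r'[^"\x00-\x1f\\]')
NO_QUOTE2_RE = re.compile(r"[^'\x00-\x1f\\]")
GRAPHIC_RE = re.compile(r"[^\x00-\x1f]")


def matched_chars(pattern):
    """Return every character of ANY that the one-character pattern accepts."""
    return {ch for ch in ANY if pattern.fullmatch(ch)}


if __name__ == "__main__":
    print(matched_chars(NO_QUOTE1_RE) == no_quote1)
    print(matched_chars(NO_QUOTE2_RE) == no_quote2)
    print(matched_chars(GRAPHIC_RE) == graphic)
```

Each subtraction in the Coco/R definition becomes one entry inside the negated class, which is why the regex form needs no separate, named character sets.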
Thus the tokens are:
  IDENT     ::= [[:alpha:]_]\w*
  STRING    ::= \"([^\"[:cntrl:]\\]|\\[^[:cntrl:]])*\" \
                |'([^'[:cntrl:]\\]|\\[^[:cntrl:]])*'
  BADSTRING ::= \"([^\"[:cntrl:]\\]|\\[^[:cntrl:]])*(\r|\n) \
                |'([^'[:cntrl:]\\]|\\[^[:cntrl:]])*(\r|\n)
  NUMBER    ::= \d+
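To see how these four token expressions behave on sample input, they can be approximated in Python. This is only a sketch under stated assumptions: Python's re module does not know [:alpha:] or [:cntrl:], so [A-Za-z] and \x00-\x1f are used in their place, and the helper token_kind is a hypothetical name introduced for illustration, not a TextTransformer function.

```python
import re

# Stand-in for [:cntrl:], limited to Coco/R's cntl range CHR(0)..CHR(31).
CNTL = r"\x00-\x1f"

# re.ASCII keeps \w and \d to ASCII, matching the Coco/R letter and digit sets.
IDENT = re.compile(r"[A-Za-z_]\w*", re.ASCII)
NUMBER = re.compile(r"\d+", re.ASCII)

# A string is a quote, then ordinary characters or backslash escapes, then the same quote.
STRING = re.compile(
    rf'"(?:[^"{CNTL}\\]|\\[^{CNTL}])*"'
    rf"|'(?:[^'{CNTL}\\]|\\[^{CNTL}])*'"
)

# A bad string starts like a string but runs into a line end before the closing quote.
BADSTRING = re.compile(
    rf'"(?:[^"{CNTL}\\]|\\[^{CNTL}])*(?:\r|\n)'
    rf"|'(?:[^'{CNTL}\\]|\\[^{CNTL}])*(?:\r|\n)"
)

TOKENS = [("IDENT", IDENT), ("NUMBER", NUMBER),
          ("STRING", STRING), ("BADSTRING", BADSTRING)]


def token_kind(text):
    """Return the name of the first token whose pattern matches all of text, or None."""
    for name, pattern in TOKENS:
        if pattern.fullmatch(text):
            return name
    return None


if __name__ == "__main__":
    for sample in ["ident_1", "42", '"a\\"b"', "'it'", '"open\n', "1abc"]:
        print(repr(sample), token_kind(sample))
```

An escaped quote such as \" inside a double-quoted string is accepted because the backslash alternative consumes the backslash together with the following graphic character, exactly as the pair back graphic does in the Coco/R definition.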
This page belongs to the TextTransformer Documentation