
Literals


Each special word in a text, each number, and in general any part of a text can be treated as an individual literal token. A literal is simply a specific sequence of characters.

For example, the word "TETRA" is a 'T' followed by an 'E', a 'T', an 'R' and an 'A'.


Because literals are so simple and so important for syntactic analysis, they do not have to be defined separately on the token page. You can define them directly inside a production. This is possible in two ways: by enclosing the text in quotation marks, e.g. "TETRA", or by putting an underscore in front of the text, e.g. _TETRA.


In the second case a named literal token is created and added to the token page automatically. The particular advantages of named literal tokens are discussed separately. Simple literal tokens usually suffice. For example, a rule to parse a salutation could look like this:


( "Mr" | "Mrs" ) name


Here "Mr" and "Mrs" stand for themselves, while name could refer to another regular expression or to a production.
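As a rough illustration of the salutation rule, the following Python sketch treats "Mr" and "Mrs" as literals and combines them with a pattern for name. It is not TextTransformer code: the NAME pattern (one capitalised word), the whitespace handling and the function parse_salutation are assumptions made only for this example.

```python
import re

# Literal tokens: every character stands for itself, so each literal is
# escaped before it is embedded in a regular expression.
LITERALS = ["Mr", "Mrs"]

# Hypothetical definition of `name`: a single capitalised word.
NAME = r"[A-Z][a-z]+"

# Corresponds to the rule ( "Mr" | "Mrs" ) name, with whitespace in between.
SALUTATION = re.compile(
    "(" + "|".join(re.escape(lit) for lit in LITERALS) + r")\s+(" + NAME + ")"
)


def parse_salutation(text):
    """Return (title, name) if the whole text matches the rule, else None."""
    match = SALUTATION.fullmatch(text)
    return match.groups() if match else None


print(parse_salutation("Mrs Smith"))
print(parse_salutation("Dr Smith"))
```

The first call yields the pair of title and name; the second yields None, because "Dr" is not one of the two literals.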


Inside such a token each character stands for itself. This is not generally true of regular expressions. For example, the smiley

":-)"




defined in the syntax of regular expressions looks like:

:\-\)




Here the hyphen and the parenthesis have a meta meaning, so each must be preceded by a backslash to regain its original, literal meaning.
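The difference between a literal and a regular expression can be tried out with a small Python sketch. It uses :-) as the sample smiley and Python's re module as a stand-in for TextTransformer's regular-expression syntax, which may differ in detail. In Python, for instance, the hyphen only needs escaping inside a character class, although the escaped form is accepted everywhere.

```python
import re

smiley = ":-)"

# As a literal, every character stands for itself: plain text comparison.
is_literal_match = "Hello :-)".endswith(smiley)

# As a regular expression, the unescaped ')' closes a group that was never
# opened, so the raw text is not a valid pattern in Python.
try:
    re.compile(smiley)
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# With backslashes, the hyphen and the parenthesis are taken literally again.
escaped = re.compile(r":\-\)")
found = escaped.search("Hello :-)")

print(is_literal_match)
print(unescaped_ok)
print(found.group(0))
```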


Some characters that are hard to represent directly, such as line breaks, can also be written as escape sequences within the definition of a literal token.
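Python string literals follow a similar convention, which gives a quick way to see what such an escape sequence stands for. This is only an analogy: the exact set of escape sequences that TextTransformer accepts in literal tokens is not listed on this page, and the sample text "END" is made up.

```python
# Inside the quotes, "\n" is an escape sequence for a single line-break
# character, not a backslash followed by the letter n.
literal = "END\n"

length = len(literal)                      # E, N, D and the line break
ends_with_line_feed = literal == "END" + chr(10)

print(length)
print(ends_with_line_feed)
```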




This page belongs to the TextTransformer Documentation
