Literals

Every special word, every number, and in general every part of a text can be regarded as an individual literal token. A literal is simply a specific sequence of characters.

For example, the word "TETRA" is a 'T' followed by an 'E', a 'T', an 'R' and an 'A'.

Because literals are so simple and so important for syntactic analysis, they do not have to be defined separately on the token page. You can define them directly inside a production. There are two ways to do this:

1.	by inclusion of the text in quotation marks, e.g. "TETRA"

2.	by putting an underscore in front of the text, e.g. _TETRA

In the second case, a named literal token is created and automatically added to the token page. The particular advantages of named literal tokens are discussed separately. Simple literal tokens usually suffice. For example, a rule to parse a salutation could look like this:

( "Mr" | "Mrs" ) name

Here "Mr" and "Mrs" stand for themselves, while name could refer to another regular expression or to a production.
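The behaviour of such a rule can be imitated outside TextTransformer. The following Python sketch is only an illustration, not TextTransformer code: the NAME pattern (a single capitalized word) and the function parse_salutation are assumptions made up for this example, since the page does not say how name is defined.

```python
import re

# Hypothetical stand-in for the "name" token: one capitalized word.
NAME = r"[A-Z][a-z]+"

# The literals are passed through re.escape so that every character
# stands for itself, as it does inside a quoted literal token.
SALUTATION = re.compile(
    r"(?:%s|%s)\s+(%s)" % (re.escape("Mrs"), re.escape("Mr"), NAME)
)


def parse_salutation(text):
    """Return the name following "Mr" or "Mrs", or None if there is no match."""
    match = SALUTATION.fullmatch(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in ("Mr Smith", "Mrs Jones", "Dr Who"):
        print(sample, "->", parse_salutation(sample))
```

The two alternatives play the role of the literal tokens "Mr" and "Mrs"; anything that starts with a different word is rejected.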

Inside such a token, each character stands for itself. This is not generally true of regular expressions. For example, the smiley

";-)"

defined in the syntax of regular expressions looks like:

";\-\)"

Here the hyphen and the parenthesis have a meta meaning, so they must be preceded by a backslash to regain their original meaning.
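The same difference between a literal and a regular expression can be observed in other regular expression engines. The sketch below uses Python's re module purely as an illustration; its dialect is not identical to the one used by TextTransformer (in Python, for instance, a hyphen is only special inside a character class), and the helper is_valid_pattern is a name invented for this example.

```python
import re

SMILEY = ";-)"

# Written by hand with backslashes, as in the regular expression form above.
escaped_by_hand = r";\-\)"

# re.escape inserts the backslashes that Python's dialect requires.
escaped_by_library = re.escape(SMILEY)


def is_valid_pattern(pattern):
    """Return True if the string compiles as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    # The unescaped smiley is rejected because of the lone ")".
    print(is_valid_pattern(SMILEY))
    # Both escaped forms match the literal text.
    print(bool(re.fullmatch(escaped_by_hand, SMILEY)))
    print(bool(re.fullmatch(escaped_by_library, SMILEY)))
```

Used as a pattern, the unescaped text is not even accepted, while the escaped form matches exactly the three characters of the smiley.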

Some characters that are hard to represent otherwise, such as line breaks, can also be written as escape sequences within the definition of literal tokens.

This page belongs to the TextTransformer Documentation
