Tokens



 

To translate unreadable characters into readable ones, they first have to be found in the text. The text elements (tokens) that the translation program has to search for are defined on the second page of the TETRA program. If you click the Tokens tab, a list of the names of the defined terminal symbols is shown on the left side of the page.

 

 

Atari_Token_en

 

 

If a name in the list is selected, the definition of the corresponding symbol appears in the text window. With one exception, all definitions of the Atari project look alike: a backslash '\' followed by an 'x' and two hexadecimal digits. If, for example, the symbol ue is selected, the following expression appears in the text window: \x81. This expression is the code of the character in hexadecimal notation, as it results when the text is read with the ANSI character set. Instead of this expression, the character itself could have been written; it is displayed as an empty square at each place in the text that shall be replaced by a 'ü'. However, the token text window would then look the same for all umlauts.
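The effect of such a definition can be tried outside TETRA. The following is a minimal sketch in Python using the standard re module; it is not TETRA syntax or a TETRA API. It assumes, as described above, that the byte 0x81 marks a place where a 'ü' belongs. The names ue_token and replace_ue and the sample bytes are invented for illustration.

```python
import re

# Hypothetical stand-in for the token "ue": the single byte 0x81.
ue_token = re.compile(rb"\x81")


def replace_ue(raw: bytes) -> str:
    """Replace every 0x81 byte by 'ü' and return the readable text."""
    converted = ue_token.sub("ü".encode("utf-8"), raw)
    return converted.decode("utf-8")


sample = b"M\x81nchen"  # invented sample text
print(replace_ue(sample))
```

Working on bytes rather than on decoded strings avoids having to pick a codec for the unreadable characters before they are replaced.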

 

The expression \x81 can be found easily if you go back to the editor on the TETRA working page. There, place the text cursor before the unknown character and click the right mouse button. A popup menu appears, in which you can select: Show hexadecimal character.

 

    HexCodeMenu_en

 

A dialog appears that shows the hexadecimal expression. The expression is copied to the clipboard automatically, so it can be pasted into the definition of a symbol.
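The same kind of expression can be derived by hand from the byte value of a character. The following Python sketch only illustrates the notation; the function name hex_expression and the sample bytes are assumptions, and nothing here reflects how TETRA builds its dialog.

```python
def hex_expression(data: bytes, position: int) -> str:
    """Return the byte at `position` written as a \\xHH expression."""
    return f"\\x{data[position]:02X}"


sample = b"M\x81nchen"  # invented sample; the unknown character is at index 1
print(hex_expression(sample, 1))  # prints \x81
```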

 

 

  HexCodeDialog_en

 

 

 

One of the symbol definitions is much more complicated than the others:

 

normal_text =

[^\x11\x12\x15\x16\x17\x18\x81\x84\x94\x99\x9E\x8E\x9A\x9C]+

 

This expression defines the text sections that contain neither special characters nor text attributes. The square brackets define a set of characters. Inside the brackets, the negation symbol '^' is followed by a list of all hexadecimal expressions that are used in the other token definitions. The set therefore consists of all characters that are neither special characters nor text attributes, e.g. letters of the alphabet or punctuation marks. The plus sign following the closing bracket indicates that at least one character of the set has to occur; a sequence of such characters of arbitrary length is also allowed.

normal_text also includes line breaks, tabs and blanks. By default, these characters are ignored in TETRA projects. For the Atari project this default setting was changed: in the menu Options->Project options, all ignorable characters are disabled.
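How the negated set divides a text into pieces can be sketched in Python. This is an illustration with the standard re module under stated assumptions, not TETRA's tokenizer: the byte list is taken from the normal_text definition above, while the names SPECIAL, special and split_tokens and the sample text are invented. Note that the normal_text piece keeps blanks and the line break, which matches the project setting described above.

```python
import re

# Byte values used by the other token definitions (copied from normal_text).
SPECIAL = rb"\x11\x12\x15\x16\x17\x18\x81\x84\x94\x99\x9E\x8E\x9A\x9C"

# normal_text: one or more bytes that are NOT in the special set.
normal_text = re.compile(rb"[^" + SPECIAL + rb"]+")
# Hypothetical catch-all for any single special byte.
special = re.compile(rb"[" + SPECIAL + rb"]")


def split_tokens(raw: bytes):
    """Split raw bytes into ('normal_text', ...) and ('special', ...) pieces."""
    tokens = []
    pos = 0
    while pos < len(raw):
        match = normal_text.match(raw, pos)
        kind = "normal_text"
        if match is None:
            # Every byte is either in the set or not, so this always matches.
            match = special.match(raw, pos)
            kind = "special"
        tokens.append((kind, match.group()))
        pos = match.end()
    return tokens


sample = b"Gr\x81n und\n blau"  # invented sample text
for kind, value in split_tokens(sample):
    print(kind, value)
```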

 

   AtariOptions_en

 

 



This page belongs to the TextTransformer Documentation
