Word bounds



 

This option applies only to the recognition of literal tokens, not to regular expressions.

If the word bounds option is activated, a token is recognized only if it begins and ends with a word bound. In most cases a word bound is the transition between an alphanumeric character or underscore (the class \w) and a character that does not belong to this class. More precisely, a word bound exists in any of three cases:

 

The character adjacent to the token is not a member of \w.

The outermost character of the token itself does not belong to \w.

The token is situated at the beginning or the end of the input.
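
The three cases can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not TextTransformer's actual implementation; the function names `bound_before`, `bound_after` and `has_word_bounds` are hypothetical, and \w is assumed to mean ASCII letters, digits and the underscore.

```python
import re

# Assumption: \w covers ASCII letters, digits and the underscore.
WORD_CHAR = re.compile(r"\w", re.ASCII)

def is_word_char(ch: str) -> bool:
    return WORD_CHAR.fullmatch(ch) is not None

def bound_before(text: str, start: int) -> bool:
    """Word bound at the left edge of a non-empty token starting at index start."""
    if start == 0:                          # token at the beginning of the input
        return True
    if not is_word_char(text[start - 1]):   # adjacent character is not in \w
        return True
    return not is_word_char(text[start])    # token's own first character is not in \w

def bound_after(text: str, end: int) -> bool:
    """Word bound at the right edge of a non-empty token ending just before index end."""
    if end == len(text):                    # token at the end of the input
        return True
    if not is_word_char(text[end]):         # adjacent character is not in \w
        return True
    return not is_word_char(text[end - 1])  # token's own last character is not in \w

def has_word_bounds(text: str, start: int, end: int) -> bool:
    """True if the token text[start:end] begins and ends with a word bound."""
    return bound_before(text, start) and bound_after(text, end)

# ":=" is bounded on both sides even between letters, because its own
# outer characters do not belong to \w.
print(has_word_bounds("a:=b", 1, 3))    # True
# "end" inside "endVar" has a bound on the left only.
print(has_word_bounds("endVar", 0, 3))  # False
```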

 

 

Example:

 

In the text:

 

"sindbad the seaman",

 

the following expressions have two word bounds: "sindbad", "the" and "seaman".

Only one word bound is found in: "bad", "sea" and "man".

 

This example shows that deactivating word bounds makes it possible to analyze the internal structure of single literal words.
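
The counts in this example can be reproduced with a short Python sketch. The helper `count_word_bounds` is hypothetical and simplified: it looks only at the first occurrence of each expression and assumes the expression consists solely of \w characters (ASCII letters, digits, underscore), so only the neighbouring characters and the ends of the input need to be checked.

```python
import re

TEXT = "sindbad the seaman"

def is_word_char(ch: str) -> bool:
    # Assumption: \w means ASCII letters, digits and the underscore.
    return re.fullmatch(r"\w", ch, re.ASCII) is not None

def count_word_bounds(text: str, expr: str) -> int:
    """Count the word bounds (0, 1 or 2) around the first occurrence of expr."""
    start = text.find(expr)
    if start < 0:
        raise ValueError(f"{expr!r} does not occur in the text")
    end = start + len(expr)
    left = start == 0 or not is_word_char(text[start - 1])
    right = end == len(text) or not is_word_char(text[end])
    return int(left) + int(right)

for expr in ("sindbad", "the", "seaman", "bad", "sea", "man"):
    print(expr, count_word_bounds(TEXT, expr))
```

The first three expressions yield 2 and the last three yield 1, so only the former would be recognized while the word bounds option is active.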

 

Normally, activating word bounds is recommended, because otherwise there is a considerable risk of false recognitions. For example, if word bounds are deactivated and the token "end" is defined, the beginning of the variable name in the first of the following lines is wrongly recognized as this token:

 

endVar := 10;

end
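
The effect can be imitated with Python's `re` module. This is only an analogy, not how TextTransformer scans its input: the lookarounds `(?<!\w)` and `(?!\w)` stand in for the word bounds option, which is sufficient here because the token "end" consists only of \w characters.

```python
import re

SOURCE = "endVar := 10;\nend"
TOKEN = "end"

# Word bounds deactivated: every occurrence of the literal is found.
without_bounds = [m.start() for m in re.finditer(re.escape(TOKEN), SOURCE)]

# Word bounds activated (approximation): no \w character may precede
# or follow the literal.
bounded = r"(?<!\w)" + re.escape(TOKEN) + r"(?!\w)"
with_bounds = [m.start() for m in re.finditer(bounded, SOURCE)]

print(without_bounds)  # [0, 14]: offset 0 is the false hit inside "endVar"
print(with_bounds)     # [14]: only the standalone "end"
```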

 

 

 

 



This page belongs to the TextTransformer Documentation
