boost regular expression library

The TextTransformer uses the Boost regular expression library (Boost.Regex) by Dr John Maddock, which is part of the Boost libraries at http://www.boost.org. The syntax of the expressions is the one described there as POSIX-Extended Regular Expression Syntax.

Boost regular expressions can be modified by flags. With two exceptions, the default flags (= extended) are kept:

 

const tt_syntax_option_type _boost_regex_normal =

     boost::regex_constants::extended

     & ~boost::regex_constants::no_escape_in_lists

     & ~boost::regex_constants::collate;

 

 

The flag "no_escape_in_lists" is cleared. The backslash therefore acts as an escape character inside the definition of a character set, so a literal backslash has to be written there as a double backslash.

 

The flag "collate" is cleared, so that character ranges in a character set follow the order in which the characters are listed in the ANSI table, rather than a locale-specific collation order.

 

The flag "extended" specifies that the grammar recognized by the regular expression engine is the same as that used by POSIX extended regular expressions in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX), Base Definitions and Headers, Section 9, Regular Expressions (FWD.1).

 

When the expression is compiled as a POSIX-compatible regex, the matching algorithms find the first possible match, that is, the one that starts leftmost in the text. If more than one string starting at that location can match, the longest possible string is chosen.
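
The leftmost-longest rule can be illustrated with a small sketch. It uses std::regex from the C++ standard library as a stand-in, so that it compiles without Boost. Its "extended" grammar is assumed here to follow the same POSIX rule as Boost.Regex with the flags shown above. The helper names first_match, posix_choice and perl_style_choice are hypothetical and are not part of the TextTransformer or of Boost.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text of the first match of `pattern`
// in `text` under the given grammar, or an empty string if nothing matches.
std::string first_match(const std::string& pattern,
                        const std::string& text,
                        std::regex::flag_type grammar) {
    const std::regex re(pattern, grammar);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}

// Both alternatives of "if|ifdef" can match at the same position.
// POSIX extended grammar: the longest candidate is expected to be chosen.
inline std::string posix_choice() {
    return first_match("if|ifdef", "#ifdef DEBUG", std::regex::extended);
}

// Perl-style (ECMAScript) grammar, shown only for contrast:
// the first alternative that succeeds is taken.
inline std::string perl_style_choice() {
    return first_match("if|ifdef", "#ifdef DEBUG", std::regex::ECMAScript);
}
```

Under the POSIX grammar the order of the alternatives should not matter: "ifdef|if" and "if|ifdef" select the same text, which is the property a token scanner relies on.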

 

This matching algorithm is essential for the TextTransformer. Unfortunately, this means that non-greedy repeats and look-ahead assertions are not supported. Backreferences are not explicitly disabled, but they normally do not work correctly, because the numbering of subexpressions is shifted in SKIP expressions. Backreferences should therefore be avoided.
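
Where a non-greedy repeat would otherwise be used, a common substitute is a negated character set. This is a general regular expression technique, not something taken from the TextTransformer documentation. The sketch below again uses std::regex with the "extended" grammar as a stand-in for Boost.Regex. The helper names first_extended_match, greedy_quoted and shortest_quoted are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: first match of a POSIX extended pattern in `text`,
// or an empty string if there is none.
std::string first_extended_match(const std::string& pattern,
                                 const std::string& text) {
    const std::regex re(pattern, std::regex::extended);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}

// Pattern ".*" between two quotes: the greedy repeat runs on to the
// last quote in the text.
inline std::string greedy_quoted(const std::string& text) {
    return first_extended_match("\".*\"", text);
}

// Pattern "[^"]*" between two quotes: the repeated set excludes the quote,
// so the match has to end at the nearest closing quote.
inline std::string shortest_quoted(const std::string& text) {
    return first_extended_match("\"[^\"]*\"", text);
}
```

For a text with two quoted strings, greedy_quoted spans from the first quote to the last one, while shortest_quoted stops after the first quoted string, which is the result a non-greedy repeat would normally be used for.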

 

 



This page belongs to the TextTransformer Documentation
