

Scripts > Token definitions > Predefined tokens > Identifier


The following predefined tokens can be inserted into a project from a pop-up menu.



ID:        [a-zA-Z_]\w*


An identifier begins with a letter of the alphabet or an underscore, followed by any number of alphanumeric characters or underscores (= \w). It is important that an identifier cannot begin with a digit; otherwise the token would recognize numbers too. In most cases it is advisable to define a separate token for numbers.
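
The behaviour of the ID expression can be tried out outside TextTransformer. The following is a minimal Python sketch, not TextTransformer code: the names `ID`, `LOOSE` and `is_identifier` are made up for illustration, and `re.ASCII` is used on the assumption that \w should mean only letters, digits and the underscore.

```python
import re

# Python transcription of the ID token.
# re.ASCII restricts \w to [a-zA-Z0-9_] (Python's default \w is Unicode-aware).
ID = re.compile(r"[a-zA-Z_]\w*", re.ASCII)

# Hypothetical looser variant without the restriction on the first character.
LOOSE = re.compile(r"\w+", re.ASCII)


def is_identifier(text: str) -> bool:
    """Return True if the whole text is a single ID token."""
    return ID.fullmatch(text) is not None


samples = ["count", "_tmp1", "x", "9lives", "42"]
results = {s: is_identifier(s) for s in samples}
print(results)

# The looser variant accepts a pure number, which is why ID
# demands a letter or an underscore in the first position.
loose_accepts_number = LOOSE.fullmatch("42") is not None
print(loose_accepts_number)
```

The sketch reports "count", "_tmp1" and "x" as identifiers and rejects "9lives" and "42", while the looser variant accepts "42".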







The following are regular expressions for Uniform Resource Identifiers (URIs), for example website addresses. URIs are described in RFC 3986.


The three expressions are variants of an expression given in RFC 3986. Unlike the original, these expressions do not match empty text. A URI has to start with the "scheme" expression described there, i.e. with characters followed by a colon. In addition, a URI has to be delimited from the surrounding text. The three expressions given here differ in how this delimiting is done.
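
The effect of making the scheme mandatory can be checked with a short Python sketch. It compares the expression from Appendix B of RFC 3986 with a variant in which only the optional marker after the scheme group is removed. The name `WITH_SCHEME` and the test strings are illustrative assumptions, and the delimiter handling of the three tokens is left out here.

```python
import re

# Expression from RFC 3986, Appendix B: every part is optional.
RFC3986 = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Same expression with a mandatory scheme (no "?" after the first group).
WITH_SCHEME = re.compile(
    r"(([^:/?#]+):)(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# The RFC expression accepts empty text, the variant does not.
rfc_matches_empty = RFC3986.fullmatch("") is not None
variant_matches_empty = WITH_SCHEME.fullmatch("") is not None

# Without a colon there is no scheme, so a plain word is rejected.
variant_matches_word = WITH_SCHEME.fullmatch("hello") is not None

# A scheme followed by a colon is enough for a match.
variant_matches_uri = WITH_SCHEME.fullmatch("mailto:someone@example.org") is not None

print(rfc_matches_empty, variant_matches_empty,
      variant_matches_word, variant_matches_uri)
```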



URI_WS_DELIM : (([^:/?#]+):)(//([^/?#\\s]*))?([^?#\\s]*)(\\?([^#\\s]*))?(#([^\\s]*))?


This URI is delimited from the rest of the text by white space, so the URI itself may not contain white space. For example, URI_WS_DELIM recognizes the following text:
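
As an illustration outside TextTransformer, the following Python sketch applies the expression at a given position of a text, roughly as a scanner would. Several points are assumptions: the doubled backslashes are read as single escapes (\s, \?), the fragment group is written as [^\s]* to follow the pattern of the other two expressions, the helper `match_at` is hypothetical, and example.org is only a placeholder address.

```python
import re
from typing import Optional

# Python transcription of URI_WS_DELIM (assumption: "\\s" stands for \s,
# and the fragment part excludes white space like the other parts).
URI_WS_DELIM = re.compile(
    r"(([^:/?#]+):)(//([^/?#\s]*))?([^?#\s]*)(\?([^#\s]*))?(#([^\s]*))?"
)


def match_at(text: str, pos: int = 0) -> Optional[str]:
    """Return the URI recognized at position pos, or None if there is none."""
    m = URI_WS_DELIM.match(text, pos)
    return m.group(0) if m else None


# The match stops at the first white space after the URI.
found = match_at("http://example.org/a/b?x=1#top and more text")
print(found)

# Text without a "scheme:" prefix is not recognized.
not_found = match_at("plain words")
print(not_found)
```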



URI_QUOTE_DELIM : "(([^:/?#]+):)(//([^/?#"]*))?([^?#"]*)(\\?([^#"]*))?(#([^"]*))?"


This URI is delimited from the rest of the text by double quotes, so the URI may be written in such a way that it contains white space. For example, URI_QUOTE_DELIM recognizes the following text:





URI_ANGLE_DELIM : <(([^:/?#]+):)(//([^/?#>]*))?([^?#>]*)(\\?([^#>]*))?(#([^>]*))?>


This URI is delimited from the rest of the text by angle brackets, so the URI may be written in such a way that it contains white space. For example, URI_ANGLE_DELIM recognizes the following text:
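
The difference from the white-space variant can be seen in a small Python sketch covering both delimited forms. It is not TextTransformer code: the doubled backslashes are assumed to stand for single escapes, and the sample texts and the example.org address are invented for illustration.

```python
import re

# Python transcriptions of the two delimited tokens (assumption: "\\?" is \?).
URI_QUOTE_DELIM = re.compile(
    r'"(([^:/?#]+):)(//([^/?#"]*))?([^?#"]*)(\?([^#"]*))?(#([^"]*))?"'
)
URI_ANGLE_DELIM = re.compile(
    r"<(([^:/?#]+):)(//([^/?#>]*))?([^?#>]*)(\?([^#>]*))?(#([^>]*))?>"
)

# White space inside the delimiters is accepted as part of the URI.
angle = URI_ANGLE_DELIM.search("See <http://example.org/my docs/index.html#top> here.")
quote = URI_QUOTE_DELIM.search('Open "file:///C:/My Files/a.txt" first.')

angle_text = angle.group(0) if angle else None
quote_text = quote.group(0) if quote else None
print(angle_text)
print(quote_text)

# Group 5 holds the path, including the embedded blank.
angle_path = angle.group(5) if angle else None
quote_path = quote.group(5) if quote else None
print(angle_path, quote_path)
```

Note that the delimiters themselves are part of the recognized text, so they have to be stripped if only the URI is needed.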








For the URI http://www.ics.uci.edu/pub/ietf/uri/#Related, the example used in RFC 3986, the sub-expressions match the following parts:


     $1 = http:

     $2 = http

     $3 = //www.ics.uci.edu

     $4 = www.ics.uci.edu

     $5 = /pub/ietf/uri/

     $6 = <undefined>

     $7 = <undefined>

     $8 = #Related

     $9 = Related


where <undefined> indicates that the component is not present. Therefore, we can determine the value of the five components described in RFC 3986 as


     scheme    = $2

     authority = $4

     path      = $5

     query     = $7

     fragment  = $9
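
The mapping from sub-expressions to components can be reproduced with a short Python sketch. It uses a transcription of URI_WS_DELIM (assuming single escapes and a fragment group of [^\s]*) and the example URI from RFC 3986. The function name `split_uri` is hypothetical, and Python reports a sub-expression that did not take part in the match as None rather than <undefined>.

```python
import re
from typing import Dict, Optional

# Python transcription of URI_WS_DELIM (assumptions as stated in the text).
URI_WS_DELIM = re.compile(
    r"(([^:/?#]+):)(//([^/?#\s]*))?([^?#\s]*)(\?([^#\s]*))?(#([^\s]*))?"
)


def split_uri(uri: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a URI into the five components named in RFC 3986."""
    m = URI_WS_DELIM.fullmatch(uri)
    if m is None:
        return None
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),      # None if the URI has no query
        "fragment": m.group(9),   # None if the URI has no fragment
    }


components = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
print(components)
```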


This page belongs to the TextTransformer Documentation
