
BREAK


The BREAK symbol makes it possible to leave a loop, (...)* or (...)+. If the parser encounters the BREAK symbol inside a loop, it leaves the loop and continues parsing with the symbols that follow the loop.




( "a" "b" "c"| "d" | BREAK)+  "e"   


matches the following texts:


"a b c d a b c e"

"d d d e"
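How the loop is left can be illustrated with a small hand-written recognizer. This is only a sketch in Python, not the code that TextTransformer generates; the function name `matches`, the whitespace tokenization, and the `<eof>` sentinel are assumptions made for the illustration.

```python
def matches(text):
    """Recognize ( "a" "b" "c" | "d" | BREAK )+ "e" (illustrative sketch)."""
    # Assumption: tokens are separated by whitespace; "<eof>" is a sentinel
    # so that tokens[pos] is always a valid index.
    tokens = text.split() + ["<eof>"]
    pos = 0
    while True:                                   # the (...)+ loop
        if tokens[pos] == "a":                    # first alternative: "a" "b" "c"
            if tokens[pos:pos + 3] != ["a", "b", "c"]:
                return False
            pos += 3
        elif tokens[pos] == "d":                  # second alternative: "d"
            pos += 1
        else:                                     # BREAK alternative
            break                                 # leave the loop immediately
    # Parsing continues with the symbol that follows the loop.
    return tokens[pos:] == ["e", "<eof>"]
```

Under these assumptions both example texts are accepted, and so is a lone "e", because the BREAK alternative can be taken on the very first pass through the loop.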



(A | BREAK)+  is equivalent to (A)*


Because the loop is left immediately at the BREAK symbol, no other nodes can follow it. To attach an action to the BREAK symbol, write the action in front of BREAK:


( "a" "b" "c"| "d" | {{out << "break";}} BREAK)+  "e"
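The ordering of action and BREAK can be mirrored in an ordinary loop. The following Python sketch is an illustration only; `parse_with_action`, the list `out` (standing in for the output stream), and the `<eof>` sentinel are hypothetical names, not part of TextTransformer.

```python
def parse_with_action(text):
    """Sketch of ( "a" "b" "c" | "d" | {{out << "break";}} BREAK )+ "e"."""
    tokens = text.split() + ["<eof>"]   # assumed whitespace tokenization
    out = []                            # stands in for the `out` stream
    pos = 0
    while True:
        if tokens[pos] == "a":
            if tokens[pos:pos + 3] != ["a", "b", "c"]:
                return False, out
            pos += 3
        elif tokens[pos] == "d":
            pos += 1
        else:
            out.append("break")         # the action in front of BREAK runs first
            break                       # nothing placed after this line would run
    return tokens[pos:] == ["e", "<eof>"], out
```

In this sketch, `parse_with_action("d d d e")` returns `(True, ["break"])`: the action is executed exactly once, at the moment the loop is left.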



The BREAK symbol must be written in the same production that defines the loop it is meant to leave. Outside a loop the BREAK symbol has no meaning. It is therefore not possible to split the example above into two productions:


xxx ::=

( "a" "b" "c"| "d" | Break)+  "e"


Break ::=

{{out << "break";}} BREAK   // wrong


In these respects, the BREAK symbol is used much like "break" in C++. Indeed, in the generated code, BREAK is replaced by "break".


The BREAK symbol makes it possible to analyze structures that would be an unsolvable problem for a normal top-down analysis based on EBNF syntax. For example, a text could be structured like this:


(";" "a" )+ ";" "b"



This production is syntactically valid, but it causes the warning message:


";" is the start and successor of nullable structures


In this case the warning must not be ignored. Parsing the input:


"; a ; b"


leads to an error. After "; a" has been recognized, there is a conflict between continuing with another pass through the loop and continuing with ";" "b". In such cases TextTransformer chooses the first alternative, so "a" is expected rather than "b".
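The conflict can be reproduced with a recognizer that looks one token ahead and always prefers another pass through the loop. This Python sketch is an assumption-laden illustration, not TextTransformer's generated code; `parse_greedy`, `try_parse`, and the `<eof>` sentinel are hypothetical names.

```python
def parse_greedy(text):
    """Sketch of ( ";" "a" )+ ";" "b" where the loop wins every conflict."""
    tokens = text.split() + ["<eof>"]   # assumed whitespace tokenization
    pos = 0

    def expect(symbol):
        nonlocal pos
        if tokens[pos] != symbol:
            raise SyntaxError(f'expected "{symbol}" but found "{tokens[pos]}"')
        pos += 1

    while True:
        expect(";")
        expect("a")
        # ";" may start another pass or the trailing ";" "b";
        # the loop alternative is chosen whenever ";" is seen.
        if tokens[pos] != ";":
            break
    expect(";")
    expect("b")
    expect("<eof>")


def try_parse(text):
    """Return "ok" or the error message of the sketch parser."""
    try:
        parse_greedy(text)
        return "ok"
    except SyntaxError as err:
        return str(err)
```

For the input "; a ; b" the sketch reports `expected "a" but found "b"`, which is the situation described above: the second semicolon is consumed by the loop, so the trailing ";" "b" can no longer be matched.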

With the BREAK symbol, the production can be reformulated as


(";" ( "a" | BREAK ) )+ "b"  


Now the input is recognized correctly. After the second semicolon has been recognized at the beginning of the second pass through the loop, the BREAK alternative is chosen, the loop is left, and the following "b" is recognized.
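The reformulated production can be sketched as a loop in which the decision between "a" and leaving the loop is made after the semicolon has been consumed. Again this is an illustrative Python sketch under assumed whitespace tokenization; `parse_with_break` and the `<eof>` sentinel are hypothetical names.

```python
def parse_with_break(text):
    """Sketch of ( ";" ( "a" | BREAK ) )+ "b"."""
    tokens = text.split() + ["<eof>"]   # assumed whitespace tokenization
    pos = 0
    while True:
        if tokens[pos] != ";":          # every pass starts with ";"
            return False
        pos += 1
        if tokens[pos] == "a":          # alternative "a"
            pos += 1
        else:                           # alternative BREAK
            break
        if tokens[pos] != ";":          # regular end of the (...)+ loop
            break
    # Parsing continues with the symbol that follows the loop.
    return tokens[pos:] == ["b", "<eof>"]
```

With this sketch `parse_with_break("; a ; b")` returns `True`: the semicolon is no longer ambiguous, because the choice is postponed until the token after it is visible.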

You might think that a different reformulation of the first production would have the same result:


(";" ( "a" )? )+ "b" 


However, this rule would also recognize texts that were not originally intended. For example:
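One input that separates the two formulations, worked out here from the productions as written rather than taken from the TextTransformer documentation, is "; ; b". With the optional "a" the loop simply continues after an empty pass, whereas the BREAK formulation leaves the loop after the first lone semicolon and then expects "b". The Python sketch below illustrates the optional variant; `parse_optional` and the `<eof>` sentinel are hypothetical names.

```python
def parse_optional(text):
    """Sketch of ( ";" ( "a" )? )+ "b"."""
    tokens = text.split() + ["<eof>"]   # assumed whitespace tokenization
    pos = 0
    while True:
        if tokens[pos] != ";":          # every pass starts with ";"
            return False
        pos += 1
        if tokens[pos] == "a":          # ( "a" )? : the "a" may be missing
            pos += 1
        if tokens[pos] != ";":          # no further pass possible
            break
    return tokens[pos:] == ["b", "<eof>"]
```

Under these assumptions `parse_optional("; ; b")` and `parse_optional("; a ; ; b")` both return `True`, although neither text consists of "; a" groups followed by a single "; b".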





This page belongs to the TextTransformer Documentation
