Line breaks

Line breaks in text files are represented by special control characters. On Windows operating systems, a line break is the combination of a carriage return character '\r' followed by a linefeed character '\n'. If, for example, the individual characters of the three-line text

 

1. Line
2. Line
3. Line

 

are listed in a table, the table looks like this:

 

1   .   ' '   L   i   n   e   \r   \n
2   .   ' '   L   i   n   e   \r   \n
3   .   ' '   L   i   n   e

 

On UNIX operating systems, a single linefeed character '\n' is used instead of the '\r\n' combination. The same text is therefore slightly shorter there:

 

1   .   ' '   L   i   n   e   \n
2   .   ' '   L   i   n   e   \n
3   .   ' '   L   i   n   e

 

The TextTransformer editor always uses the Windows convention. When a UNIX text is loaded, a carriage return character is automatically inserted before every linefeed character, and a corresponding warning is shown.
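The conversion described here can be illustrated outside of TextTransformer. The following Python lines are only a sketch, not the editor's actual implementation: the function name to_windows_line_breaks is hypothetical, and it is an assumption that linefeeds which already follow a '\r' are left unchanged.

```python
import re

def to_windows_line_breaks(text: str) -> str:
    # Insert '\r' before each '\n' that is not already preceded by '\r'
    # (assumption: existing '\r\n' pairs are left as they are).
    return re.sub(r"(?<!\r)\n", "\r\n", text)

unix_text = "1. Line\n2. Line\n3. Line"
windows_text = to_windows_line_breaks(unix_text)

print(len(unix_text))      # 23 characters with UNIX line breaks
print(len(windows_text))   # 25 characters with Windows line breaks
print(repr(windows_text))  # '1. Line\r\n2. Line\r\n3. Line'
```

The two extra characters are the two inserted '\r' characters, one for each line break in the three-line text.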

 

In most transformation projects, line breaks are ignored. If, however, they are essential to the structure of the text to be analyzed, you have to take the additional '\r' characters into account. In the extreme case, executing a project can then produce different results depending on whether it runs in the TETRA working bench, or in the transformation manager or the command line tool, both of which process the unchanged text.
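Why the additional '\r' characters matter can be seen in a small Python sketch. It is not TextTransformer code. It only assumes, for illustration, a processing step that separates lines at '\n' alone:

```python
windows_text = "1. Line\r\n2. Line\r\n3. Line"
unix_text = "1. Line\n2. Line\n3. Line"

# Splitting at '\n' only leaves the '\r' attached to the Windows lines.
windows_lines = windows_text.split("\n")
unix_lines = unix_text.split("\n")

print(windows_lines)                # ['1. Line\r', '2. Line\r', '3. Line']
print(unix_lines)                   # ['1. Line', '2. Line', '3. Line']
print(windows_lines == unix_lines)  # False
```

The same input text therefore yields different lines, depending on whether the '\r' characters were added before processing.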

 

 

On both Windows and UNIX, line breaks are recognized by the following token (provided that line breaks are not ignored):

 

 

EOL  ::= \r?\n
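The effect of the optional '\r' in this expression can be checked with any regular expression engine. The following sketch uses Python's re module instead of a TextTransformer token; the variable names are chosen only for illustration.

```python
import re

# Same pattern as the EOL token: an optional '\r' followed by '\n'.
EOL = re.compile(r"\r?\n")

windows_text = "1. Line\r\n2. Line\r\n3. Line"
unix_text = "1. Line\n2. Line\n3. Line"

print(EOL.findall(windows_text))  # ['\r\n', '\r\n']
print(EOL.findall(unix_text))     # ['\n', '\n']

# Split at EOL, both variants of the text give the same lines.
print(EOL.split(windows_text) == EOL.split(unix_text))  # True
```

Because the '\r' is consumed as part of the line break, it does not end up in the text of the lines.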

 



This page belongs to the TextTransformer Documentation
