Friedl scheme

Top  Previous  Next

Glossary > Friedl scheme

 

The Friedl scheme is a pattern, which J.E.F. Friedl formulates in his book: Regular Expressions, to avoid eternal matching.

 

Friedl mentions as an example, which is an extremely unhappy case of eternal matching:

 

"(\\.|[^"\\\r\n]+)*"

 

This expression defines a string as consisting of the outer double quotes, and an inner repetition. The repetition consists in an alternative of a backslash followed by an arbitrary character (to assure, that the backslash not stands immediately before the closing double quote) and a sequence of characters, which are neither line breaking characters nor a double quote (which would finish the string). 

The problem with this expression is, that sequences of characters before a backslash can be interpreted in different manners and that (with a POSIX-NFA regex, as is used in the TextTransformer) all permutations will be tested. Because of the nested repeat-operators '*' and '+', there are extremely much possibilities (three characters are a sequence of one and two characters or a sequence of two and one character). Such a testing (backtracking) leads to a very slow recognition of the whole text. Friedl gives an example, where the evaluation of such an expression exceeds the lifetime of the programmer.

 

The eternal matching can be avoided, if the expression above is reformulated as:

 

"([^"\\\r\n]*(\\.[^"\\\r\n]*)*)"

 

The new expression is subject to the scheme:

 

opening normal * ( special normal* )* closing

where the beginning of special and normal may not be the same

 

normal forbids a backslash: [^"\\\r\n]

and special demands it: \\.

( opening == closing == " )

 

At every location of the text there is a clear alternative now. Indeed, the Friedl scheme is an LL(1)-optimization.

 



This page belongs to the TextTransformer Documentation

Home  Content  German