Syntax tree


 

Patterns of tokens can be described by rules similar to those that describe regular expressions. For example, a main clause consists of a sequence of words followed by a dot, and a table consists of a header with the column names followed by a sequence of rows.
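The similarity to regular expressions can be illustrated with a minimal Python sketch. It is not TextTransformer syntax. The names token_kind and is_main_clause are hypothetical, and the sketch assumes that a "sequence of words" means at least one word. Each token is reduced to a one-letter kind, and the rule for a main clause is then written as a regular expression over those kinds rather than over characters.

```python
import re


def token_kind(token):
    # Hypothetical classification: "d" for the dot, "w" for a word,
    # "?" for anything the rule does not know.
    if token == ".":
        return "d"
    if token.isalpha():
        return "w"
    return "?"


def is_main_clause(tokens):
    # Rule: one or more words followed by a dot, expressed as a regular
    # expression over token kinds instead of over characters.
    kinds = "".join(token_kind(t) for t in tokens)
    return re.fullmatch(r"w+d", kinds) is not None


ok = is_main_clause(["The", "cat", "sleeps", "."])
missing_dot = is_main_clause(["The", "cat", "sleeps"])
```

Here ok is True and missing_dot is False, because the second token list does not end with a dot.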

 

Syntactic analysis reveals the context of each token within a grammar. This context can be represented as a tree structure.

 

[Figure Energie_en: syntax tree of the example]

 

In this example the tokens are very simple. Each of them is a single character: "E", "m", "*", "c", "^", and "2". In the picture, the tokens, or terminal symbols, are the leaves of the tree, written at the bottom. Graphically, these leaves are characterized by being connected to the tree by only a single line; grammatically, they are indivisible (see the remark below).

This distinguishes them from the other nodes of the tree, which represent so-called non-terminal symbols. Non-terminal symbols can be broken down into terminal symbols. In the graphic, they are the starting points of branches.
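The distinction between leaves and branching nodes can be sketched in a few lines of Python. This is only an illustration: the non-terminal names Formula, Product and Power and the way they nest are assumptions, because the picture is not reproduced here. Only the six tokens are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)

    def is_terminal(self):
        # In this sketch a node without children is a leaf, i.e. a token.
        return not self.children


def leaves(node):
    # Collect the terminal symbols from left to right.
    if node.is_terminal():
        return [node.symbol]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def non_terminals(node):
    # Collect the symbols of all nodes that start a branch.
    if node.is_terminal():
        return []
    result = [node.symbol]
    for child in node.children:
        result.extend(non_terminals(child))
    return result


# Hypothetical nesting; the real tree in the picture may differ.
power = Node("Power", [Node("c"), Node("^"), Node("2")])
product = Node("Product", [Node("m"), Node("*"), power])
formula = Node("Formula", [Node("E"), product])

tokens = leaves(formula)
branches = non_terminals(formula)
```

Reading the leaves from left to right yields the token sequence again, while branches contains only the assumed non-terminal symbols.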

 

In TextTransformer, the syntax tree in the picture would be split into three structures, one for each non-terminal symbol, and would look like this:

 

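[Figure Energie2_en: the syntax tree split into the three structures of the non-terminal symbols, as shown in TextTransformer]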

 

 

Remark: That terminal symbols are grammatically indivisible does not mean that they cannot be divided into characters. For tokens more complex than those used in the example, that will be the case.
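The remark can be illustrated with a small Python sketch. The input "mass*speed^2" and the token pattern are assumptions chosen for illustration and do not come from TextTransformer. The tokenizer returns multi-character tokens that a grammar would treat as indivisible units, although each of them can still be split into characters.

```python
import re

# Hypothetical token definitions: names, numbers, and the two operators.
# Characters that match none of the alternatives are silently skipped.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|[0-9]+|[*^]")


def tokenize(text):
    # Return the tokens in the order in which they appear in the text.
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("mass*speed^2")

# Grammatically "speed" is a single terminal symbol,
# but as a string it still consists of five characters.
characters = list(tokens[2])
```

For the grammar, tokens[2] is one leaf of the tree; the split into characters happens below the level that the grammar describes.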

 



This page belongs to the TextTransformer Documentation
