Alphabet (formal languages)

Source: Wikipedia, the free encyclopedia.

In

countable
(e.g., ), or even
uncountable
(e.g., ).

Infinite sequences of symbols may be considered as well (see Omega language
).

It is often necessary for practical purposes to restrict the symbols in an alphabet so that they are unambiguous when interpreted. For instance, if the two-member alphabet is {00,0}, a string written on paper as "000" is ambiguous because it is unclear if it is a sequence of three "0" symbols, a "00" followed by a "0", or a "0" followed by a "00".

Notation

By definition, the alphabet of a formal language over is the set , which can be any non-empty set of symbols from which every string in is built. For example, the set can be the alphabet of the formal language that means "all variable identifiers in the C programming language". Notice that it is not required to use every symbol in the alphabet of for its strings.

Given an alphabet , the set of all strings of length over the alphabet is indicated by . The set of all finite strings (regardless of their length) is indicated by the Kleene star operator as , and is also called the Kleene closure of . The notation indicates the set of all infinite sequences over the alphabet , and indicates the set of all finite or infinite sequences.

For example, using the binary alphabet {0,1}, the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the empty string).

Applications

Alphabets are important in the use of

formal languages, automata and semiautomata. In most cases, for defining instances of automata, such as deterministic finite automata (DFAs), it is required to specify an alphabet from which the input strings for the automaton are built. In these applications, an alphabet is usually required to be a finite set
, but is not otherwise restricted.

When using automata,

character set
of the text to be processed by these algorithms, or a subset of allowable characters from the character set.

See also

References

Literature