Lex (software)

Source: Wikipedia, the free encyclopedia.
Lex
Original author(s)Mike Lesk, Eric Schmidt
Initial release1975; 49 years ago (1975)
Repository
Written in
Cross-platform
TypeCommand
LicensePlan 9: MIT License

Lex is a

parser generator and is the standard lexical analyzer generator on many Unix and Unix-like systems. An equivalent tool is specified as part of the POSIX standard.[3]

Lex reads an input stream specifying the lexical analyzer and writes source code which implements the lexical analyzer in the C programming language.

In addition to C, some old versions of Lex could generate a lexer in Ratfor.[4]

History

Lex was originally written by Mike Lesk and Eric Schmidt[5] and described in 1975.[6][7] In the following years, Lex became standard lexical analyzer generator on many

Bell Laboratories license.[8]
Although originally distributed as proprietary software, some versions of Lex are now
flex
, or the "fast lexical analyzer", is not derived from proprietary coding.

Structure of a Lex file

The structure of a Lex file is intentionally similar to that of a yacc file: files are divided into three sections, separated by lines that contain only two percent signs, as follows:

Example of a Lex file

The following is an example Lex file for the

flex
version of Lex. It recognizes strings of numbers (positive integers) in the input, and simply prints them out.

/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include <stdio.h>
%}

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.|\n    {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}

If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input:

abc123z.!&*2gj6

the program will print:

Saw an integer: 123
Saw an integer: 2
Saw an integer: 6

Using Lex with other programming tools

Using Lex with parser generators

Lex, as with other lexical analyzers, limits rules to those which can be described by

Bison. Parser generators use a formal grammar
to parse an input stream.

It is typically preferable to have a parser, one generated by Yacc for instance, accept a stream of tokens (a "token-stream") as input, rather than having to process a stream of characters (a "character-stream") directly. Lex is often used to produce such a token-stream.

Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer.

Lex and make

make is a utility that can be used to maintain programs involving Lex. Make assumes that a file that has an extension of .l is a Lex source file. The make internal macro LFLAGS can be used to specify Lex options to be invoked automatically by make.[9]

See also

References

  1. .
  2. .
  3. ^ The Open Group Base Specifications Issue 7, 2018 edition § Shell & Utilities § Utilities § lex
  4. .
  5. ^ Lesk, M.E.; Schmidt, E. "Lex – A Lexical Analyzer Generator". Archived from the original on 2012-07-28. Retrieved August 16, 2010.
  6. ^ Lesk, M.E.; Schmidt, E. (July 21, 1975). "Lex – A Lexical Analyzer Generator" (PDF). UNIX TIME-SHARING SYSTEM:UNIX PROGRAMMER’S MANUAL, Seventh Edition, Volume 2B. bell-labs.com. Retrieved Dec 20, 2011.
  7. ^ Lesk, M.E. (October 1975). "Lex – A Lexical Analyzer Generator". Comp. Sci. Tech. Rep. No. 39. Murray Hill, New Jersey: Bell Laboratories.
  8. ^ The Insider's Guide To The Universe (PDF). Charles River Data Systems, Inc. 1983. p. 13.
  9. ^ "make". The Open Group Base Specifications (6). The IEEE and The Open Group. 2004. IEEE Std 1003.1, 2004 Edition.

External links