Escape sequence
In computer science, an escape sequence is a combination of characters that has a meaning other than the literal characters contained therein;[1] it is marked by one or more preceding (and possibly terminating) characters.[2]
Examples
- In C and many derivative programming languages, a string escape sequence is a series of two or more characters, starting with a backslash \.[3]
- Note that in C a backslash immediately followed by a newline does not constitute an escape sequence, but splices physical source lines into logical ones in the second translation phase, whereas string escape sequences are converted in the fifth translation phase.[4]
- To represent the backslash character itself, \\ can be used, whereby the first backslash indicates an escape and the second specifies that a backslash is being escaped.[5]
- A character may be escaped in multiple different ways. Assuming ASCII encoding, the escape sequences \x5c (hexadecimal), \\, \134 (octal) and \x5C all encode the same character: the backslash \.
- For devices that respond to ANSI escape sequences, the combination of three or more characters beginning with the ASCII "escape" character (decimal character code 27) followed by the left-bracket character [ (decimal character code 91) defines an escape sequence.
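The equivalence of these encodings can be checked directly. The following is a minimal sketch in Python, used here only for illustration because its string literals accept the same backslash, hexadecimal, and octal escape forms as the C sequences listed above; the variable names are illustrative.

```python
# Four spellings of the same one-character string (ASCII backslash, code 92).
spellings = ["\\", "\x5c", "\x5C", "\134"]

# Every spelling decodes to a single character with the same code point.
codes = {ord(s) for s in spellings}
lengths = {len(s) for s in spellings}

# An ANSI-style sequence starts with ESC (27) followed by "[" (91).
ansi_prefix = "\x1b["
prefix_codes = [ord(c) for c in ansi_prefix]

print(codes, lengths, prefix_codes)
```

Each spelling yields a string of length one, so the choice between them is a matter of source readability, not of the resulting data.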
Control sequences
When directed this series of
With the introduction of ANSI terminals, most escape sequences began with the two characters "ESC" then "[", or with a specially allocated CSI character with code 155 (decimal).
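As an illustration of the two introducer forms, the sketch below checks whether a byte string starts with either the two-byte "ESC [" pair or the single byte with code 155. The function name `csi_length` is hypothetical, and the sketch only recognizes the introducer; it does not parse the rest of the sequence.

```python
ESC = 0x1B       # ASCII escape character, decimal 27
CSI_8BIT = 0x9B  # single-byte CSI, decimal 155


def csi_length(data: bytes) -> int:
    """Return the length of a CSI introducer at the start of data, or 0."""
    if data[:2] == bytes([ESC, ord("[")]):
        return 2  # two-character form: ESC then "["
    if data[:1] == bytes([CSI_8BIT]):
        return 1  # one-character form: code 155
    return 0


print(csi_length(b"\x1b[2J"), csi_length(b"\x9b2J"), csi_length(b"hello"))
```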
Not all control sequences used an escape character; for example:
- modem control sequences used by AT/
- Data General terminal control sequences.[8][9][10]
These were often still called escape sequences, and the very common practice of "escaping" special characters in programming languages and command-line parameters today often uses the backslash character to begin the sequence.
Escape sequences in communications are commonly used when a computer and a peripheral have only a single channel through which to send information back and forth (so escape sequences are an example of
Keyboard
An escape character is usually assigned to the Esc key on a computer keyboard, and can be sent in other ways than as part of an escape sequence. For example, the Esc key may be used as an input character in editors such as vi,[13] or for backing up one level in a menu in some applications.[14] The Hewlett Packard HP 2640 terminals had a key for a "display functions" mode which would display graphics for all control characters, including Esc, to aid in debugging applications.
If the Esc key and other keys that send escape sequences are both supposed to be meaningful to an application, an ambiguity arises if a
Escape sequences date back at least to the 1874 Baudot code.[15][16][17]
Modem control
The
The Hayes command set is
Comparison with control characters
A control character is a character that, in isolation, has some control function, such as carriage return (CR). Escape sequences, by contrast, consist of one or more escape characters which change the interpretation of subsequent characters.
ASCII video data terminals
The
The later
Use in DOS and Windows
A utility,
Use in Linux and Unix displays
The default text terminal and text windows (such as xterm) respond to ANSI escape sequences.
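A minimal sketch of what this looks like from a program, assuming a terminal that honors the common "select graphic rendition" sequences. The codes 31 (red foreground) and 0 (reset) are not taken from this article, and the helper name `colorize` is hypothetical.

```python
ESC = "\x1b"  # ASCII escape character, decimal 27


def colorize(text: str, code: int = 31) -> str:
    """Wrap text in 'ESC [ <code> m' and a trailing 'ESC [ 0 m' reset."""
    return f"{ESC}[{code}m{text}{ESC}[0m"


if __name__ == "__main__":
    # On a terminal that interprets ANSI sequences this prints in red;
    # elsewhere the raw escape characters may appear in the output.
    print(colorize("warning"))
```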
Quoting escape
Overview
When an escape character is needed within the quoted/escaped string, there are two strategies used within programming and scripting languages:
- doubled delimiter (e.g. 'He didn''t do it.')[21]
- secondary escape sequence
An example of the latter is the use of the caret (^). For example, the following command outputs "You can do so via Cut&Paste" in CMD (unescaped, the ampersand has a restricted use):[22]
echo You can do so via Cut^&Paste
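Both strategies can be sketched as small string transformations. The helper names `quote_doubled` and `caret_escape` are hypothetical, and the set of characters treated as special is an illustrative subset; real CMD quoting has further cases that this sketch does not attempt to cover.

```python
def quote_doubled(text: str, delimiter: str = "'") -> str:
    """Quote text, doubling any delimiter that appears inside it."""
    return delimiter + text.replace(delimiter, delimiter * 2) + delimiter


# Characters treated as special here: an assumed, incomplete subset.
CMD_SPECIAL = "^&|<>"


def caret_escape(text: str) -> str:
    """Prefix each special character with a caret."""
    return "".join("^" + ch if ch in CMD_SPECIAL else ch for ch in text)


print(quote_doubled("He didn't do it."))
print(caret_escape("Cut&Paste"))
```

The caret itself is included in the special set so that a literal caret survives as two carets, which mirrors the doubling idea used by the first strategy.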
In detail
A common use of escape sequences is in fact to remove control characters found in a binary data stream so that they will not cause their control function by mistake. In this case, the control character is replaced by a defined "escape character" (which need not be the US-ASCII escape character) and one or more other characters; after exiting the context where the control character would have caused an action, the sequence is recognized and replaced by the removed character.[22] To transmit the "escape character" itself, two copies are sent.[21]
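The scheme described above can be sketched as a pair of functions. Everything specific here is an assumption made for illustration: the escape byte `0x7E`, the choice to encode a control byte by adding `0x40` to it, and the names `stuff` and `unstuff`; actual protocols define their own values.

```python
ESC = 0x7E  # hypothetical escape byte ("~"), not the ASCII ESC character


def stuff(data: bytes) -> bytes:
    """Replace control bytes (< 0x20) so none appear in the output."""
    out = bytearray()
    for b in data:
        if b == ESC:
            out += bytes([ESC, ESC])       # escape byte itself: send two copies
        elif b < 0x20:
            out += bytes([ESC, b + 0x40])  # control byte: ESC plus a printable byte
        else:
            out.append(b)
    return bytes(out)


def unstuff(data: bytes) -> bytes:
    """Reverse stuff(), restoring the removed control bytes."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b != ESC:
            out.append(b)
            continue
        nxt = next(it, None)
        if nxt is None:
            raise ValueError("truncated escape sequence")
        out.append(ESC if nxt == ESC else nxt - 0x40)
    return bytes(out)


payload = b"a\x03b~c\r\n"
encoded = stuff(payload)
print(encoded, unstuff(encoded) == payload)
```

Because the encoded control bytes fall in the range `0x40` to `0x5F`, they can never be confused with the doubled escape byte, so decoding is unambiguous.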
In many
Samples
For example, the single quotation mark character might be expressed as '\'' since writing ''' is not acceptable.
Many modern programming languages specify the doublequote character (") as a delimiter for a string literal. The backslash escape character typically provides ways to include doublequotes inside a string literal, such as by modifying the meaning of the doublequote character embedded in the string (\"), or by modifying the meaning of a sequence of characters including the hexadecimal value of a doublequote character (\x22). Both sequences encode a literal doublequote (").
print "Nancy said "Hello World!" to the crowd.";
produces a syntax error, whereas:
print "Nancy said \"Hello World!\" to the crowd."; ### example of \"
produces the intended output. Another alternative:
print "Nancy said \x22Hello World!\x22 to the crowd."; ### example of \x22
uses "\x" to indicate the following two characters are hexadecimal digits, "22" being the ASCII value for a doublequote in hexadecimal.
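The same comparison can be run as a short check. This sketch uses Python, not the language of the examples above, on the assumption that its string literals treat `\"` and `\x22` the same way; the variable names are illustrative.

```python
# Two spellings of the same sentence, one per escape style.
escaped = "Nancy said \"Hello World!\" to the crowd."
hexed = "Nancy said \x22Hello World!\x22 to the crowd."

# Both literals produce identical strings containing two doublequotes.
same = escaped == hexed
quote_count = escaped.count('"')

print(escaped)
```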
C, C++, and Ruby all allow the same two backslash escape styles. Java accepts \" but has no \x escape; an octal escape such as \42 serves the same purpose there. The PostScript language and Microsoft Rich Text Format also use backslash escapes. The quoted-printable encoding uses the equals sign as an escape character.
Another similar (and partially overlapping) syntactic trick is stropping.
Some programming languages also provide other ways to represent special characters in literals, without requiring an escape character (see e.g.
See also
- Control character
- Escape character
- printf format string
- format control string
References
- ^ "Escape Sequence".
- ^ "Characters". The Java Tutorials.
- ^ "Escape Sequences".
Character combinations consisting of a backslash \ followed by a letter or by a combination of digits are called escape sequences.
- ^ "ISO/IEC 9899:201x Committee Draft N1570" (PDF).
5.1.1.2 Translation phases, 2.: Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. [...]
- ^ "Escape sequences". IBM.
- ^ "Chapter 5 – AT Commands" (PDF).
- ^ "AT Command Set and Register Summary for Analog Modem Modules".
- ^ "Data General terminals: discussion of".
- ^ "What's a Terminal?".
- ^ "Data General DG210 DG211 Terminal Emulation Software".
- ^ "Escape sequence".
- ^ "Terminals & Printers Handbook Glossary".
- ^ "Twelve Useful "vi" Commands".
vi commands […] Pressing the Esc (Escape) key is how you […]
- PCworld. 2009-10-29.
- ^ "What is ASCII? The Economist explains". The Economist. 2013-06-09.
- ^ "Baudot and CCITT code".
The Baudot code, invented in 1870 and patented in 1874 by J. Baudot is […]
- ^ "Guide to the use of Character Sets in Europe".
elements C0 and C1 of control characters […] a 5-bit code patented by Jean-Maurice-Emile Baudot (1845-1903) in 1874
- ^ "Basic Hayes AT Command Set". 2011-02-05.
+++ - "Escape Sequence" - This command initiates an escape sequence to return the modem to the on-line command mode
- ^ "Modem Programming Basics".
When a modem is in command mode, the modem can accept commands from you
- ^ 17. Understanding ANSI.SYS - Special Edition Using MS-DOS 6.22.
- ^ a b "Apostrophe Editing ('aaa') (FORTRAN 77 Language Reference)".
Within the field, two consecutive apostrophes […]
- ^ a b "CMD - Batch - Escaping with Caret".