Null character
The null character is a control character with the value zero.
Originally, its meaning was like
A null-terminated string is a commonly used data structure in the C programming language, its many derivative languages and other programming contexts that uses a null character to indicate the end of a string.[6][7] This design allows a string to be any length at the cost of only one extra character of memory. The common competing design for a string stores the length of the string as an integer data type, but this limits the size of the string to the range of the integer (for example, 255 for a byte).
For byte storage, the null character can be called a null byte.
Representation
Since the null character is not a printable character, representing it requires special notation in source code.
In a string literal, the null character is often represented as the escape sequence \0 (for example, "abc\0def"). Similar notation is often used for a character literal (i.e. '\0'), although that is often equivalent to the numeric literal for zero (0).[8] In many languages (such as C, which introduced this notation), this is not a separate escape sequence, but an octal escape sequence with a single octal digit 0; as a consequence, \0 must not be followed by any of the digits 0 through 7; otherwise it is interpreted as the start of a longer octal escape sequence.[9] Other escape sequences that are found in use in various languages are \000, \x00, \z, or \u0000.
A null character can be placed in a URL with the percent code %00.
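The ability to represent a null character does not always mean the resulting string will be correctly interpreted, as many programs will consider the null to be the end of the string. Thus, the ability to type it (in case of unchecked user input) creates a vulnerability known as null byte injection, which can lead to security exploits.[10]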
In software documentation, the null character is often represented with the text NUL (or NULL, although that may mean the null pointer). In Unicode, there is a character for this: U+2400 ␀ SYMBOL FOR NULL.
In caret notation, the null character is ^@. On some keyboards, one can enter a null character by holding down Ctrl and pressing @ (on US layouts, just Ctrl+2 will often work, there being no need for ⇧ Shift to get the @ sign).
References
- "NUL (Null): The all-zeros character which may serve to accomplish time fill and media fill."
- "The set of control characters of the ISO 646" (PDF). Secretariat ISO/TC 97/SC 2. 1975-12-01. p. 4.4. Archived from the original (PDF) on 2014-05-12. "Position: 0/0, Name: Null, Abbreviation: Nul"
- "Unicode Character 'NULL' (U+0000)". Retrieved 2018-10-20.
- "C0 Controls and Basic Latin" (PDF). Unicode Consortium. 2018. Retrieved 2018-10-20.
- ANSI/ISO 9899:1990 (the ANSI C standard), section 5.2.1: "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string literal."
- ANSI/ISO 9899:1990 (the ANSI C standard), section 7.1.1: "A string is a contiguous sequence of characters terminated by and including the first null character"
- Working Draft, Standard for Programming Language C++ (PDF) (ISO 14882 standard working draft), ISO/IEC, 28 February 2011, p. 427, N3242=11-0012, retrieved 27 February 2013. "A null-terminated byte string, or NTBS, is a character sequence whose highest-addressed element with defined content has the value zero (the terminating null character); no other element in the sequence has the value zero."
- Kernighan and Ritchie, C, p. 38: "The character constant '\0' represents the character with value zero, the null character. '\0' is often written instead of 0 to emphasize the character nature of some expression, but the numeric value is just 0."
- In YAML this combination is a separate escape sequence.
- Null Byte Injection. WASC Threat Classification, Null Byte Attack section.
External links
- Null Byte Injection WASC Threat Classification Null Byte Attack section
- Poison Null Byte Introduction Introduction to Null Byte Attack
- Apple null byte injection QR code vulnerability