Filename extension
A filename extension, file name extension or file extension is a suffix to the name of a computer file (for example, .txt, .mp3, .exe) that indicates a characteristic of the file contents or its intended use. A filename extension is typically delimited from the rest of the filename with a full stop (period), but in some systems[1] it is separated with spaces.
Some
Operating system and file system support
The Multics file system stores the file name as a single string, not split into base name and extension components, so the "." is just another character allowed in file names. It allows variable-length filenames, permitting more than one dot, and hence multiple suffixes, as well as no dot, and hence no suffix. Some components of Multics, and applications running on it, use suffixes to indicate file types, but not all files are required to have a suffix; for example, executables and ordinary text files usually have no suffixes in their names.
File systems for
files.tar.gz (the .tar indicates that the file is a tar archive of one or more files, and the .gz indicates that the tar archive file is compressed with gzip).
CTSS was an early operating system in which the filename and file type were separately stored. Continuing this practice, and also using a dot as a separator for display and input purposes (while not storing the dot), were various DEC operating systems (such as RT-11), followed by CP/M and subsequently DOS.
In DOS and 16-bit Windows, file names have a maximum of 8 characters, a period, and an extension of up to three letters. The FAT file system for DOS and Windows stores file names as an 8-character name and a three-character extension. The period character is not stored.
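The 8.3 layout described above can be illustrated with a minimal Python sketch. The function name `to_fat_83` is hypothetical, and the sketch assumes the usual convention that both fields are upper-cased and padded with spaces; it ignores the restrictions on permitted characters and the other fields of a real directory entry.

```python
def to_fat_83(filename: str) -> bytes:
    """Pack a DOS-style name into 11 bytes: 8 for the name, 3 for the extension."""
    base, dot, ext = filename.upper().rpartition(".")
    if not dot:
        # No period at all: the whole string is the base name.
        base, ext = ext, ""
    if not 1 <= len(base) <= 8 or len(ext) > 3:
        raise ValueError(f"{filename!r} does not fit the 8.3 format")
    # Both fields are space-padded; the period itself is not stored.
    return base.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")
```

Under these assumptions, `to_fat_83("readme.txt")` yields the 11 bytes `README  TXT`, while a name such as `index.html` is rejected because its extension has four characters.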
The High Performance File System (HPFS), used in Microsoft and IBM's OS/2, stores the file name as a single string, with the "." character as just another character in the file name. The convention of using suffixes continued, even though HPFS supports extended attributes for files, allowing a file's type to be stored in the file as an extended attribute.
Microsoft's
Windows 95, with VFAT, introduced support for long file names, and removed the 8.3 name/extension split in file names from non-NT Windows.
The
In Commodore systems, files can only have four extensions: PRG, SEQ, USR, REL. However, these are used to separate data types used by a program and are irrelevant for identifying their contents.
With the advent of graphical user interfaces, the issue of file management and interface behavior arose. Microsoft Windows allowed multiple applications to be associated with a given extension, and different actions were available for selecting the required application, such as a context menu offering a choice between viewing, editing or printing the file. The assumption was still that any extension represented a single file type; there was an unambiguous mapping between extension and icon.
When the Internet age first arrived, those using Windows systems that were still restricted to 8.3 filename formats had to create web pages with names ending in .HTM, while those using Macintosh or UNIX computers could use the recommended .html filename extension. This also became a problem for programmers experimenting with the Java programming language, since it requires the four-letter suffix .java for source code files and the five-letter suffix .class for Java compiler object code output files.[3]
Content type
Filename extensions may be considered a type of metadata.[4] They are commonly used to imply information about the way data might be stored in the file. The exact definition, giving the criteria for deciding what part of the file name is its extension, belongs to the rules of the specific file system used; usually the extension is the substring which follows the last occurrence, if any, of the dot character (example: txt is the extension of the filename readme.txt, and html the extension of index.html).
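The usual rule stated above (the substring after the last dot, if any) can be sketched in a few lines of Python. The helper name `extension_of` is hypothetical, and the sketch deliberately applies the rule literally; real libraries and file systems may treat edge cases, such as names that begin with a dot, differently.

```python
def extension_of(filename: str) -> str:
    """Return the substring after the last dot, or "" if the name has no dot."""
    _, dot, suffix = filename.rpartition(".")
    return suffix if dot else ""


for name in ("readme.txt", "index.html", "files.tar.gz", "README"):
    print(f"{name!r} -> {extension_of(name)!r}")
```

Note that under this rule only the final suffix counts, so `files.tar.gz` has the extension `gz`.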
On file systems of some mainframe systems such as
COM or BAT indicate that a file is a program executable. In OS/360 and successors, the part of the dataset name following the last period, called the low level qualifier, is treated as an extension by some software, e.g., TSO
The filename extension was originally used to determine the file's generic type.
Compared to MIME type
In many
There is no standard mapping between filename extensions and media types, resulting in possible mismatches in interpretation between authors, web servers, and client software when transferring files over the Internet. For instance, a content author may specify the extension svgz for a compressed
Executable programs
The use of a filename extension in a command name appears occasionally, usually as a side effect of the command having been implemented as a script, e.g., for the
On association-based systems, the filename extension is generally mapped to a single, system-wide selection of interpreter for that extension (such as ".py" meaning to use Python), and the command itself is runnable from the command line even if the extension is omitted (assuming appropriate setup is done). If the implementation language is changed, the command name extension is changed as well, and the OS provides a consistent API by allowing the same extensionless version of the command to be used in both cases. This method suffers somewhat from the essentially global nature of the association mapping, and from the fact that callers do not consistently omit extensions when invoking programs, which the command's developers cannot enforce. Windows is the only remaining widespread employer of this mechanism.
On systems with interpreter directives, including virtually all versions of Unix, command name extensions have no special significance, and are by standard practice not used, since the primary method to set interpreters for scripts is to start them with a single line specifying the interpreter to use. In these environments, including the extension in a command name unnecessarily exposes an implementation detail which puts all references to the commands from other programs at future risk if the implementation changes. For example, it would be perfectly normal for a shell script to be reimplemented in Python or Ruby, and later in C or C++, all of which would change the name of the command were extensions used. Without extensions, a program always has the same extension-less name, with only the interpreter directive or magic number changing, and references to the program from other programs remain valid.
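As a rough illustration of the interpreter directive described above, the following Python sketch reads the first line of a script and reports the interpreter it names. The function name `interpreter_of` is hypothetical, and the sketch only mimics the lookup for demonstration; on a real Unix system the kernel performs this step when the file is executed.

```python
from typing import Optional


def interpreter_of(path: str) -> Optional[str]:
    """Return the text after "#!" on the first line, or None if there is no directive."""
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        return None
    # Everything after "#!" names the interpreter and an optional argument.
    return first_line[2:].decode("utf-8", errors="replace").strip()
```

Because the interpreter is named inside the file, a command called `report` could be rewritten from a shell script to Python by changing only its first line, and callers would keep using the same name.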
Security issues
File extensions alone are not a reliable indicator of a file's type, as the extension can be modified without changing the file's contents, such as to disguise
file are meant to be used instead, and will read the file's header to determine its content.[citation needed]
Malware such as Trojan horses typically takes the form of an executable, but any file type that performs input/output operations may contain malicious code. A few data file types such as PDFs have been found to be vulnerable to exploits that cause buffer overflows.[11] There have been instances of malware crafted to exploit such vulnerabilities in some Windows applications when opening a file with an overly long, unhandled filename extension.
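The idea of reading a file's header rather than trusting its extension can be sketched as follows. This is a simplified illustration, not a description of how any particular utility works: the table `MAGIC_NUMBERS` and the function `sniff` are hypothetical names, and only a handful of well-known signatures are included.

```python
# A few well-known leading byte sequences ("magic numbers").
MAGIC_NUMBERS = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip-compressed data"),
    (b"\x7fELF", "ELF executable"),
    (b"MZ", "DOS/Windows executable"),
]


def sniff(header: bytes) -> str:
    """Guess a file's type from its first bytes, ignoring its name entirely."""
    for magic, description in MAGIC_NUMBERS:
        if header.startswith(magic):
            return description
    return "unknown"
```

With this approach, a file named `invoice.pdf` whose first bytes are `MZ` would be reported as an executable, regardless of the extension it carries.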
A virus may couple itself with an executable without actually modifying the executable. These viruses, known as companion viruses, attach themselves in such a way that they are executed when the original file is requested. One way such a virus does this involves giving the virus the same name as the target file, but with a different extension to which the operating system gives priority, and often assigning the former a "hidden" attribute to conceal the malware's existence. The efficacy of this approach depends on whether the user attempts to open the intended file by entering a command and whether the user includes the extension. Later versions of DOS and Windows check for and attempt to run .COM files first by default, followed by .EXE and finally .BAT files. In this case, the infected file is the one with the .COM extension, which the user unwittingly executes.[10][11]
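The lookup order described above, and the way a companion virus exploits it, can be modelled with a short Python sketch. The names `DEFAULT_ORDER` and `resolve` are hypothetical, and the directory is represented as a plain list of names; this is a model of the behaviour, not the actual command-interpreter logic.

```python
DEFAULT_ORDER = (".COM", ".EXE", ".BAT")


def resolve(command, directory_listing, order=DEFAULT_ORDER):
    """Pick the file that would run for a typed command, or None if nothing matches."""
    names = {name.upper() for name in directory_listing}
    command = command.upper()
    if "." in command:
        # The user typed an explicit extension, so no priority search happens.
        return command if command in names else None
    for ext in order:
        candidate = command + ext
        if candidate in names:
            return candidate
    return None
```

In this model, if a directory holds the legitimate `EDIT.EXE` and a planted `EDIT.COM`, typing `edit` resolves to `EDIT.COM`, whereas typing `edit.exe` still runs the intended program.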
Some viruses take advantage of the similarity between the ".com" top-level domain and the .COM filename extension by emailing malicious, executable command-file attachments under names superficially similar to URLs (e.g., "myparty.yahoo.com"), with the effect that unaware users click on email-embedded links that they think lead to websites but actually download and execute the malicious attachments.[citation needed]
See also
References
- "What Is a File?". z/VM 7.2 CMS Primer (PDF). IBM. 2021-12-05. p. 7. SC24-6265-01. "One thing you need to know about creating files with z/VM is that each file needs its own three-part identifier. The first part of the identifier is the file name. The second part is the file type. And the third part is the file mode. These three file identifiers are often abbreviated fn ft fm."
- "Mac Creator and File Type codes". livecode.byu.edu. Retrieved 2022-09-02.
- "javac – Java programming language compiler". Sun Microsystems, Inc. 2004. Retrieved 2009-05-31. "Source code file names must have .java suffixes, class file names must have .class suffixes, and both source and class files must have root names that identify the class."
- ISBN 9780782151282. Retrieved 2 October 2017.
- File Extension .RPM Details from filext.com
- File Extension .QIF Details from filext.com
- File Extension .GBA Details from filext.com
- Commandname Extensions Considered Harmful
- ISBN 978-1-59749-268-3. Retrieved 2025-02-25.
- ISBN 0-13-101405-6. Retrieved 2025-02-25.
- ISBN 1-56592-682-X. Retrieved 2025-02-25.
External links
Media related to Filename extensions at Wikimedia Commons
- Database of filename extensions at FileInfo.com