ISO basic Latin alphabet

The ISO basic Latin alphabet is an international standard (beginning with ISO/IEC 646) for a Latin-script alphabet of two sets (uppercase and lowercase) of 26 letters each, codified in international standards and used widely in international communication. They are the same letters that make up the current English alphabet. Since medieval times, they have also been the letters of the modern Latin alphabet. The order of the letters also matters, because it is used for sorting words into alphabetical order.

The two sets contain the following 26 letters each:[1]

ISO basic Latin alphabet
Uppercase letter set	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T	U	V	W	X	Y	Z
Lowercase letter set	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o	p	q	r	s	t	u	v	w	x	y	z
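
As a small illustration (not taken from any of the standards), Python's standard `string` module exposes the two sets as constants, and a case-insensitive sort key shows how the letter order drives alphabetical sorting. The helper name `alphabetical` and the sample word list are made up for this sketch.

```python
import string

UPPER = string.ascii_uppercase  # the 26 uppercase letters, A to Z
LOWER = string.ascii_lowercase  # the 26 lowercase letters, a to z

def alphabetical(words):
    """Sort words by letter order, ignoring case (basic Latin letters only)."""
    return sorted(words, key=str.lower)

words = ["delta", "Alfa", "charlie", "Bravo"]  # hypothetical sample data
print(len(UPPER), len(LOWER))
print(alphabetical(words))
```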

History

By the 1960s it became apparent to the

Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin script with extensions to handle other letters in other languages.^[1]

Terminology

The Unicode block that contains the alphabet is called "C0 Controls and Basic Latin". Two subheadings exist:^[2]

"Uppercase Latin alphabet": the letters start at U+0041 and contain the string LATIN CAPITAL LETTER in their descriptions
"Lowercase Latin alphabet": the letters start at U+0061 and contain the string LATIN SMALL LETTER in their descriptions
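
The naming pattern described above can be checked with Python's standard `unicodedata` module. This is only a sketch; the helper `letter_names` is a hypothetical name introduced for the example.

```python
import unicodedata

def letter_names(first_code_point, count=26):
    """Return the Unicode character names for a run of consecutive code points."""
    return [unicodedata.name(chr(first_code_point + i)) for i in range(count)]

upper_names = letter_names(0x0041)  # U+0041 onwards
lower_names = letter_names(0x0061)  # U+0061 onwards
print(upper_names[0], "...", upper_names[-1])
print(lower_names[0], "...", lower_names[-1])
```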

There are also another two sets in the Halfwidth and Fullwidth Forms block:^[3]

Uppercase: the letters start at U+FF21 and contain the string FULLWIDTH LATIN CAPITAL LETTER in their descriptions
Lowercase: the letters start at U+FF41 and contain the string FULLWIDTH LATIN SMALL LETTER in their descriptions
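
One practical consequence, sketched here under the assumption that Unicode compatibility normalization (NFKC) is acceptable for the text at hand, is that the fullwidth letters can be folded back to the basic letters. The function name `to_basic_latin` is hypothetical.

```python
import unicodedata

FULLWIDTH_UPPER_START = 0xFF21
FULLWIDTH_LOWER_START = 0xFF41

def to_basic_latin(text):
    """Fold compatibility characters (including fullwidth letters) via NFKC."""
    return unicodedata.normalize("NFKC", text)

fullwidth_upper = "".join(chr(FULLWIDTH_UPPER_START + i) for i in range(26))
fullwidth_lower = "".join(chr(FULLWIDTH_LOWER_START + i) for i in range(26))
print(unicodedata.name(chr(FULLWIDTH_UPPER_START)))
print(to_basic_latin(fullwidth_upper))
print(to_basic_latin(fullwidth_lower))
```

Note that NFKC also changes other compatibility characters, so it is not a letters-only transformation.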

Timeline for encoding standards

1865: International Morse Code was standardized at the International Telegraphy Congress in Paris, and was later made the standard by the International Telecommunication Union (ITU)
1950s: Radiotelephony Spelling Alphabet by ICAO^[4]

Timeline for widely used computer codes supporting the alphabet

1963: ASCII (from the American Standards Association, which became the American National Standards Institute in 1969)

1963/1964: EBCDIC (developed by IBM and supporting the same alphabetic characters as ASCII, but with different code values)
1965-04-30: Ratified by ECMA as ECMA-6^[5] based on work the ECMA's Technical Committee TC1 had carried out since December 1960.^[5]
1972: ISO/IEC 646 (ISO/IEC standard)
1983: ITU-T Rec. T.51 | ISO/IEC 6937 (a multi-byte extension of ASCII)
1987: ISO/IEC 8859-1:1987 (8-bit character encoding)
- Subsequently, other versions and parts of ISO/IEC 8859 have been published.
Mid-to-late 1980s: Windows-1250, Windows-1252, and other encodings used in Microsoft Windows (some roughly similar to ISO/IEC 8859-1)
1990: Unicode 1.0, contained in the block "C0 Controls and Basic Latin" using the same alphabetic code values as ASCII and ISO/IEC 646
- Subsequently, other versions of Unicode have been published and it later became a joint ISO/IEC standard as well, as identified below.
1993: ISO/IEC 10646-1:1993, ISO/IEC standard for characters in Unicode 1.1
- Subsequently, other versions of ISO/IEC 10646-1 and one of ISO/IEC 10646-2 have been published. Since 2003, the standards have been published under the name "ISO/IEC 10646" without the separation into two parts.
1997: Windows Glyph List 4
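
The difference between the ASCII-compatible codes and EBCDIC in this timeline can be seen with Python's built-in codecs. This is a sketch only: `cp037` is just one EBCDIC code page, chosen here as an assumption, and UTF-8 stands in for Unicode as one of its encoding forms.

```python
LETTERS = "AZaz"

# ASCII-compatible encodings keep the same byte values for the letters.
ascii_compatible = {
    codec: LETTERS.encode(codec).hex()
    for codec in ("ascii", "latin-1", "cp1252", "utf-8")
}

# EBCDIC (code page 037 here) supports the same letters with other values,
# and its letters are not contiguous: there is a gap between I and J.
ebcdic = LETTERS.encode("cp037").hex()
ebcdic_i_j = "IJ".encode("cp037").hex()

print(ascii_compatible)
print(ebcdic, ebcdic_i_j)
```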

Representation

Figure: the letters shown on a 16-segment display (plus the Arabic numerals).

In ASCII the letters belong to the printable characters. In ASCII and in ISO/IEC 10646 alike, they occupy the hexadecimal positions 41 to 5A for uppercase and 61 to 7A for lowercase.
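
Because the uppercase range starts at hexadecimal 41 and the lowercase range at 61, the two cases of a letter differ by exactly 0x20. The sketch below uses that arithmetic; the function names are hypothetical and the bit flip is only meaningful for the 52 basic letters.

```python
def is_basic_latin_letter(ch):
    """True for code points 0x41-0x5A (uppercase) and 0x61-0x7A (lowercase)."""
    cp = ord(ch)
    return 0x41 <= cp <= 0x5A or 0x61 <= cp <= 0x7A

def swap_case(ch):
    """Toggle case by flipping bit 0x20; other characters are returned unchanged."""
    return chr(ord(ch) ^ 0x20) if is_basic_latin_letter(ch) else ch

print(hex(ord("A")), hex(ord("Z")), hex(ord("a")), hex(ord("z")))
print("".join(swap_case(c) for c in "Basic Latin 26"))
```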

Without regard to case, all letters have code words in the ICAO spelling alphabet and can be represented in Morse code.
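
A minimal lookup sketch for both representations follows. The code words and Morse patterns are reproduced from the ICAO spelling alphabet and International Morse Code as commonly published, not from this article, and the function names are hypothetical. Case is ignored, as described above.

```python
ICAO = (
    "Alfa Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliett Kilo Lima Mike "
    "November Oscar Papa Quebec Romeo Sierra Tango Uniform Victor Whiskey X-ray "
    "Yankee Zulu"
).split()

MORSE = (
    ".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
    "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --.."
).split()

def _index(letter):
    """Map a basic Latin letter of either case to 0..25."""
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError(f"not a basic Latin letter: {letter!r}")
    return ord(letter.upper()) - ord("A")

def spell(word):
    """Return the ICAO code word for each letter of the word."""
    return [ICAO[_index(ch)] for ch in word]

def to_morse(word):
    """Return the Morse patterns for the word, separated by spaces."""
    return " ".join(MORSE[_index(ch)] for ch in word)

print(spell("Iso"))
print(to_morse("sos"))
```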

Usage

All of the lowercase letters are used in the International Phonetic Alphabet (IPA). In X-SAMPA and SAMPA these letters have the same sound value as in IPA.

Alphabets containing the same set of letters

The list below only includes alphabets that include all the 26 letters but exclude:

letters whose diacritical marks make them distinct letters.
multigraphs that constitute distinct letters.
ligatures that are distinct letters.

Notable omissions due to these rules include Esperanto, Filipino and German. The German alphabet is sometimes considered by tradition to contain only 26 letters (with ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ considered variants and ⟨ß⟩ considered a ligature of ⟨ſ⟩ (long s) and ⟨s⟩), but the current German orthographic rules include ⟨ä⟩, ⟨ö⟩, ⟨ü⟩, ⟨ß⟩ in the alphabet placed after ⟨Z⟩. In Spanish orthography, the letters ⟨n⟩ and ⟨ñ⟩ are distinct; the tilde is not considered a diacritic in this case.

Alphabet	Diacritic	Multigraphs (not constituting distinct letters)	Ligatures
Afrikaans alphabet	ô, ö, ú, û, ü, ý	⟨uu⟩ Trigraphs: ⟨aai⟩, ⟨eeu⟩, ⟨oei⟩, ⟨ooi⟩	ŉ (N‑apostrophe)
Aragonese alphabet (Academia de l'Aragonés orthography)	á, é, í, ó, ú, ü, lꞏl	⟨tz⟩
Catalan alphabet	à, é, è, í, ï, ó, ò, ú, ü, ç, lꞏl	⟨ss⟩
Dutch alphabet	ä, é, è, ë, ï, ö, ü	⟨uu⟩
English alphabet	only in loanwords (see below)¹	⟨ng⟩	æ, œ (both archaic)
French alphabet	ù, û, ü, ÿ	⟨eî⟩	æ (rare), œ (mandatory)
Hmong Latin alphabet	none	⟨txh⟩, ⟨ndhl⟩
Italian alphabet (extended)^[a]	ù	⟨sci⟩
Ido alphabet *	none	⟨qu⟩, ⟨ch⟩, ⟨sh⟩
Indonesian alphabet	only in learning materials (see below)⁴	⟨ai⟩, ⟨au⟩, ⟨ei⟩, ⟨oi⟩
Interlingua alphabet *	only in unassimilated loanwords (see below)²	⟨rh⟩, ⟨sh⟩
Javanese Latin alphabet	é, è	⟨sy⟩, ⟨th⟩
Latino sine flexione alphabet *	only an optional accent for unusual stress (see below)³	⟨rh⟩, ⟨th⟩^[8]
Luxembourgish alphabet	ä, é, ë	⟨sch⟩
Malay alphabet	only in learning materials (see below)⁴	⟨sy⟩
Portuguese alphabet^[b]	ô, à, ç	⟨õe⟩
Sundanese Latin alphabet	é	⟨ng⟩, ⟨ny⟩
Xhosa alphabet	only in learning materials (see below)⁵	⟨zh⟩
Zulu alphabet	none	⟨xh⟩

* Constructed languages

¹ English is one of the few modern European languages requiring no diacritics for native words (although a diaeresis is occasionally used, as in "coöperation").^[c]^[9]
² Interlingua uses diacritics only in unassimilated loanwords, and even there they may be dropped (e.g. cafe, from French: café).^[10]
³ Latino sine flexione, a.k.a. "Peano's Interlingua", allows but does not require the placement of an accent for unusual stress. (It antedates the other "Interlingua" by roughly four decades.)
⁴ Malay and Indonesian (based on Malay) use the entire Latin alphabet and require no diacritics or ligatures. However, Malay and Indonesian learning materials may use ⟨é⟩ (E with acute) to clarify the pronunciation of the letter E; in that case, ⟨e⟩ is pronounced /ə/ while ⟨é⟩ is pronounced /e/ and ⟨è⟩ is pronounced /ɛ/. Many of the 700+ languages of Indonesia also use the Indonesian alphabet to write their languages, some (such as Javanese) adding the diacritics é and è, and some omitting q, x, and z.
⁵ Xhosa is usually written without diacritics, but may optionally use diacritics over ⟨a, e, i, o, u⟩ for tones: ⟨à, á, â, ä⟩.

Column numbering

The Roman (Latin) alphabet is commonly used for column numbering in a table or chart. This avoids confusion with row numbers using Arabic numerals. For example, a 3-by-3 table would contain columns A, B, and C, set against rows 1, 2, and 3. If more columns are needed beyond Z (normally the final letter of the alphabet), the column immediately after Z is AA, followed by AB, and so on^[11] (see bijective base-26 system). This can be seen by scrolling far to the right in a spreadsheet program such as Microsoft Excel or LibreOffice Calc.
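
The column scheme described above (A to Z, then AA, AB, and so on) can be sketched as a bijective base-26 conversion. The function names are hypothetical and column numbers are assumed to start at 1.

```python
def column_label(n):
    """Convert a 1-based column number to letters: 1 -> A, 26 -> Z, 27 -> AA."""
    if n < 1:
        raise ValueError("column numbers start at 1")
    label = ""
    while n > 0:
        # Subtracting 1 first is what makes the system bijective (no zero digit).
        n, remainder = divmod(n - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label

def column_number(label):
    """Inverse of column_label for labels made of the letters A to Z."""
    n = 0
    for ch in label.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

print([column_label(n) for n in (1, 2, 3, 26, 27, 28, 702, 703)])
print(column_number("AB"))
```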

The letters are often used for indexing nested bullet points. In this case, after the 26th item it is more common to use AA, BB, CC, ... instead of base-26 numbers.
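
For comparison, the repeated-letter scheme for bullet points can be sketched as follows. The function name is hypothetical, and continuing with AAA after ZZ is an assumption of this sketch rather than something stated here.

```python
def bullet_label(n):
    """Label the n-th bullet (1-based): A..Z, then AA, BB, ..., ZZ, then AAA, ..."""
    if n < 1:
        raise ValueError("bullet numbers start at 1")
    repeats, index = divmod(n - 1, 26)
    return chr(ord("A") + index) * (repeats + 1)

print([bullet_label(n) for n in (1, 26, 27, 28, 29, 52, 53)])
```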

Notes

[a] The Italian alphabet is traditionally considered to have only 21 letters, excluding j, k, w, x, y. However, in practice these letters occur in a number of loanwords. J also occurs in some native Italian proper names as a variant of writing semivocalic i.
[b] Note for Portuguese: k and y (but not w) were part of the alphabet until several spelling reforms during the 20th century, the aim of which was to change the etymological Portuguese spelling into an easier phonetic spelling. These letters were replaced by other letters having the same sound: thus psychologia became psicologia, kioske became quiosque, martyr became mártir, etc. Nowadays k, w, and y are only found in foreign words and their derived terms and in scientific abbreviations (e.g. km, byronismo). These letters are considered part of the alphabet again following the 1990 Portuguese Language Orthographic Agreement, which came into effect on January 1, 2009, in Brazil. See Reforms of Portuguese orthography.
[c] As an example of an article containing a diaeresis in "coöperate", as well as accents on loan words in English, such as a cedilla in "façades" and a circumflex in the word "crêpe", see Grafton, Anthony (October 23, 2006). "Books: The Nutty Professors, The history of academic charisma". The New Yorker.

References

[1] "Internationalisation standardization of 7-bit codes, ISO 646". Trans-European Research and Education Networking Association (TERENA). Retrieved October 3, 2010.
[2] "C0 Controls and Basic Latin" (PDF). Unicode.org. Retrieved August 8, 2016.
[3] "Halfwidth and Fullwidth Forms" (PDF). Unicode.org. Retrieved August 8, 2016.
[4] "The Postal History of ICAO". www.icao.int. Archived from the original on February 12, 2019. Retrieved February 17, 2019.
[5] European Computer Manufacturers Association (Ecma). March 1985. Archived from the original (PDF) on May 29, 2016. Retrieved May 29, 2016. "The Technical Committee TC1 of ECMA met for the first time in December 1960 to prepare standard codes for Input/Output purposes. On April 30, 1965, Standard ECMA-6 was adopted by the General Assembly of ECMA."
[6] "Unicode character database". The Unicode Standard. Retrieved March 22, 2013.
[7] ISBN 0-201-56788-1.
[8] Ager, Simon. "Latino sine Flexione". Omniglot. Latino sine Flexione alphabet. Retrieved April 14, 2023.
[9] "The New Yorker's odd mark — the diaeresis". December 16, 2010. Archived from the original on December 16, 2010.
[10] "Introduction al IED (in anglese)". www.interlingua.com. Retrieved September 21, 2020.
[11] "How To Switch From Letters to Numbers for Columns in Excel". Indeed. Retrieved November 21, 2024.

