ISO 639 macrolanguage

Source: Wikipedia, the free encyclopedia.

A macrolanguage is a book-keeping mechanism for the ISO 639 international standard of language codes. Macrolanguages are established to assist mapping between different sets of ISO language codes. Specifically, there may be a many-to-one correspondence between ISO 639-3, intended to identify all the thousands of languages of the world, and either of two other sets, ISO 639-1, established to identify languages in computer systems, and ISO 639-2, which encodes a few hundred languages for library cataloguing and bibliographic purposes. When such many-to-one ISO 639-2 codes are included in an ISO 639-3 context, they are called "macrolanguages" to distinguish them from the corresponding individual languages of ISO 639-3.[1] According to the ISO,

Some existing code elements in ISO 639-2, and the corresponding code elements in ISO 639-1, are designated in those parts of ISO 639 as individual language code elements, yet are in a one-to-many relationship with individual language code elements in [ISO 639-3]. For purposes of [ISO 639-3], they are considered to be macrolanguage code elements.

— ISO 639-3: Relationship between ISO 639-3 and the other parts of ISO 639[2]
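The many-to-one correspondence described above can be pictured as a lookup table from individual ISO 639-3 codes to the macrolanguage code that covers them. The following is a minimal sketch, not an official data set: it uses only a handful of codes that appear later in this article, and the names INDIVIDUAL_TO_MACRO, macrolanguage_of, and members_of are hypothetical.

```python
# Illustrative subset only: individual ISO 639-3 codes mapped to the
# macrolanguage code that covers them. The codes are taken from this
# article; the dictionary and function names are hypothetical.
INDIVIDUAL_TO_MACRO = {
    "esg": "gon",  # Aheri Gondi
    "wsg": "gon",  # Adilabad Gondi
    "gso": "gba",  # Southwest Gbaya
    "gmm": "gba",  # Gbaya-Mbodomo
    "swh": "swa",  # Swahili (individual language)
}


def macrolanguage_of(code):
    """Return the macrolanguage code for an individual code, or None."""
    return INDIVIDUAL_TO_MACRO.get(code)


def members_of(macro):
    """Return the sorted individual codes mapped to a macrolanguage code."""
    return sorted(c for c, m in INDIVIDUAL_TO_MACRO.items() if m == macro)
```

A macrolanguage code such as gon is not itself a key in this table, so macrolanguage_of("gon") returns None, while members_of("gon") lists the individual codes grouped under it.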

ISO 639-3 is curated by SIL International; ISO 639-2 is curated by the Library of Congress (USA).

The mapping often covers borderline cases where two language varieties may be considered either strongly divergent dialects of the same language or very closely related languages (dialect continua). It may also encompass varieties that are considered to belong to the same language for ethnic, cultural, and political reasons rather than linguistic ones. However, this is not its primary function, and the classification is not evenly applied.

For example,

separate languages. ISO 639-2 and ISO 639-3 use different criteria for dividing language varieties into languages: 639-2 relies more on shared writing systems and literature, whereas 639-3 focuses on mutual intelligibility and shared lexicon. The macrolanguages exist within the ISO 639-3 code set to make mapping between the two sets easier.

The use of macrolanguages was applied in

san, adopted on 15 December 2023, though it had already existed as an individual language for several years.[5]

Some of the macrolanguages had no individual language (as defined by 639-3) in ISO 639-2, e.g. "ara" (

Standard Arabic) that were considered by ISO 639-2 to be dialects of one language ("ara") are now in ISO 639-3 in certain contexts considered to be individual languages themselves. This is an attempt to deal with varieties that may be linguistically distinct from each other, but are treated by their speakers as forms of the same language, e.g. in cases of diglossia. For example,

ISO 639-2 also includes codes for collections of languages; these are not the same as macrolanguages. These collections of languages are excluded from ISO 639-3, because they never refer to individual languages. Most such codes are included in ISO 639-5.

Types of macrolanguages

  • elements that have no ISO 639-2 code: 4 (bnc, hbs, kln, luy)
  • elements that have no ISO 639-1 code: 29
  • elements that do have ISO 639-1 codes: 34
  • elements whose individual languages have ISO 639-1 codes: 4
    • aka – tw
    • hbs – bs, hr, sr
    • msa – id
    • nor – nb, nn
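The last group above pairs ISO 639-1 codes of individual languages with the macrolanguage that contains them. As a small sketch (the variable and function names are hypothetical, and the data is limited to the four cases listed), the pairs can be stored per two-letter code and regrouped by macrolanguage:

```python
# Pairs listed above: ISO 639-1 code of an individual language mapped to
# the ISO 639-3 macrolanguage that contains it. Names are hypothetical.
PART1_INDIVIDUAL_TO_MACRO = {
    "tw": "aka",
    "bs": "hbs",
    "hr": "hbs",
    "sr": "hbs",
    "id": "msa",
    "nb": "nor",
    "nn": "nor",
}


def group_by_macro(mapping):
    """Invert the mapping: macrolanguage code -> sorted two-letter codes."""
    grouped = {}
    for part1, macro in mapping.items():
        grouped.setdefault(macro, []).append(part1)
    return {macro: sorted(codes) for macro, codes in grouped.items()}
```

Calling group_by_macro(PART1_INDIVIDUAL_TO_MACRO) yields four keys, matching the count of 4 given for this group.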

List of macrolanguages

This list only includes official data from https://iso639-3.sil.org/code_tables/macrolanguage_mappings/data.

ISO 639-1 | ISO 639-2 | ISO 639-3 | Number of individual languages | Name of macrolanguage
ak | aka | aka | 2 | Akan language
ar | ara | ara | 28 + retired 2 | Arabic language
ay | aym | aym | 2 | Aymara language
az | aze | aze | 2 | Azerbaijani language
(-) | bal | bal | 3 | Baluchi language
(-) | bik | bik | 8 + retired 1 | Bikol language
(-) | (-) | bnc | 5 | Bontok language
(-) | bua | bua | 3 | Buriat language
(-) | chm | chm | 2 | Mari language (Russia)
cr | cre | cre | 6 | Cree language
(-) | del | del | 2 | Delaware language
(-) | den | den | 2 | Slavey language (Athapascan)
(-) | din | din | 5 | Dinka language
(-) | doi | doi | 2 | Dogri language
et | est | est | 2 | Estonian language
fa | fas/per | fas | 2 | Persian language
ff | ful | ful | 9 | Fulah language
(-) | gba | gba | 6 + retired 1 | Gbaya language (Central African Republic)
(-) | gon | gon | 3 + retired 1 | Gondi language
(-) | grb | grb | 5 | Grebo language
gn | grn | grn | 5 | Guaraní language
(-) | hai | hai | 2 | Haida language
(-)[8] | (-) | hbs | 4 | Serbo-Croatian
(-) | hmn | hmn | 25 + retired 1 | Hmong language
iu | iku | iku | 2 | Inuktitut language
ik | ipk | ipk | 2 | Inupiaq language
(-) | jrb | jrb | 4 + retired 1 | Judeo-Arabic languages
kr | kau | kau | 3 | Kanuri language
(-) | (-) | kln | 9 | Kalenjin languages
(-) | kok | kok | 2 | Konkani language
kv | kom | kom | 2 | Komi language
kg | kon | kon | 3 | Kongo language
(-) | kpe | kpe | 2 | Kpelle language
ku | kur | kur | 3 | Kurdish language
(-) | lah | lah | 7 + retired 1 | Lahnda language
lv | lav | lav | 2 | Latvian language
(-) | (-) | luy | 14 | Luyia language
(-) | man | man | 6 + retired 1 | Manding languages
mg | mlg | mlg | 11 + retired 1 | Malagasy language
mn | mon | mon | 2 | Mongolian language
ms | msa/may | msa | 36 + retired 1 | Malay language
(-) | mwr | mwr | 6 | Marwari language
ne | nep | nep | 2 | Nepali language
no | nor | nor | 2 | Norwegian language
oj | oji | oji | 7 | Ojibwa language
or | ori | ori | 2 | Oriya language
om | orm | orm | 4 | Oromo language
ps | pus | pus | 3 | Pashto language
qu | que | que | 43 + retired 1 | Quechua language
(-) | raj | raj | 6 | Rajasthani language
(-) | rom | rom | 7 | Romany language
sa | san | san | 2 | Sanskrit language
sq | sqi/alb | sqi | 4 | Albanian language
sc | srd | srd | 4 | Sardinian language
sw | swa | swa | 2 | Swahili language
(-) | syr | syr | 2 | Syriac language
(-) | tmh | tmh | 4 | Tuareg languages
uz | uzb | uzb | 2 | Uzbek language
yi | yid | yid | 2 | Yiddish language
(-) | zap | zap | 58 + retired 1 | Zapotec language
za | zha | zha | 16 + retired 2 | Zhuang languages
zh | zho/chi | zho | 16 | Chinese language
(-) | zza | zza | 2 | Zaza language
34 | 59 | 63 | 441 + retired 15 | total codes
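The totals row of the table can be cross-checked by tallying the rows. The sketch below is only an illustration: the rows were transcribed by hand from the table above into a compact format chosen for this example ("-" for a missing code, bibliographic alternates such as fas/per reduced to the first code, and separate active and retired counts), and the names ROWS and tally are hypothetical.

```python
# One row per macrolanguage: ISO 639-1, ISO 639-2, ISO 639-3,
# active individual codes, retired individual codes.
# Hand-transcribed from the table above; "-" marks a missing code.
ROWS = """\
ak aka aka 2 0
ar ara ara 28 2
ay aym aym 2 0
az aze aze 2 0
- bal bal 3 0
- bik bik 8 1
- - bnc 5 0
- bua bua 3 0
- chm chm 2 0
cr cre cre 6 0
- del del 2 0
- den den 2 0
- din din 5 0
- doi doi 2 0
et est est 2 0
fa fas fas 2 0
ff ful ful 9 0
- gba gba 6 1
- gon gon 3 1
- grb grb 5 0
gn grn grn 5 0
- hai hai 2 0
- - hbs 4 0
- hmn hmn 25 1
iu iku iku 2 0
ik ipk ipk 2 0
- jrb jrb 4 1
kr kau kau 3 0
- - kln 9 0
- kok kok 2 0
kv kom kom 2 0
kg kon kon 3 0
- kpe kpe 2 0
ku kur kur 3 0
- lah lah 7 1
lv lav lav 2 0
- - luy 14 0
- man man 6 1
mg mlg mlg 11 1
mn mon mon 2 0
ms msa msa 36 1
- mwr mwr 6 0
ne nep nep 2 0
no nor nor 2 0
oj oji oji 7 0
or ori ori 2 0
om orm orm 4 0
ps pus pus 3 0
qu que que 43 1
- raj raj 6 0
- rom rom 7 0
sa san san 2 0
sq sqi sqi 4 0
sc srd srd 4 0
sw swa swa 2 0
- syr syr 2 0
- tmh tmh 4 0
uz uzb uzb 2 0
yi yid yid 2 0
- zap zap 58 1
za zha zha 16 2
zh zho zho 16 0
- zza zza 2 0
"""


def tally(rows_text):
    """Count present codes per column and sum active/retired members."""
    totals = {"part1": 0, "part2": 0, "part3": 0, "active": 0, "retired": 0}
    for line in rows_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the transcription
        part1, part2, part3, active, retired = line.split()
        totals["part1"] += part1 != "-"
        totals["part2"] += part2 != "-"
        totals["part3"] += part3 != "-"
        totals["active"] += int(active)
        totals["retired"] += int(retired)
    return totals
```

With the rows as transcribed here, tally(ROWS) should agree with the totals row of the table: 34 ISO 639-1 codes, 59 ISO 639-2 codes, 63 ISO 639-3 codes, and 441 individual languages plus 15 retired.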

List of macrolanguages and the individual languages

This is a complete list of the individual language codes that comprise the macrolanguages in the ISO 639-3 code tables as of 6 March 2023.[9]
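A list like this one can be rebuilt from a tabular mappings file. The sketch below assumes a tab-separated layout with three columns named M_Id (macrolanguage code), I_Id (individual code), and I_Status ("A" for active, "R" for retired); that layout is an assumption and should be checked against the published file before use. The sample rows are limited to codes that appear in this article, and load_mappings is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Small inline sample in the assumed three-column, tab-separated layout.
SAMPLE = (
    "M_Id\tI_Id\tI_Status\n"
    "gon\tesg\tA\n"
    "gon\twsg\tA\n"
    "gon\tggo\tR\n"
    "gba\tgso\tA\n"
    "gba\tgmm\tA\n"
    "gba\tmdo\tR\n"
)


def load_mappings(text):
    """Group individual codes by macrolanguage, split by status."""
    active = defaultdict(list)
    retired = defaultdict(list)
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        # Anything not marked "A" is treated as retired in this sketch.
        target = active if row["I_Status"] == "A" else retired
        target[row["M_Id"]].append(row["I_Id"])
    return dict(active), dict(retired)
```

Keeping active and retired members in separate dictionaries mirrors the way each entry below lists current codes first and former codes afterwards.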

aaa–ezz

aka

aka is the ISO 639-3 language code for Akan. Its ISO 639-1 code is ak. There are two individual language codes assigned:

ara

ara is the ISO 639-3 language code for Arabic. Its ISO 639-1 code is ar. There are twenty-eight individual language codes assigned:

The following codes were previously part of ara:

aym

aym is the ISO 639-3 language code for Aymara. Its ISO 639-1 code is ay. There are two individual language codes assigned:

  • Southern Aymara
  • Central Aymara

aze

aze is the ISO 639-3 language code for Azerbaijani. Its ISO 639-1 code is az. There are two individual language codes assigned:

  • South Azerbaijani
  • North Azerbaijani

bal

bal is the ISO 639-3 language code for Baluchi. There are three individual language codes assigned:

  • Southern Balochi
  • Western Balochi
  • Eastern Balochi

bik

bik is the ISO 639-3 language code for Bikol. There are eight individual language codes assigned:

The following code was previously part of bik:

  • Albay Bicolano (Split into Buhi'non Bikol [ubl], Libon Bikol [lbl], Miraya Bikol [rbl], and West Albay Bikol [fbl] on 18 January 2010)

bnc

bnc is the ISO 639-3 language code for Bontok. There are five individual language codes assigned:

  • Eastern Bontok
  • Central Bontok
  • Southern Bontok
  • Northern Bontok
  • Southwestern Bontok

bua

bua is the ISO 639-3 language code for Buriat. There are three individual language codes assigned:

  • Mongolia Buriat
  • Russia Buriat
  • China Buriat

chm

chm is the ISO 639-3 language code for Mari, a language located in Russia. There are two individual language codes assigned:

  • Eastern Mari
  • Western Mari

cre

cre is the ISO 639-3 language code for Cree. Its ISO 639-1 code is cr. There are six individual language codes assigned:

In addition, there are six closely associated individual codes:

In addition, there is one other language without an individual code that is closely associated with, but not part of, this macrolanguage code:

del

del is the ISO 639-3 language code for Delaware. There are two individual language codes assigned:

den

den is the ISO 639-3 language code for Slave. There are two individual language codes assigned:

din

din is the ISO 639-3 language code for Dinka. There are five individual language codes assigned:

  • South Central Dinka
  • Southwestern Dinka
  • Northeastern Dinka
  • Northwestern Dinka
  • Southeastern Dinka

doi

doi is the ISO 639-3 language code for Dogri. There are two individual language codes assigned:

est

est is the ISO 639-3 language code for Estonian. Its ISO 639-1 code is et. There are two individual language codes assigned:

faa–jzz

fas

fas is the ISO 639-3 language code for Persian. Its ISO 639-1 code is fa. There are two individual language codes assigned:

ful

ful is the ISO 639-3 language code for Fulah (also spelled Fula). Its ISO 639-1 code is ff. There are nine individual language codes assigned for varieties of Fulah:

gba

gba is the ISO 639-3 language code for Gbaya, a language located in the Central African Republic. There are six individual language codes assigned:

  • Bokoto
  • Gbaya-Bossangoa
  • Gbaya-Bozoum
  • Gbaya-Mbodomo
  • Southwest Gbaya
  • Northwest Gbaya

The following code was previously part of gba:

  • mdo – Southwest Gbaya (Split into Southwest Gbaya [gso] (new identifier) and Gbaya-Mbodomo [gmm] on 14 January 2008)

gon

gon is the ISO 639-3 language code for Gondi. There are three individual language codes assigned:

  • Aheri Gondi
  • Northern Gondi
  • Adilabad Gondi

The following code was previously part of gon:

  • ggo – Southern Gondi (Split into [esg] Aheri Gondi and [wsg] Adilabad Gondi on 15 January 2016)

grb

grb is the ISO 639-3 language code for Grebo. There are five individual language codes assigned:

  • Northern Grebo
  • Gboloo Grebo
  • Southern Grebo
  • Central Grebo
  • Barclayville Grebo

grn

grn is the ISO 639-3 language code for Guarani. Its ISO 639-1 code is gn. There are five individual language codes assigned:

  • Western Bolivian Guaraní
  • Paraguayan Guaraní
  • Eastern Bolivian Guaraní
  • gun – Mbyá Guaraní
  • Chiripá

hai

hai is the ISO 639-3 language code for Haida. There are two individual language codes assigned:

  • Southern Haida
  • Northern Haida

hbs

hbs is the ISO 639-3 language code for Serbo-Croatian. It formerly had the ISO 639-1 code sh, which was deprecated in 2000. There are four individual language codes assigned:

hmn

hmn is the ISO 639-3 language code for Hmong. There are twenty-five individual language codes assigned:

  • Chuanqiandian Cluster Miao
  • Northern Qiandong Miao
  • Southern Mashan Hmong
  • Central Huishui Hmong
  • Large Flowery Miao
  • Eastern Huishui Hmong
  • Southwestern Guiyang Hmong
  • Southwestern Huishui Hmong
  • Northern Huishui Hmong
  • hmj – Ge
  • Luopohe Hmong
  • Central Mashan Hmong
  • Northern Mashan Hmong
  • Eastern Qiandong Miao
  • Southern Qiandong Miao
  • Western Mashan Hmong
  • Southern Guiyang Hmong
  • Hmong Shua
  • Hmong Njua
  • Horned Miao
  • Northern Guiyang Hmong
  • Western Xiangxi Miao
  • Eastern Xiangxi Miao
  • Hmong Daw
  • Small Flowery Miao

The following code was previously part of hmn:

  • blu – Hmong Njua (Split into Hmong Njua [hnj] (new identifier), Chuanqiandian Cluster Miao [cqd], Horned Miao [hrm], and Small Flowery Miao [sfm] on 14 January 2008)

iku

iku is the ISO 639-3 language code for Inuktitut. Its ISO 639-1 code is iu. There are two individual language codes assigned:

ipk

ipk is the ISO 639-3 language code for Inupiaq. Its ISO 639-1 code is ik. There are two individual language codes assigned:

  • North Alaskan Inupiatun
  • Northwest Alaska Inupiatun

jrb

jrb is the ISO 639-3 language code for Judeo-Arabic. There are four individual language codes assigned:

The following code was previously part of jrb:

kaa–ozz

kau

kau is the ISO 639-2 and ISO 639-3 language code for Kanuri. Its ISO 639-1 code is kr. There are three individual language codes assigned in ISO 639-3 for varieties of Kanuri:

There are two other related languages that are not considered part of the macrolanguage under ISO 639:

kln

kln is the ISO 639-3 language code for Kalenjin. There are nine individual language codes assigned:

kok

kok is the ISO 639-3 language code for Konkani (macrolanguage). There are two individual language codes assigned:

  • Goan Konkani
  • Konkani (individual language)

Both languages are referred to as Konkani by their respective speakers.

kom

kom is the ISO 639-3 language code for Komi. Its ISO 639-1 code is kv. There are two individual language codes assigned:

kon

kon is the ISO 639-3 language code for Kongo. Its ISO 639-1 code is kg. There are three individual language codes assigned:

  • Koongo
  • San Salvador Kongo
  • Laari

kpe

kpe is the ISO 639-3 language code for Kpelle. There are two individual language codes assigned:

  • Guinea Kpelle
  • Liberia Kpelle

kur

kur is the ISO 639-3 language code for Kurdish. Its ISO 639-1 code is ku. There are three individual language codes assigned:

lah

lah is the ISO 639-3 language code for Lahnda. There are seven individual language codes assigned.

lah does not include Panjabi/Punjabi (pan).

The following code was previously part of lah:

  • Mirpur Panjabi (Moved to code "phr" on 12 January 2015)

lav

lav is the ISO 639-3 language code for Latvian. Its ISO 639-1 code is lv. There are two individual language codes assigned:

luy

luy is the ISO 639-3 language code for Luyia. There are fourteen individual language codes assigned:

man

man is the ISO 639-3 language code for Mandingo. There are six individual language codes assigned:

  • Eastern Maninkakan
  • Konyanka Maninka
  • Western Maninkakan
  • mnk – Mandinka
  • Sankaran Maninka
  • Kita Maninkakan

The following code was previously part of man:

mlg

mlg is the ISO 639-3 language code for Malagasy. Its ISO 639-1 code is mg. There are eleven individual language codes assigned:

  • Bara Malagasy
  • Northern Betsimisaraka Malagasy
  • Southern Betsimisaraka Malagasy
  • Masikoro Malagasy
  • Plateau Malagasy
  • Sakalava Malagasy
  • Tandroy-Mahafaly Malagasy
  • Tesaka Malagasy
  • Tanosy Malagasy
  • Antankarana Malagasy
  • Tsimihety Malagasy

The following code was previously part of mlg:

  • Southern Betsimisaraka Malagasy (Split into Southern Betsimisaraka [bzc] and Tesaka Malagasy [tkg] on 18 May 2011)

mon

mon is the ISO 639-3 language code for Mongolian. Its ISO 639-1 code is mn. There are two individual language codes assigned:

msa

msa is the ISO 639-3 language code for Malay (macrolanguage). Its ISO 639-1 code is ms. There are thirty-six individual language codes assigned:

The following code was previously part of msa:

  • mly – Malay (individual language) (Split into Standard Malay [zsm], Haji [hji], Papuan Malay [pmy], and Malay [zlm] on 18 February 2008)

In addition, there is an individual code not part of this macrolanguage because it is categorized as a historical language:

  • Old Malay

mwr

mwr is the ISO 639-3 language code for Marwari. There are six individual language codes assigned:

  • Dhundari
  • Mewari
  • Marwari (Pakistan)
  • Marwari (India)
  • swv – Shekhawati
  • Merwari

nep

nep is the ISO 639-3 language code for Nepali (macrolanguage). Its ISO 639-1 code is ne. There are two individual language codes assigned:

nor

nor is the ISO 639-3 language code for Norwegian. Its ISO 639-1 code is no. There are two individual language codes assigned:

oji

oji is the ISO 639-3 language code for Ojibwa. Its ISO 639-1 code is oj. There are seven individual language codes assigned:

In addition, there are three closely associated individual codes:

In addition, there are two other languages without individual codes that are closely associated with, but not part of, this macrolanguage code:

ori

ori is the ISO 639-3 language code for Oriya (macrolanguage). Its ISO 639-1 code is or. There are two individual language codes assigned:

orm

orm is the ISO 639-3 language code for Oromo. Its ISO 639-1 code is om. There are four individual language codes assigned:

paa–zzz

pus

pus is the ISO 639-3 language code for Pashto. Its ISO 639-1 code is ps. There are three individual language codes assigned:

que

que is the ISO 639-3 language code for Quechua. Its ISO 639-1 code is qu. There are forty-three individual language codes assigned:

The following code was previously part of que:

  • Chilean Quechua (Moved to code "quh" on 15 January 2016)

raj

raj is the ISO 639-3 language code for Rajasthani. There are six individual language codes assigned:

rom

rom is the ISO 639-3 language code for Romany. There are seven individual language codes assigned:

In addition, there are nine individual codes that are not part of this macrolanguage; they are categorized as mixed languages:

  • Erromintxela
  • Romano-Greek
  • Traveller Danish
  • Angloromani
  • Traveller Norwegian
  • Lomavren
  • rmr – Caló
  • Tavringer Romani
  • Romano-Serbian

san

san is the ISO 639-3 language code for Sanskrit. Its ISO 639-1 code is sa. There are two individual language codes assigned:

sqi

sqi is the ISO 639-3 language code for Albanian. Its ISO 639-1 code is sq. There are four individual language codes assigned:

srd

srd is the ISO 639-3 language code for Sardinian. Its ISO 639-1 code is sc. There are four individual language codes assigned:

swa

swa is the ISO 639-3 language code for Swahili. Its ISO 639-1 code is sw. There are two individual language codes assigned:

  • Congo Swahili
  • swh – Swahili (individual language)

syr

syr is the ISO 639-3 language code for Syriac. There are two individual language codes assigned:

  • Assyrian Neo-Aramaic
  • Chaldean Neo-Aramaic

tmh

tmh is the ISO 639-3 language code for Tamashek. There are four individual language codes assigned:

  • Tamasheq
  • Tahaggart Tamahaq
  • Tayart Tamajeq
  • Tawallammat Tamajaq

uzb

uzb is the ISO 639-3 language code for Uzbek. Its ISO 639-1 code is uz. There are two individual language codes assigned:

yid

yid is the ISO 639-3 language code for Yiddish. Its ISO 639-1 code is yi. There are two individual language codes assigned:

  • Eastern Yiddish
  • Western Yiddish

zap

zap is the ISO 639-3 language code for Zapotec. There are fifty-eight individual language codes assigned.

The following code was previously part of zap:

  • ztc – Lachirioag Zapotec (Moved to Yatee Zapotec [zty] on 18 July 2007)

In addition, there is an individual code not part of this macrolanguage because it is categorized as a historical language:

  • Ancient Zapotec

zha

zha is the ISO 639-3 language code for Zhuang. Its ISO 639-1 code is za. There are sixteen individual language codes assigned:

The following codes were previously part of zha:

  • ccx – Northern Zhuang (Split into Guibian Zhuang [zgn], Liujiang Zhuang [zlj], Qiubei Zhuang [zqe], Guibei Zhuang [zgb], Youjiang Zhuang [zyj], Central Hongshuihe Zhuang [zch], Eastern Hongshuihe Zhuang [zeh], Liuqian Zhuang [zlq], Yongbei Zhuang [zyb], and Lianshan Zhuang [zln] on 14 January 2008)
  • ccy – Southern Zhuang (Split into Nong Zhuang [zhn], Yang Zhuang [zyg], Yongnan Zhuang [zyn], Zuojiang Zhuang [zzj], and Dai Zhuang [zhd] on 18 July 2007)
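The retirement notes in this list (splits and moves with an effective date) lend themselves to a small lookup that resolves a retired code to its successors. The following is a sketch only: the entries are transcribed from the notes in this article, while the RETIRED table and the successors function are hypothetical names introduced for illustration.

```python
from datetime import date

# Retirement notes transcribed from the entries in this article:
# retired code -> (kind of change, successor codes, effective date).
RETIRED = {
    "mdo": ("split", ["gso", "gmm"], date(2008, 1, 14)),
    "ggo": ("split", ["esg", "wsg"], date(2016, 1, 15)),
    "blu": ("split", ["hnj", "cqd", "hrm", "sfm"], date(2008, 1, 14)),
    "mly": ("split", ["zsm", "hji", "pmy", "zlm"], date(2008, 2, 18)),
    "ztc": ("moved", ["zty"], date(2007, 7, 18)),
    "ccy": ("split", ["zhn", "zyg", "zyn", "zzj", "zhd"], date(2007, 7, 18)),
}


def successors(code, on=None):
    """Return the codes that replace `code`.

    If the code is not in RETIRED, or `on` is given and falls before the
    effective date of the change, the code itself is returned unchanged.
    """
    entry = RETIRED.get(code)
    if entry is None:
        return [code]
    _kind, replacements, effective = entry
    if on is not None and on < effective:
        return [code]
    return list(replacements)
```

For a split, the result contains several candidates, so data tagged with the retired code cannot be reassigned automatically; a move such as ztc yields a single successor.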

zho

zho is the ISO 639-3 language code for Chinese. Its ISO 639-1 code is zh. There are sixteen individual language codes assigned, most of which are not actually languages but rather groups of Sinitic languages distinguished by isoglosses:

Although the Dungan language (dng) is a dialect of Mandarin, it is not listed under Chinese in ISO 639-3 due to separate historical and cultural development.[11]

ISO 639 also lists codes for Old Chinese (och) and Late Middle Chinese (ltc). They are not listed under Chinese in ISO 639-3 because they are categorized as ancient and historical languages, respectively.

zza

zza is the ISO 639-3 language code for Zaza. There are two individual language codes assigned:

  • Dimli (individual language)
  • Kirmanjki (individual language)

See also

  • Microlanguage

References

  1. ^ ISO 639-3: Scope of denotation for language identifiers: Macrolanguages
  2. ^ "Relationships to other parts of ISO 639 | ISO 639-3".
  3. ^ Lewis, M. Paul, ed. (2009). Ethnologue. Dallas: SIL International.
  4. ^ "Scope of denotation for language identifiers". SIL International.
  5. ^ "Comments received for ISO 639-3 Change Request 2011-041" (PDF). SIL International. October 31, 2023. Retrieved 21 December 2023.
  6. ^ "Documentation for ISO 639 identifier: ara". SIL International.
  7. ^ "Documentation for ISO 639 identifier: arb". SIL International.
  8. ^ ISO 639-2/RA Change Notice
    ISO 639-1 Code | ISO 639-2 Code | English name of Language | French name of Language | Date Added or Changed | Category of Change | Notes
    [-sh] | (none) | Serbo-Croatian | serbo-croate | 2000-02-18 | Dep | This code was deprecated in 2000 because there were separate language codes for each individual language represented (Serbian, Croatian, and then Bosnian was added). It was published in a revision of ISO 639-1, but was never included in ISO 639-2. It is considered a macrolanguage (general name for a cluster of closely related individual languages) in ISO 639-3. Its deprecated status was reaffirmed by the ISO 639 JAC in 2005.
    sr | srp [scc] | Serbian | serbe | 2008-06-28 | CC | ISO 639-2/B code deprecated in favor of ISO 639-2/T code
    hr | hrv [scr] | Croatian | croate | 2008-06-28 | CC | ISO 639-2/B code deprecated in favor of ISO 639-2/T code
  9. ^ "ISO 639-3 Macrolanguage Mappings". SIL International. 2023-03-06.
  10. ^ "Change Request Documentation: 2022-006". ISO 639-3. SIL International. Retrieved 27 January 2023.
  11. .
