Wikipedia:Geographical names
This is an essay. It contains the advice or opinions of one or more Wikipedia contributors. This page is not an encyclopedia article, nor is it one of Wikipedia's policies or guidelines, as it has not been thoroughly vetted by the community. Some essays represent widespread norms; others only represent minority viewpoints.
Wikipedia has over 700,000 articles about geographical entities such as villages, districts, lakes, rivers, mountains and protected areas. Their infoboxes vary considerably in layout and the information they support. The article title holds the common English form but the article may also give the common names used in the local language(s), official names, former names, other names and nicknames. Non-Latin script may be followed by a romanized or phonetic form.
All non-English forms of a name should be marked up so they are rendered correctly by a screen reader. This essay proposes standard ways to gather, validate and format the different names in the article text and in infoboxes, and outlines a migration approach. The core proposal is to adapt all the geographical entity infoboxes to use a standard child template, {{infobox geonames}}, which will undertake validation and formatting of the names.
Current situation
There are several hundred geo-infoboxes used in over 700,000 articles about geographical entities. As of February 2022 {{Infobox settlement}} was used in over 543,000 articles, {{Infobox river}} in 28,870, {{Infobox mountain}} in 26,448, {{Infobox building}} in 24,502, and so on down to a long tail of infoboxes like {{Infobox Tibetan Buddhist monastery}} (286 articles) or {{Infobox dive site}} (18 articles). As shown in #Sample infobox templates (below) the infoboxes are very inconsistent in the name-related parameters they accept, and as shown in #Current usage examples (below) they are also very inconsistent in the format they render.
Non-English names are common even in countries where English is the national language. A place in California might have former names in Spanish and indigenous languages. A place in England may have former names in Common Brittonic or Old English. In France, there may be variants of local names in Breton, Occitan or Corsican. India has a wealth of languages and scripts. Due to lack of consistent support for non-English names, editors may struggle with the default formatting, as with
- |native_name = {{nobold|四国}}
- |native_name = {{lang|tr|Anadolu Selçuklu Devleti}} {{lang|fa|سلجوقیان روم}} Saljūqiyān-i Rūm
Introducing standard validation and formatting for names in all geo-infoboxes will give a more consistent reader experience, reduce accessibility problems with screen readers, and make life easier for editors.
Proposed guidelines
1. Articles about geographical entities may provide extensive information about names, including the different types of name, etymology, pronunciation, non-Latin script, romanization and so on. However, the information does not all have to be crammed into the infobox and the lead sentence. As illustrated in the article on the Nile, it may be relegated to a section on naming.
2. Any non-English name in Latin script should be rendered in italics with proper HTML mark-up for a screen reader, and the language should be rendered before the name.
3. If a non-English name in Latin script may be rendered in English pronunciation, and readers will not be particularly interested in the language, the language need not be identified.
4. Names in non-Latin script may be followed by an italicized romanized or phonetic form if relevant, and the language should be identified.
5. A list of names of the same type in an infobox should be formatted as a horizontal list if it will fit on one line. Otherwise it should be formatted as a simple vertical list. Thus: French: Bruxelles • Dutch: Brussel
Identifying languages
Non-English names are often formatted using {{lang}} or {{native name}}. However, both these templates require a 2- or 3-letter ISO code. Many editors do not know what these codes are, and many former place names are in languages that do not have an ISO code.
The solution is to enhance the {{lang}} and {{native name}} templates, or create a new {{lang2}} template to allow the full names of languages as an alternative to the ISO code. Thus {{lang2|German|München}} and {{lang2|de|München}} should both be accepted and render the same result. {{infobox geonames}} would implement the same logic.
- If a language is not found in the list of ISO codes that gives corresponding language names, check for it in a list of language names that gives corresponding ISO codes
- The second list may include languages such as Chirr, Phuthi or Erzgebirgisch with the ISO code "mis" (uncoded languages), meaning they have no ISO code of their own
- Both lists will also include the name of the Wikipedia article for the language, for use as a link
- If the language is not known, use the language code "und"
- Use the ISO code for HTML tagging and the corresponding language name for display purposes
- Flag articles with unrecognized languages for manual follow-up
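The lookup steps above can be sketched in Python as follows. This is only an illustration of the logic, not the template implementation: the function name `resolve_language`, the two small sample tables and the article titles in them are assumptions made for the example, and the choice to leave an empty language unflagged is also an assumption.

```python
from typing import NamedTuple, Optional, Tuple


class Language(NamedTuple):
    code: str     # used for the HTML lang attribute
    name: str     # shown to the reader
    article: str  # Wikipedia article to link to (illustrative values)


# First list: ISO code -> language (tiny hypothetical sample).
CODE_TO_LANG = {
    "de": Language("de", "German", "German language"),
    "ar": Language("ar", "Arabic", "Arabic"),
}

# Second list: language name -> language. It repeats the coded languages
# and adds languages that have no ISO code of their own, tagged "mis".
NAME_TO_LANG = {lang.name.lower(): lang for lang in CODE_TO_LANG.values()}
NAME_TO_LANG["phuthi"] = Language("mis", "Phuthi", "Phuthi language")

UNKNOWN = Language("und", "", "")


def resolve_language(value: Optional[str]) -> Tuple[Language, bool]:
    """Return (language, needs_follow_up) for a code or a full name."""
    key = (value or "").strip().lower()
    if not key:
        # Language not stated: tag as undetermined, nothing to follow up.
        return UNKNOWN, False
    lang = CODE_TO_LANG.get(key) or NAME_TO_LANG.get(key)
    if lang is None:
        # Unrecognized language: tag as undetermined and flag the article.
        return UNKNOWN, True
    return lang, False
```

With this sketch, `resolve_language("de")` and `resolve_language("German")` return the same entry, while an unrecognized value comes back with the "und" code and the follow-up flag set.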
The enhanced or new template should also accept and display a romanised or phonetic version of the name. E.g.
{{lang2|ar|بَغْدَاد|baɣˈdaːd}}
or {{lang2|Arabic|بَغْدَاد|baɣˈdaːd}}
would render
Arabic: بَغْدَاد [baɣˈdaːd]
with the non-Latin name tagged with the HTML attribute lang="ar".
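A minimal Python sketch of that rendering, assuming the language has already been resolved to a code and a display name. The function name `render_name` and the use of a `span` element are assumptions for illustration; the real template might emit different mark-up.

```python
from html import escape


def render_name(lang_code: str, lang_label: str, name: str, roman: str = "") -> str:
    """Render 'Label: <tagged name> [roman form]' as an HTML fragment."""
    # Only the name itself carries the lang attribute, so a screen reader
    # switches pronunciation for the name and not for the label.
    tagged = '<span lang="{}">{}</span>'.format(
        escape(lang_code, quote=True), escape(name)
    )
    prefix = escape(lang_label) + ": " if lang_label else ""
    suffix = " [" + escape(roman) + "]" if roman else ""
    return prefix + tagged + suffix
```

Calling `render_name("ar", "Arabic", name, roman)` yields the label, the tagged name and the bracketed form in that order; leaving out `roman` drops the brackets.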
Standard infobox parameters
See #Sample infobox templates (below) for parameters used in different infoboxes. Assuming the parameter names used in {{infobox settlement}} will prevail, and that official names, native names and other names can all have languages and may all have Romanized forms, the parameters could be
Alternative 1: Explicit
|name =
|official_name =
|official_name_lang =
|official_name_roman =
<!-- Use |official_name2 = |official_name_lang2 = |official_name_roman2 = etc. for additional names, up to five -->
|native_name =
|native_name_lang =
|native_name_roman =
<!-- Use |native_name2 = |native_name_lang2 = |native_name_roman2 = etc. for additional names, up to five -->
|former_name =
|former_name_lang =
|former_name_roman =
<!-- Use |former_name2 = |former_name_lang2 = |former_name_roman2 = etc. for additional names, up to five -->
|other_name =
|other_name_lang =
|other_name_roman =
<!-- Use |other_name2 = |other_name_lang2 = |other_name_roman2 = etc. for additional names, up to five -->
|nickname =
Alternative 2: Templated
|name =
|official_name = <!-- {{lang2|<language>|<name>|<roman form>}} or
{{lang2 list |lang1=<language>|name1=<name> |roman1=<roman form> |lang2=<language>|name2=<name> |roman2=<roman form> ... }} -->
|native_name = <!-- {{lang2|<language>|<name>|<roman form>}} or
{{lang2 list |lang1=<language>|name1=<name> |roman1=<roman form> |lang2=<language>|name2=<name> |roman2=<roman form> ... }} -->
|former_name = <!-- {{lang2|<language>|<name>|<roman form>}} or
{{lang2 list |lang1=<language>|name1=<name> |roman1=<roman form> |lang2=<language>|name2=<name> |roman2=<roman form> ... }} -->
|other_name = <!-- {{lang2|<language>|<name>|<roman form>}} or
{{lang2 list |lang1=<language>|name1=<name> |roman1=<roman form> |lang2=<language>|name2=<name> |roman2=<roman form> ... }} -->
|nickname =
Comparison of alternatives
In both alternatives the editor must enter the same information:
- |official_name = name
- |official_name_lang = language
- |official_name_roman = roman form
or
- |official_name = {{lang2|language | name | roman form}}
The first format is probably slightly easier for novice editors, who may be put off by the curly brackets and vertical bars in the second form. Articles about major geographical entities like Cairo, Brahmaputra River or Mount Everest attract seasoned editors who can deal with formatting issues. But the majority of geographical articles are stubs like Orto, Corse-du-Sud, Maquan River or Klinkit Creek Peak, where the editors may find even a simple infobox a bit of a challenge.
The first form also makes it easier to ensure that languages are rendered correctly, since the {{infobox geonames}} template can see and validate all the parameters, for example checking for unusual characters in a name such as ":" or "(" that may indicate attempts to pre-format them. With the second approach {{infobox geonames}} can only see the result rendered by {{lang2}}, and cannot be sure that only the correct formatting template has been used. This essay therefore recommends the first, explicit alternative.
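To illustrate the kind of checking the explicit alternative allows, here is a rough Python sketch that gathers the numbered parameters for one name type and flags suspicious values. The function name `collect_names` and the wording of the problem messages are hypothetical; the parameter naming and the limit of five follow Alternative 1.

```python
SUSPECT_CHARS = (":", "(")  # may indicate an attempt to pre-format a name
MAX_NAMES = 5               # Alternative 1 allows up to five names per type


def collect_names(params, kind):
    """Gather (name, lang, roman) triples for one name type.

    `kind` is a base parameter such as "official_name". Returns the
    triples plus a list of human-readable problems.
    """
    entries, problems = [], []
    for i in range(1, MAX_NAMES + 1):
        suffix = "" if i == 1 else str(i)
        name = params.get(f"{kind}{suffix}", "").strip()
        lang = params.get(f"{kind}_lang{suffix}", "").strip()
        roman = params.get(f"{kind}_roman{suffix}", "").strip()
        if not name:
            if lang or roman:
                problems.append(f"{kind}{suffix}: language or roman form without a name")
            continue
        found = [c for c in SUSPECT_CHARS if c in name]
        if found:
            problems.append(f"{kind}{suffix}: unusual character(s) {found} in name")
        # Flagged names are still kept so they can be displayed as entered.
        entries.append((name, lang, roman))
    return entries, problems
```

A value such as `German: München` in `official_name2` would be returned as an entry and also reported as a problem, because the colon suggests the editor has typed the language label by hand.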
Rendered layout
See #Current usage examples for the various ways in which geographical infoboxes render name information. There is no reason why they should be so inconsistent. The obvious way to standardize collection, validation and rendering of name data is to use a child infobox that can be shared by all the geographical entity infoboxes. To demonstrate, {{Infobox geonames parent}} embeds child {{infobox geonames}}, which formats the names. This is just a crude mock-up of the alternative 2 format, with no real validation and formatting, but illustrates the concept. The code at the left (or below on a phone) renders the result at the right.
Rendered result (mock-up infobox): Other data | Specialized information about the geographical entity
{{Infobox geonames parent
|name = Article name
|native_name = Native name or names
|official_name = List of official names
|former_name = Former names
|other_name = Other names
|nickname = Nicknames
|image = File:Przełęcz Karkonoska - panorama.jpg
|otherdata = Specialized information about the geographical entity
}}
This is a rough first cut. The format rendered by {{infobox geonames}} should be carefully reviewed and adjusted. Logic must be added to validate the languages and ensure that names, languages, non-Latin scripts and lists of names are formatted correctly, and titles must be pluralized as needed. But once this is done, the standard validations and formatting will then be picked up automatically by all geo-infoboxes that embed {{infobox geonames}}.
General migration approach
{{lang}}, {{native name}} etc. should be enhanced to support language names as an alternative to language codes, and to support romanized or phonetic forms. This can be done at any time, and will have no impact on existing articles.
Migration to a more standard way of collecting, validating and formatting names can be done infobox by infobox.
- Every effort should be made to minimize disruption.
- A geo-infobox change that introduces red error messages in the text of many articles where there were no error messages before is unacceptable.
- The preferred approach is to flag issues using a hidden tracking category, and allow gnomes to work through the flagged formatting, replacing it with the new standard. Once almost all the non-standard formatting has been eliminated, the geo-infobox may start to render red error messages.
Two types of change may be introduced independently:
- The geo-infobox is changed to use the new {{infobox geonames}}
- The geo-infobox is changed to eliminate non-standard parameter names
Converting to {{infobox geonames}}
- The first step for each geo-infobox is to obtain agreement on its talk page and associated project talk page to migrate to the standard {{infobox geonames}}
- A version of the geo-infobox using {{infobox geonames}} is prepared and carefully tested
- This version will use the standard parameter names, but will also accept variants to provide backward compatibility
- Assuming no problems, the standardized geo-infobox template will be cut into production, passing "mode=transition" to {{infobox geonames}}. In this mode, {{infobox geonames}} will add articles with problems to tracking categories but will still attempt to format the data provided, and will not generate red error messages.
- Once the tracking categories have mostly been cleared, the geo-infobox will start passing "mode=strict" to {{infobox geonames}}. In this mode, {{infobox geonames}} will generate red error messages.
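The difference between the two modes can be sketched in Python as below. The function name `report_problems`, the tracking category name and the `error` CSS class are assumptions for illustration, as is the choice to keep populating the tracking category in strict mode.

```python
from html import escape

# Hypothetical category name, used only for this sketch.
TRACKING_CATEGORY = "Category:Geonames infobox errors"


def report_problems(problems, mode="transition"):
    """Return (visible_html, tracking_categories) for a list of problems."""
    if mode not in ("transition", "strict"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Both modes file the article in a hidden tracking category.
    categories = [TRACKING_CATEGORY] if problems else []
    if mode == "strict":
        # Strict mode also shows the errors to the reader.
        visible = "".join(
            '<span class="error">{}</span>'.format(escape(p)) for p in problems
        )
    else:
        # Transition mode stays silent in the rendered article.
        visible = ""
    return visible, categories
```

In transition mode the article text is unchanged and only the category list is populated, which is what lets gnomes clear the backlog before any red messages appear.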
Standardizing parameter names
In the long run, it will be easier for editors if all geo-infoboxes use the same names for the same parameters.
- The geo-infobox passes {{infobox geonames}} parameters with the standard names, but also passes the old parameter names:
|other_name={{{other_name|{{{name_other|}}} }}}
- The documentation is changed to show both parameter names:
|other_name= <!-- or |name_other = -->
- At some point, the old name is deprecated, with articles that use it put into maintenance categories
- Gnomes work through changing to the standard parameter names
- Eventually the old parameter names are dropped, and flagged as errors when the article is in edit mode
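The fallback from old to standard parameter names can be pictured with a short Python sketch. The alias table is a small sample drawn from the parameter names listed in the appendix; the function name `standardize` and the decision to let the standard name win when both are supplied are assumptions that mirror the wikitext fallback shown above.

```python
# Old parameter name -> standard parameter name (sample only).
ALIASES = {
    "name_other": "other_name",
    "name_official": "official_name",
    "nativename": "native_name",
}


def standardize(params, aliases=ALIASES):
    """Return (params under standard names, old names that were used)."""
    result = {k: v for k, v in params.items() if k not in aliases}
    deprecated = []
    for old, new in aliases.items():
        if old in params:
            deprecated.append(old)
            # The standard name wins if the article supplies both.
            result.setdefault(new, params[old])
    return result, deprecated
```

The second return value lists the old names found in the article, which is the information a maintenance category would need once the old names are deprecated.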
Providing support for the standard parameter names is important. Removing variant usage is less important, and should not be allowed to get in the way of the main thrust to standardize name validation and formatting.
Appendices
Sample infobox templates
See Category:Place infobox templates for the complete set.
Type | Template | Example | Count[a] | Parameters |
---|---|---|---|---|
Divisions | ||||
Continent | {{Infobox continent}} | Africa | 56 | title |
Island | {{Infobox islands}} | Borneo | 8,317 | name, native_name (or local_name), native_name_link[b], native_name_lang, sobriquet (or nickname), etymology |
Country | {{Infobox country}} | Albania | 5,769 | name, conventional_long_name, common_name, native_name, linking_name |
Settlement | {{Infobox settlement}} | Brussels | 543,470 | name, official_name, other_name, native_name, native_name_lang, etymology, nickname |
Structures | ||||
Airport | {{Infobox airport}} | Frankfurt Airport | 15,543 | name, nativename, nativename-a (non-western characters), nativename-r (Romanized) |
Amusement park | {{Infobox amusement park}} | Epcot | 1,027 | name, previous_names |
Ancient site | {{Infobox ancient site}} | Nineveh | 4,653 | name, native_name, native_name_lang, alternate_name |
Bridge | {{Infobox bridge}} | Band-e Kaisar | 5,684 | name, native_name, native_name_lang, official_name, other_name, named_for |
Building | {{Infobox building}} | Palace of Versailles | 24,502 | name, native_name, native_name_lang, former_names, alternate_names, etymology |
Cemetery | {{Infobox cemetery}} | Glasnevin Cemetery | 1,416 | name, native_name, native_name_lang |
Church | {{Infobox church}} | Durham Cathedral | 13,394 | name, fullname, other name, native_name, native_name_lang, former name |
Dam | {{Infobox dam}} | Red Bluff Diversion Dam | 4,159 | name, name_official |
Dzong | {{Infobox Tibetan Buddhist monastery}} | Potala Palace | 286 | name + language specifics[c] |
Hindu temple | {{Infobox Hindu temple}} | Meenakshi Temple, Madurai | 2,274 | name, native_name, native_name_lang |
Historic site | {{Infobox historic site}} | Diocletian's Palace | 10,063 | name, native_name, native_language, native_name2, native_language2, native_name3, native_language3, other_name, etymology |
Power station | {{Infobox power station}} | Ekibastuz GRES-2 Power Station | 2,852 | name, name_official |
Natural geography | ||||
Mountain | {{Infobox mountain}} | Central Eastern Alps | 26,448 | name, other_name, etymology, nickname, native_name, native_name_lang, translation, pronunciation, authority |
Body of water | {{Infobox body of water}} | Lake Sevan | 17,050 | name, native_name, other_name |
River | {{Infobox river}} | Nile | 28,870 | name, native_name, name_other, name_etymology, nickname |
Canal | {{Infobox canal}} | Royal Canal | 584 | name |
Glacier | {{Infobox glacier}} | Vatnajökull | 1,622 | name, other_name |
Landform | {{Infobox landform}} | Pongo de Manseriche | 1,147 | name, other_name |
Mountain pass | {{Infobox mountain pass}} | Khunjerab Pass | 1,303 | name, other_name |
Stratigraphic unit | {{Infobox rockunit}} | Burgess Shale | 6,326 | name |
Valley | {{Infobox valley}} | Alay Valley | 737 | name, other_name, native_name, translation |
Waterfall | {{Infobox waterfall}} | Angel Falls | 1,345 | name |
Ecology, parks etc. | ||||
Ecoregion | {{Infobox ecoregion}} | Alto Paraná Atlantic forests | 919 | name |
Park | {{Infobox park}} | Park Güell | 6,693 | name, alt_name, native_name, native_name_lang |
Protected area | {{Infobox protected area}} | Gran Paradiso National Park | 13,312 | name, alt_name |
Site of Special Scientific Interest | {{Infobox Site of Special Scientific Interest}} | Lundy | 2,052 | name |
Trail | {{Infobox hiking trail}} | The Ridgeway | 1,164 | name |
World Heritage Site | {{Infobox UNESCO World Heritage Site}} | Park Güell | 1,587 | WHS, Official_name |
Zoo | {{Infobox zoo}} | Baghdad Zoo | 1,229 | name |
Miscellaneous not reviewed:
- {{Infobox attraction}} 690
- {{Infobox border}} 61
- {{Infobox campground}} 102
- {{Infobox cave}} 786
- {{Infobox climbing area}} 29
- {{Infobox climbing route}} 53
- {{Infobox cycling path}} 173
- {{Infobox dive site}} 18
- {{Infobox farm}} 57
- {{Infobox fictional location}} 342
- {{Infobox forest}} 341
- {{Infobox political division}} 74
- {{Infobox port}} 778
- {{Infobox port-of-entry}} 179
- {{Infobox property development}} 74
- {{Infobox seamount}} 201
- {{Infobox ski area}} 807
- {{Infobox ski jumping hill}} 110
- {{Infobox spring}} 253
- {{Infobox terrestrial impact site}} 152
- {{Infobox urban feature}} 201
- {{Infobox waterlock}} 151
- {{Infobox water park}} 142
Not checked:
- Category:Place infobox templates by country (33 C)
- Category:Buildings and structures infobox templates (5 C, 67 P)
- Category:Constituency infobox templates (22 P)
- Category:Country infobox templates (10 P)
- Category:Country subdivision infobox templates (5 C, 3 P)
- Category:IUCN Protected Area infobox templates (3 P)
- Category:Templates calling Infobox settlement (26 P)
Current usage examples
The examples below are taken from articles as of February 2022, with the infoboxes edited to remove information other than names, and to show a standard image. They illustrate the varied visual styles and approaches to presenting names, partly imposed by the infobox templates, and partly chosen by the editors.
Island
Country
Republic of Albania Republika e Shqipërisë (Albanian) | |
---|---|
Albania (
Settlement
Brussels | |
---|---|
Nicknames: Capital of Europe, Comic City | |
Brussels (
Airport
Frankfurt Airport Flughafen Frankfurt Main | |
---|---|
Summary |
Frankfurt Airport (
Ancient site
نَيْنَوَىٰ | |
Nineveh (
Bridge
Band-e Kaisar بند قیصر, | |
---|---|
Other name(s) | Pol-e Kaisar, Bridge of Valerian, Shadirwan |
The Band-e Kaisar (
Building
Palace of Versailles | |
---|---|
Château de Versailles (French) | |
The Palace of Versailles (
Historic site
Historical Complex of Split with the Palace of Diocletian | |
---|---|
Native name Croatian: Povijesna jezgra grada Splita s Dioklecijanovom palačom | |
Diocletian's Palace (Croatian: Dioklecijanova palača, pronounced [diɔklɛt͡sijǎːnɔʋa pǎlat͡ʃa]) is an ancient palace built for the Roman emperor Diocletian at the turn of the fourth century AD, which today forms about half the old town of Split, Croatia. While it is referred to as a "palace" because of its intended use as the retirement residence of Diocletian, the term can be misleading as the structure is massive and more resembles a large fortress: about half of it was for Diocletian's personal use, and the rest housed the military garrison.
Mountain
Central Eastern Alps | |
---|---|
The Central Eastern Alps (German: Zentralalpen or Zentrale Ostalpen), also referred to as Austrian Central Alps (German: Österreichische Zentralalpen) or just Central Alps, comprise the main chain of the Eastern Alps in Austria and the adjacent regions of Switzerland, Liechtenstein, Italy and Slovenia. South of them are the Southern Limestone Alps.
Body of water
River
Nile | |
---|---|
The Nile is a major north-flowing
Valley
Alay Valley | |
---|---|
Naming | |
Native name | Алай өрөөнү (Kyrgyz) |
The Alay Valley (
Notes
- ^ Transclusion count as of February 2022
- ^ link to the article about the language used for the native name
- ^ Infobox Tibetan Buddhist monastery collects the following parameters for native name: |t=ཇོ་ཁང་ |w=Jo-khang |to = {{{to}}} |ipa={{IPA|{{{ipa}}}}} |z={{{z}}} |thdl=thdl |e={{{e}}} |tc=大昭寺 |s={{{s}}} |p=Dàzhāosì