Wiktionary
thwiktionary
https://th.wiktionary.org/wiki/%E0%B8%A7%E0%B8%B4%E0%B8%81%E0%B8%B4%E0%B8%9E%E0%B8%88%E0%B8%99%E0%B8%B2%E0%B8%99%E0%B8%B8%E0%B8%81%E0%B8%A3%E0%B8%A1:%E0%B8%AB%E0%B8%99%E0%B9%89%E0%B8%B2%E0%B8%AB%E0%B8%A5%E0%B8%B1%E0%B8%81
MediaWiki 1.46.0-wmf.24
case-sensitive
สื่อ
พิเศษ
พูดคุย
ผู้ใช้
คุยกับผู้ใช้
วิกิพจนานุกรม
คุยเรื่องวิกิพจนานุกรม
ไฟล์
คุยเรื่องไฟล์
มีเดียวิกิ
คุยเรื่องมีเดียวิกิ
แม่แบบ
คุยเรื่องแม่แบบ
วิธีใช้
คุยเรื่องวิธีใช้
หมวดหมู่
คุยเรื่องหมวดหมู่
ภาคผนวก
คุยเรื่องภาคผนวก
ดัชนี
คุยเรื่องดัชนี
สัมผัส
คุยเรื่องสัมผัส
อรรถาภิธาน
คุยเรื่องอรรถาภิธาน
TimedText
TimedText talk
มอดูล
คุยเรื่องมอดูล
Event
Event talk
ເອົາ
0
5652
5720724
2189812
2026-04-21T04:48:00Z
Ai Ku Karng
17824
/* ภาษาลาว */
5720724
wikitext
text/x-wiki
{{also/auto}}
== ภาษาลาว ==
=== รากศัพท์ ===
{{inh+|lo|tai-pro|*ʔawᴬ}}; ร่วมเชื้อสายกับ{{cog|th|เอา}}, {{cog|tts|เอา}}, {{cog|nod|ᩐᩣ}}, {{cog|kkh|ᩐᩢᩣ}}, {{cog|khb|ᦀᧁ}}, {{cog|blt|ꪹꪮꪱ}}, {{cog|twh|ꪹꪮꪱ}}, {{cog|shn|ဢဝ်}}, {{cog|aho|𑜒𑜧}} หรือ {{m|aho|𑜒𑜧𑜈𑜫}} หรือ {{m|aho|𑜒𑜨𑜧}}, {{cog|za|aeu}}, {{cog|tdd|ᥟᥝ}}
=== การออกเสียง ===
{{lo-pron}}
=== คำกริยา ===
{{lo-verb}}
# {{lb|lo|สกรรม}} [[เอา]]
==== ลูกคำ ====
{{col4|lo
|ເອົາການ
|ເອົາການເອົາງານ
|ເອົາງານ
|ເອົາຈິງເອົາຈັງ
|ເອົາໃຈໃສ່
|ເອົາປຽບ
|ເອົາຜົວ
|ເອົາເມຍ
|ເອົາເລື່ອງ
|ເອົາໜ້າ
}}
pxifdp81ge8ulzd2sq906qyjff23gck
city
0
10271
5720694
2188018
2026-04-21T01:37:58Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่\}\} +|นคร}})
5720694
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
{{wp|lang=en}}
=== รูปแบบอื่น ===
* {{alter|en|citie|cittie|cyte|cytee||เลิกใช้}}
=== รากศัพท์ ===
{{inh+|en|enm|city}}, {{m|enm|citie}}, {{m|enm|citee}}, {{m|enm|cite}}, จาก{{der|en|fro|cité}}, จาก{{der|en|la|cīvitās|t=[[citizenry]]; [[community]]; a city with its [[hinterland]]}}, จาก{{m|la|cīvis|t=[[native]]; [[townsman]]; [[citizen]]}}, จาก{{der|en|ine-pro|*ḱey-|t=to lie down, settle; home, family; love; beloved}}
ร่วมเชื้อสายกับ{{cog|ang|hīwan|g=p|t=members of one's household, servants}}; ดูเพิ่มที่ {{m|en|hewe}}; {{doublet|en|civitas}}
เข้าแทนที่คำพื้นถิ่น {{noncog|enm|burgh}}, {{m|enm|borough|t=[[fortified]] [[town]]; [[incorporated]] city}} และ {{m|enm|sted}}, {{m|enm|stede|t=[[place]], [[stead]]; city}}
=== การออกเสียง ===
[[ไฟล์:Empire State Building Aerial.JPG|thumb|Part of New York City, a large '''city''' with many tall buildings.]]
[[ไฟล์:Aerial view of Wells.jpg|thumb|Despite its small size, Wells is a '''city''' because of its cathedral.]]
* {{IPA|en|/ˈsɪti/}}
* {{IPA|en|/ˈsɪtɪ/|a=Northern England}}
* {{IPA|en|/ˈsɪɾi/|a=US}}
* {{audio|en|en-us-city.ogg|a=US}}
* {{audio|en|En-uk-city.ogg|a=UK}}
* {{audio|en|EN-AU ck1 city.ogg|a=AU}}
* {{rhymes|en|ɪti}}
* {{hyphenation|en|ci|ty}}
=== คำนาม ===
{{en-noun}}
# [[เมือง]][[ใหญ่]], [[นคร]]
#: {{ux|en|São Paulo is the largest '''city''' in South America.}}
#* {{RQ:Ferguson Zollenstein|IV}}
#*: So this was my future home, I thought!{{...}}Backed by towering hills, the but faintly discernible purple line of the French boundary off to the southwest, a sky of palest Gobelin flecked with fat, fleecy little clouds, it in truth looked a dear little '''city'''; the '''city''' of one's dreams.
#* {{quote-journal|en|date=2014-06-14|volume=411|issue=8891|magazine={{w|The Economist}}
| title=[http://www.economist.com/news/science-and-technology/21604091-it-possible-sniff-out-problems-sewer-pipes-they-happen-its-gas It's a gas]
| passage=One of the hidden glories of Victorian engineering is proper drains. Isolating a '''city'''’s effluent and shipping it away in underground sewers has probably saved more lives than any medical procedure except vaccination.}}
#* {{quote-journal|en|date=2020 July 15|author=Mike Brown talks to Paul Clifton|title=Leading London's "hidden heroes"|journal=Rail|page=42|text=All our stations have changed. We have to constrain numbers. We have to mandate face coverings. These are massive changes in what is a public transport '''city'''. This is not a car '''city'''.}}
# {{label|en|บริเตน}} A settlement granted special status by royal charter or [[letters patent]]; traditionally, a settlement with a [[cathedral]] regardless of size.
#* '''1976''', Cornelius P. Darcy, ''The Encouragement of the Fine Arts in Lancashire, 1760-1860'', Manchester University Press ({{ISBN|9780719013300}}), page 20
#*: Manchester, incorporated in 1838, was made the centre of a bishopric in 1847 and became a '''city''' in 1853. Liverpool was transformed into a '''city''' by Royal Charter when the new diocese of Liverpool was created in 1880.
#* '''2014''', Graham Rutt, ''Cycling Britain's Cathedrals Volume 1'', Lulu.com ({{ISBN|9781326056049}}), page 307
#*: St Davids itself is the smallest '''city''' in Great Britain, with a population of less than 2,000.
# {{lb|en|ออสเตรเลีย}} [[เขต]][[ศูนย์กลาง]][[ธุรกิจ]], [[ตัว]]เมือง, [[ใน]]เมือง
#: {{ux|en|I'm going into the '''city''' today to do some shopping.}}
# {{lb|en|สแลง}} [[ปริมาณ]][[มหาศาล]] {{q|ใช้หลังคำนาม}}
#: {{ux|en|It's video game '''city''' in here!}}
==== คำจ่ากลุ่ม ====
* {{l|en|settlement}}
==== ลูกคำ ====
{{col4|en
| cathedral city
| cidiot
| citify
| citizen
| city and county
| city banker
| city block
| city boy
| city center
| city centre
| city clerk
| city desk
| city district
| city father
| city gent
| city girl
| city hall
| [[city limit]](s)
| city line
| city man
| city manager
| city map
| city planning
| city room
| city slicker
| city-state
| cityite
| cityscape
| citywide
| cross-city
| freedom of the city
| free of the city
| garden city
| Hanseatic city
| holy city
| host city
| [[inner city]], [[inner-city]]
| megacity
| sister city
| the city
| twin city
}}
{{col4|en|title=สถานที่ที่ลงท้ายด้วย ''City''
| Archer City
| Arkansas City
| Ashland City
| Atlantic City
| Bay City
| Beaver City
| Belize City
| Center City
| Charles City
| Columbia City
| Cross City
| Dade City
| Dakota City
| Dodge City
| Forrest City
| Garden City
| Granite City
| Hill City
| Ivy City
| Jefferson City
| Jersey City
| Johnson City
| Junction City
| Kansas City
| Lake City
| Long Island City
| Loup City
| Mexico City
| Nebraska City
| Ness City
| New York City
| Oklahoma City
| Panama City
| Pawnee City
| Pine City
| Quebec City
| Quezon City
| Rapid City
| Redwood City
| Reed City
| Rio Grande City
| Rogers City
| Sac City
| Salt Lake City
| Sioux City
| Surf City
| Tawas City
| Traverse City
| Tunnel City
| Union City
| Valley City
| Vatican City
| White City
| Yazoo City
| Yuba City
}}
==== คำเกี่ยวข้อง ====
* {{l|en|civic}}
* {{l|en|civil}}
==== คำสืบทอด ====
* {{desc|fr|City|bor=1}}
* {{desc|de|City|bor=1}}
* {{desc|it|city|bor=1}}
* {{desc|sv|city|bor=1}}
=== ดูเพิ่ม ===
* {{l|en|metropolis}}
* {{l|en|megalopolis}}
* {{l|en|megacity}}
* {{l|en|multicity}}
=== แหล่งข้อมูลอื่น ===
* {{R:Keywords|page=55}}
=== คำสลับอักษร ===
* {{anagrams|en|a=city|ICTY}}
{{topics|en|นคร}}
pqv3z7l1nqqnukaycvup5yywxcsis6r
เชียงใหม่
0
25817
5720690
2032418
2026-04-21T01:34:01Z
OctraBot
3198
5720690
wikitext
text/x-wiki
== ภาษาไทย ==
{{วิกิพีเดีย|จังหวัดเชียงใหม่}}
{{วิกิพีเดีย|เทศบาลนครเชียงใหม่}}
{{วิกิพีเดีย|มหาวิทยาลัยเชียงใหม่}}
[[ไฟล์:Amphoe Chiang Mai.svg|thumb|right|150px|เชียงใหม่]]
=== รากศัพท์ ===
{{คำประสม|th|เชียง|ใหม่}}
=== การออกเสียง ===
{{th-pron|เชียง-ไหฺม่}}
=== คำวิสามานยนาม ===
{{th-proper noun}}
# {{lang|th|([[จังหวัด]]~)}} ชื่อ[[จังหวัด]]ใน[[ภาค]][[เหนือ]]ของ[[ประเทศไทย]]
#: {{syn|th|ชม|q1=อักษรย่อ}} <!-- undecided former names พิงค์|นพบุรี|นพีสี -->
# ชื่อ[[เทศบาลนคร]]ในจังหวัดเชียงใหม่
# [[ชื่อ]][[มหาวิทยาลัย]][[ใน]][[กำกับ]]ของ[[รัฐ]] [[แห่ง]][[หนึ่ง]]ใน[[ประเทศไทย]]
==== คำแปลภาษาอื่น ====
{{trans-top|ชื่อจังหวัด}}
* ไทลื้อ: {{t+|khb|ᦵᦋᧂᦺᦖᧈ}}
* ไทใหญ่: {{t+|shn|ၵဵင်းမႆႇ}}, {{t+|shn|ၵဵင်းမႂ်ႇ}}
* ลาว: {{t+|lo|ຊຽງໃໝ່}}
* อังกฤษ: {{t+|en|Chiang Mai}}
{{trans-bottom}}
{{topics|th|เชียงใหม่|นครในไทย}}
afgegm2hy4ifdtdke6cis3zw09r1d6k
นนทบุรี
0
25852
5720692
1871885
2026-04-21T01:34:20Z
OctraBot
3198
/* คำแปลภาษาอื่น */
5720692
wikitext
text/x-wiki
== ภาษาไทย ==
{{วิกิพีเดีย|จังหวัดนนทบุรี}}
{{วิกิพีเดีย|เทศบาลนครนนทบุรี}}
[[ไฟล์:Amphoe Nonthaburi.png|thumb|150px|right|นนทบุรี]]
=== รากศัพท์ ===
{{com|th|นนท|บุรี}}
=== การออกเสียง ===
{{th-pron|นน-ทะ-บุ-รี|นน-บุ-รี}}
=== คำวิสามานยนาม ===
{{th-proper noun}}
# {{lang|th|([[จังหวัด]]~)}} ชื่อ[[จังหวัด]]ใน[[ภาคกลาง]]ของ[[ประเทศไทย]]
#: {{syn|th|นบ|q1=อักษรย่อ|นนท์|q2=ภาษาปาก}}
# ชื่อ[[เทศบาลนคร]]ในจังหวัดนนทบุรี
==== คำแปลภาษาอื่น ====
{{trans-top|ชื่อจังหวัด}}
* รัสเซีย: {{t+|ru|Нонтхабури}}
* ลาว: {{t|lo|ນົນທະບຸລີ}}
* อังกฤษ: {{t+|en|Nonthaburi}}
{{trans-bottom}}
{{topics|th|จังหวัดในไทย|นครในไทย}}
5lcdbk0efwjefc2xhrbw0lytsjcb82k
ประจวบคีรีขันธ์
0
25876
5720691
5644801
2026-04-21T01:34:09Z
OctraBot
3198
/* คำแปลภาษาอื่น */
5720691
wikitext
text/x-wiki
== ภาษาไทย ==
{{wp|จังหวัด+}}
[[ไฟล์:Amphoe Prachuap Khiri Khan.png|thumb|150px|right|ประจวบคีรีขันธ์]]
=== รากศัพท์ ===
{{คำประสม|th|ประจวบ|คีรี|ขันธ์}}
=== การออกเสียง ===
{{th-pron|ปฺระ-จวบ-คี-รี-ขัน}}
=== คำวิสามานยนาม ===
{{th-proper noun}}
# {{lang|th|([[จังหวัด]]~)}} ชื่อ[[จังหวัด]]ใน[[ภาคตะวันตก]]ของ[[ประเทศไทย]]
#: {{syn|th|ปข|q1=อักษรย่อ|ประจวบ|q2=ภาษาปาก}}
# ชื่อ[[อำเภอ]]เมืองในจังหวัดประจวบคีรีขันธ์
# ชื่อ[[เทศบาลเมือง]]ในจังหวัดประจวบคีรีขันธ์
==== คำเกี่ยวข้อง ====
* [[ปัจจันตคิรีเขตร]]
==== คำแปลภาษาอื่น ====
{{trans-top|ชื่อจังหวัด}}
* จีน: {{t|zh|班武里府}}, {{t|zh|巴蜀府}}
* พม่า: {{t|my|ပရာချွတ်ခီရိခန်း}}
* อังกฤษ: {{t|en|Prachuap Khiri Khan}}
{{trans-bottom}}
{{topics|th|เมือง|จังหวัดในไทย|นครในไทย}}
pm8rzlzdifmu9kfilevcb9t1cv9zptf
levrette
0
27342
5720713
1118285
2026-04-21T02:11:32Z
OctraBot
3198
/* ภาษาฝรั่งเศส */ เก็บกวาด
5720713
wikitext
text/x-wiki
== ภาษาฝรั่งเศส ==
=== รากศัพท์ ===
จาก{{suffix|fr|lévrier|ette|id2=female}}.
=== การออกเสียง ===
* {{fr-IPA}}
* {{audio|fr|LL-Q150 (fra)-Mecanautes-levrette.wav|a=France}}
** {{audio|fr|LL-Q150 (fra)-Lepticed7-levrette.wav|a=<<France>> (<<Toulouse>>)}}
** {{audio|fr|LL-Q150 (fra)-Poslovitch-levrette.wav|a=<<France>> (<<Vosges>>)}}
* {{rhymes|fr|ɛt|s=2}}
=== คำนาม ===
{{fr-noun|f}}
# [[สุนัข]][[เกรย์ฮาวด์]][[ตัวเมีย]]
# {{lb|fr|slang}} [[ท่าหมา]]
#: {{uxi|fr|Moi, j'aime la '''levrette'''.|I like it '''doggy style'''.}}
#* {{RQ:Despentes King Kong|page=90|chapter=Porno sorcières|text=Censure et interdiction sont réclamées à cor et à cri par des militants effarés, comme si leur vie en dépendait. Cette attitude est objectivement surprenante : est-ce qu'une '''levrette''' en gros plan menace la sûreté de l'État ?}}
==== ลูกคำ ====
* {{l|fr|leuleu}}
==== คำสืบทอด ====
* {{desc|ro|levretă|bor=1}}
=== อ่านเพิ่ม ===
* {{R:fr:TLFi}}
{{C|fr|หมา|Sighthounds|Sex positions}}
c4rijsd46bzzyhnvrlnyief23ksg8jv
5720714
5720713
2026-04-21T02:11:54Z
OctraBot
3198
/* ภาษาฝรั่งเศส */
5720714
wikitext
text/x-wiki
== ภาษาฝรั่งเศส ==
=== รากศัพท์ ===
จาก{{suffix|fr|lévrier|ette|id2=female}}
=== การออกเสียง ===
* {{fr-IPA}}
* {{audio|fr|LL-Q150 (fra)-Mecanautes-levrette.wav|a=France}}
** {{audio|fr|LL-Q150 (fra)-Lepticed7-levrette.wav|a=<<France>> (<<Toulouse>>)}}
** {{audio|fr|LL-Q150 (fra)-Poslovitch-levrette.wav|a=<<France>> (<<Vosges>>)}}
* {{rhymes|fr|ɛt|s=2}}
=== คำนาม ===
{{fr-noun|f}}
# [[สุนัข]][[เกรย์ฮาวด์]][[ตัวเมีย]]
# {{lb|fr|slang}} [[ท่าหมา]]
#: {{uxi|fr|Moi, j'aime la '''levrette'''.|I like it '''doggy style'''.}}
#* {{RQ:Despentes King Kong|page=90|chapter=Porno sorcières|text=Censure et interdiction sont réclamées à cor et à cri par des militants effarés, comme si leur vie en dépendait. Cette attitude est objectivement surprenante : est-ce qu'une '''levrette''' en gros plan menace la sûreté de l'État ?}}
==== ลูกคำ ====
* {{l|fr|leuleu}}
==== คำสืบทอด ====
* {{desc|ro|levretă|bor=1}}
=== อ่านเพิ่ม ===
* {{R:fr:TLFi}}
{{C|fr|หมา|Sighthounds|Sex positions}}
ov4uxb7yhdpa9cgxwyv41de9a3po9wm
capital city
0
28316
5720695
5677150
2026-04-21T01:38:12Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่\}\} +|นคร}})
5720695
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
=== การออกเสียง ===
* {{IPA|en|/ˌkæpɪtəl ˈsɪti/}}
* {{audio|en|EN-AU ck1 capital city.ogg|a=AU}}
=== คำนาม ===
{{en-noun}}
# {{senseid|en|Q5119}} [[เมืองหลวง]], [[เมือง]]ที่เป็น[[ที่ตั้ง]]ของ[[รัฐบาล]]
# เมือง (หรือหลายเมือง) ที่มีขนาดใหญ่กว่าหรือมีความสำคัญต่อประเทศมากกว่าเมืองอื่น ๆ ทั้งหมด โดยไม่คำนึงถึงที่ตั้งของรัฐบาลที่แท้จริง (เช่น [[มอสโก]]และ[[เซนต์ปีเตอร์สเบิร์ก]] โดยไม่คำนึงถึงการย้ายที่ตั้งของรัฐบาล[[รัสเซีย]] หรือ[[อัมสเตอร์ดัม]] ถึงแม้ว่ารัฐบาล[[เนเธอร์แลนด์]]จะตั้งอยู่ที่[[เฮก]]ตั้งแต่ปี 1588)
# {{label|en|AU}} [[เขตมหานคร]]หลัก ๆ ของ[[ออสเตรเลีย]] ([[ซิดนีย์]] [[เมลเบิร์น]] [[บริสเบน]] [[เพิร์ท]] [[แอดิเลด]]) ซึ่งทั้งหมดเป็นเมืองหลวงของ[[รัฐ]] และโดยนัยเดียวกันก็รวมถึงเขตมหานครของประเทศอื่น ๆ ด้วย ในขณะที่เมืองหลวงของรัฐอื่นอย่างเช่น [[โฮบาร์ต]] [[ดาร์วิน]] และเมืองหลวงของประเทศ [[แคนเบอร์รา]] มักจะไม่นับรวม
==== คำพ้องความ ====
* {{qualifier|การละ}} [[capital]]
{{C|en|นคร}}
1uxv4oxusbprr7e644vd9ayzq8942hb
capital gain
0
28987
5720715
1332023
2026-04-21T02:15:21Z
OctraBot
3198
/* ภาษาอังกฤษ */ เก็บกวาด
5720715
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
=== การออกเสียง ===
* {{audio|en|LL-Q1860 (eng)-Vealhurl-capital gain.wav|a=Southern England}}
=== คำนาม ===
{{en-noun|~}}
# {{lb|en|economics|business|finance}} [[กำไรประเภททุน]]; การเพิ่มขึ้นของมูลค่าสินทรัพย์ประเภททุน; จำนวนที่มูลค่าหรือรายได้จากการขายสินทรัพย์ประเภททุนโดยเจ้าของเกินกว่าต้นทุนของเจ้าของ
#: {{ant|en|capital loss}}
#: {{rfquote-sense|en}}
6stokupp5u4u0vliq5kvuk0n3tfpiuk
5720717
5720715
2026-04-21T02:17:09Z
OctraBot
3198
/* คำนาม */
5720717
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
=== การออกเสียง ===
* {{audio|en|LL-Q1860 (eng)-Vealhurl-capital gain.wav|a=Southern England}}
=== คำนาม ===
{{en-noun|~}}
# {{lb|en|economics|business|finance}} [[กำไรประเภททุน]]; การเพิ่มขึ้นของมูลค่าสินทรัพย์ประเภททุน; จำนวนที่มูลค่าหรือรายได้จากการขายสินทรัพย์ประเภททุนโดยเจ้าของ เกินกว่าต้นทุนของเจ้าของ
#: {{ant|en|capital loss}}
#: {{rfquote-sense|en}}
0ey9orpnzvhsc2waa4x13p6luk60ycn
เที่ยว
0
31114
5720719
1885100
2026-04-21T03:43:19Z
Ai Ku Karng
17824
/* ภาษาไทย */
5720719
wikitext
text/x-wiki
{{also/auto}}
== ภาษาไทย ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|za|deuh|tr=แต่ว}}, {{cog|zzj|teuh|tr=แท่ว}}; สำหรับคำกริยา เทียบ{{cog|za|liuh|tr=ลิ่ว}}, {{cog|zzj|lieuh|tr=เลี่ยว}}, {{cog|lo|ທ່ຽວ}}
=== การออกเสียง ===
{{th-pron}}
=== คำนาม ===
{{th-noun}}
# ใช้เรียกการไปยังที่ซึ่ง[[กำหนด]]ไว้ครั้งหนึ่ง ๆ หรือ[[ไปกลับ]]รอบหนึ่ง ๆ
#: {{ux|th|เที่ยว[[ขึ้น]], เที่ยว[[ล่อง]], เที่ยวไป, เที่ยวกลับ}}
=== คำลักษณนาม ===
{{th-cls}}
# [[ลักษณนาม]]บอกอาการเช่นนั้น
#: {{ux|th|ไป ๒ เที่ยว}}
#: {{ux|th|มา ๓ เที่ยว}}
=== คำกริยา ===
{{th-verb}}
# [[กิริยา]]ที่ไปที่โน่นที่นี่[[เรื่อยไป]] มักใช้พูดประกอบกับกริยาอื่น
#: {{ux|th|เที่ยวหา, เที่ยวพูด, เที่ยวกิน, เที่ยวนอน}}
# ไปไหน ๆ เพื่อความ[[เพลิดเพลิน]]ตาม[[สบาย]]
#: {{ux|th|ไปเที่ยว, เดินเที่ยว, ท่องเที่ยว}}
# [[เตร็ดเตร่]]ไปเพื่อหาความ[[สนุก]]เพลิดเพลินตามที่ต่าง ๆ
#: {{ux|th|เที่ยวงานกาชาด}}
==== คำเกี่ยวข้อง ====
* [[ท่องเที่ยว]]
==== ดูเพิ่ม ====
* {{l|th|เทียว}}
j5qpgbjz1e8p3tmlwt8iuzyhtinqfth
เอา
0
32916
5720725
5030228
2026-04-21T04:48:42Z
Ai Ku Karng
17824
/* ภาษาไทย */
5720725
wikitext
text/x-wiki
{{also/auto}}
== ภาษาไทย ==
=== รากศัพท์ ===
{{inh+|th|tai-pro|*ʔawᴬ}}; ร่วมเชื้อสายกับ{{cog|tts|เอา}}, {{cog|lo|ເອົາ}}, {{cog|nod|ᩐᩣ}}, {{cog|kkh|ᩐᩢᩣ}}, {{cog|khb|ᦀᧁ}}, {{cog|blt|ꪹꪮꪱ}}, {{cog|shn|ဢဝ်}}, {{cog|aho|𑜒𑜧}} หรือ {{m|aho|𑜒𑜧𑜈𑜫}} หรือ {{m|aho|𑜒𑜨𑜧}}, {{cog|za|aeu}}, {{cog|tdd|ᥟᥝ}}
=== การออกเสียง ===
{{th-pron}}
=== คำกริยา ===
{{th-verb}}
# [[ยึด]]
#: {{ux|th|เอาไว้อยู่}}
# [[รับ]][[ไว้]]
#: {{ux|th|เขาให้ก็เอา}}
# [[พา]], [[นำ]]
#: {{ux|th|เอาตัวมา}}
# [[ต้องการ]]
#: {{ux|th|ทำเอาชื่อ}}
#: {{ux|th|ทำงานเอาหน้า}}
# [[ถือ]][[เป็น]][[สำคัญ]]
#: {{ux|th|เจรจาเอาถ้อยคำ}}
#: {{ux|th|เอาพี่เอาน้อง}}
# {{lb|th|ปาก}} [[คำ]]ใช้แทน[[กริยา]]อื่น ๆ บางคำได้
==== คำแปลภาษาอื่น ====
{{trans-top|พา, นำ}}
* คำเมือง: {{t+|nod|ᩐᩣ}}
* ไทดำ: {{t+|blt|ꪹꪮꪱ}}
* ไทใหญ่: {{t+|shn|ဢဝ်}}
* ลาว: {{t+|lo|ເອົາ}}
* อังกฤษ: {{t+|en|take|tr=เทค}}
{{trans-bottom}}
=== คำกริยาวิเศษณ์ ===
{{th-adv|-}}
# เมื่อใช้[[ลงท้าย]]กริยา เป็นการ[[เน้น]]กริยาแสดงถึงการ[[ตั้งหน้าตั้งตา]][[ทำ]][[ต่อเนื่อง]][[กัน]]
#: {{ux|th|กินเอา ๆ}}
== ภาษาคำเมือง ==
=== การออกเสียง ===
* {{IPA|nod|/ʔaw˧˧/|a=เชียงใหม่}}
=== คำกริยา ===
{{nod-verb}}
# {{lb|nod|สกรรม}} {{alternative form of|nod|ᩐᩣ}}
== ภาษาชอง ==
=== รากศัพท์ ===
{{inh+|cog|mkh-pro|*ʔaawʔ}}
=== การออกเสียง ===
* {{IPA|cog|/ʔaw/|a=จันทบุรี,ตราด,กาญจนบุรี}}
=== คำนาม ===
{{cog-noun}}
# [[เสื้อ]]
ghoxth4hdikrhgmbvnv5m8kwdru0jep
ທ່ອງທ່ຽວ
0
36283
5720721
1550623
2026-04-21T03:57:30Z
Ai Ku Karng
17824
/* ภาษาลาว */
5720721
wikitext
text/x-wiki
== ภาษาลาว ==
=== รากศัพท์ ===
{{com|lo|ທ່ອງ|ທ່ຽວ|t1=ท่อง|t2=เที่ยว}}; ร่วมเชื้อสายกับ{{cog|th|ท่องเที่ยว}}
=== การออกเสียง ===
{{lo-pron|ທ່ອງ-ທ່ຽວ}}
=== คำกริยา ===
{{lo-verb}}
# [[ท่องเที่ยว]]
opxsaderw42zcb5py7keqz8oteb2g2s
มอดูล:languages/data/exceptional
828
36360
5720769
5720544
2026-04-21T07:01:21Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars)
5720769
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
	u = require("Module:string utilities").char
	return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["aav-khs-pro"] = {
"คาเซียนดั้งเดิม",
116773216,
"aav-khs",
"Latn",
type = "reconstructed",
}
m["aav-nic-pro"] = {
"นิโคบารีสดั้งเดิม",
116773793,
"aav-nic",
"Latn",
type = "reconstructed",
}
m["aav-pkl-pro"] = {
"ปนัร-คาซี-ลึงงัมดั้งเดิม",
116773259,
"aav-pkl",
"Latn",
type = "reconstructed",
}
m["aav-pro"] = { -- mkh-pro will merge into this
"ออสโตรเอเชียติกดั้งเดิม",
116773186,
"aav",
"Latn",
type = "reconstructed",
}
m["afa-pro"] = {
"แอฟโฟรเอเชียติกดั้งเดิม",
269125,
"afa",
"Latn",
type = "reconstructed",
}
m["alg-aga"] = {
"Agawam",
nil,
"alg-eas",
"Latn",
}
m["alg-pro"] = {
"แอลกองเคียนดั้งเดิม", -- silent u
7251834,
"alg",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = "·"},
}
m["alv-ama"] = {
"Amasi",
4740400,
"nic-grs",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron},
}
m["alv-bgu"] = {
"Baïnounk Gubëeher",
17002646,
"alv-bny",
"Latn",
}
m["alv-bua-pro"] = {
"Proto-Bua",
116773723,
"alv-bua",
"Latn",
type = "reconstructed",
}
m["alv-cng-pro"] = {
"Proto-Cangin",
116773726,
"alv-cng",
"Latn",
type = "reconstructed",
}
m["alv-edo-pro"] = {
"Proto-Edoid",
116773206,
"alv-edo",
"Latn",
type = "reconstructed",
}
m["alv-fli-pro"] = {
"Proto-Fali",
116773754,
"alv-fli",
"Latn",
type = "reconstructed",
}
m["alv-gbe-pro"] = {
"กเบดั้งเดิม",
116773208,
"alv-gbe",
"Latn",
type = "reconstructed",
}
m["alv-gng-pro"] = {
"Proto-Guang",
116773757,
"alv-gng",
"Latn",
type = "reconstructed",
}
m["alv-gtm-pro"] = {
"Proto-Central Togo",
116773732,
"alv-gtm",
"Latn",
type = "reconstructed",
}
m["alv-gwa"] = {
"Gwara",
16945580,
"nic-pla",
"Latn",
}
m["alv-hei-pro"] = {
"Proto-Heiban",
116773760,
"alv-hei",
"Latn",
type = "reconstructed",
}
m["alv-ido-pro"] = {
"Proto-Idomoid",
116773764,
"alv-ido",
"Latn",
type = "reconstructed",
}
m["alv-igb-pro"] = {
"Proto-Igboid",
116773765,
"alv-igb",
"Latn",
type = "reconstructed",
}
m["alv-kwa-pro"] = {
"Proto-Kwa",
116773780,
"alv-kwa",
"Latn",
type = "reconstructed",
}
m["alv-mum-pro"] = {
"Proto-Mumuye",
116773791,
"alv-mum",
"Latn",
type = "reconstructed",
}
m["alv-nup-pro"] = {
"Proto-Nupoid",
116773795,
"alv-nup",
"Latn",
type = "reconstructed",
}
m["alv-pro"] = {
"แอตแลนติก-คองโกดั้งเดิม",
116732838,
"alv",
"Latn",
type = "reconstructed",
}
m["alv-edk-pro"] = {
"Proto-Edekiri",
nil,
"alv-edk",
"Latn",
type = "reconstructed",
}
m["alv-yor-pro"] = {
"โยรูบาดั้งเดิม",
nil,
"alv-yor",
"Latn",
type = "reconstructed",
}
m["alv-yrd-pro"] = {
"โยรูบอยด์ดั้งเดิม",
116773824,
"alv-yrd",
"Latn",
type = "reconstructed",
}
m["alv-von-pro"] = {
"วอลตา-ไนเจอร์ดั้งเดิม",
116773820,
"alv-von",
"Latn",
type = "reconstructed",
}
m["apa-pro"] = {
"Proto-Apachean",
116773135,
"apa",
"Latn",
type = "reconstructed",
}
m["aql-pro"] = {
"แอลจิกดั้งเดิม",
18389588,
"aql",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = "·"},
}
m["art-adu"] = {
"Adûni",
1232159,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-bel"] = {
	"Belter Creole",
	108055510,
	"art",
	"Latn",
	type = "appendix-constructed",
	sort_key = {
		remove_diacritics = c.acute,
		from = {"ɒ"},
		to = {"a"},
	},
}
m["art-blk"] = {
"Bolak",
2909283,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-bsp"] = {
"แบล็กสปีช",
686210,
"art",
"Latn, Teng",
type = "appendix-constructed",
}
m["art-com"] = {
"Communicationssprache",
35227,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-dtk"] = {
"Dothraki",
2914733,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-elo"] = {
"Eloi",
nil,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-gld"] = {
"Goa'uld",
19823,
"art",
"Latn, Egyp, Mero",
type = "appendix-constructed",
}
m["art-lap"] = {
"Lapine",
6488195,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-man"] = {
"Mandalorian",
54289,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-mun"] = {
"Mundolinco",
851355,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-nav"] = {
"Naʼvi",
316939,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-vlh"] = {
"High Valyrian",
64483808,
"art",
"Latn",
type = "appendix-constructed",
}
m["ath-nic"] = {
"Nicola",
20609,
"ath-nor",
"Latn",
}
m["ath-pro"] = {
"Proto-Athabaskan",
104841722,
"ath",
"Latn",
type = "reconstructed",
}
m["auf-pro"] = {
"Proto-Arawa",
116773706,
"auf",
"Latn",
type = "reconstructed",
}
m["aus-alu"] = {
"Alungul",
16827670,
"aus-pmn",
"Latn",
}
m["aus-and"] = {
"Andjingith",
4754509,
"aus-pmn",
"Latn",
}
m["aus-ang"] = {
"Angkula",
16828520,
"aus-pmn",
"Latn",
}
m["aus-arn-pro"] = {
"Proto-Arnhem",
116773720,
"aus-arn",
"Latn",
type = "reconstructed",
}
m["aus-bra"] = {
"Barranbinya",
4863220,
"aus-pmn",
"Latn",
}
m["aus-brm"] = {
"Barunggam",
4865914,
"aus-pmn",
"Latn",
}
m["aus-cww-pro"] = {
"Proto-Central New South Wales",
116773199,
"aus-cww",
"Latn",
type = "reconstructed",
}
m["aus-dal-pro"] = {
"Proto-Daly",
116773743,
"aus-dal",
"Latn",
type = "reconstructed",
}
m["aus-guw"] = {
"Guwar",
6652138,
"aus-pam",
"Latn",
}
m["aus-lsw"] = {
"Little Swanport",
6652138,
"qfa-unc",
"Latn",
}
m["aus-mbi"] = {
"Mbiywom",
6799701,
"aus-pmn",
"Latn",
}
m["aus-ngk"] = {
"Ngkoth",
7022405,
"aus-pmn",
"Latn",
}
m["aus-nyu-pro"] = {
"Proto-Nyulnyulan",
116773797,
"aus-nyu",
"Latn",
type = "reconstructed",
}
m["aus-pam-pro"] = {
"Proto-Pama-Nyungan",
33942,
"aus-pam",
"Latn",
type = "reconstructed",
}
m["aus-tul"] = {
"Tulua",
16938541,
"aus-pam",
"Latn",
}
m["aus-uwi"] = {
"Uwinymil",
7903995,
"aus-arn",
"Latn",
}
m["aus-wdj-pro"] = {
"Proto-Iwaidjan",
116773767,
"aus-wdj",
"Latn",
type = "reconstructed",
}
m["aus-won"] = {
"Wong-gie",
nil,
"aus-pam",
"Latn",
}
m["aus-wul"] = {
"Wulguru",
8039196,
"aus-dyb",
"Latn",
}
m["aus-ynk"] = { -- contrast nny
"Yangkaal",
3913770,
"aus-tnk",
"Latn",
}
m["awd-amc-pro"] = {
"Proto-Amuesha-Chamicuro",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-kmp-pro"] = {
"Proto-Kampa",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-prw-pro"] = {
"Proto-Paresi-Waura",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-ama"] = {
"Amarizana",
16827787,
"awd",
"Latn",
}
m["awd-ana"] = {
"Anauyá",
16828252,
"awd",
"Latn",
}
m["awd-apo"] = {
"Apolista",
16916645,
"awd",
"Latn",
}
m["awd-cab"] = {
"Cabre",
16850160,
"awd",
"Latn",
}
m["awd-gnu"] = {
"Guinau",
3504087,
"awd",
"Latn",
}
m["awd-kar"] = {
"Cariay",
16920253,
"awd",
"Latn",
}
m["awd-kaw"] = {
"Kawishana",
6379993,
"awd-nwk",
"Latn",
}
m["awd-kus"] = {
"Kustenau",
5196293,
"awd",
"Latn",
}
m["awd-man"] = {
"Manao",
6746920,
"awd",
"Latn",
}
m["awd-mar"] = {
"Marawan",
6755108,
"awd",
"Latn",
}
m["awd-mpr"] = {
"Maipure",
6736872,
"awd",
"Latn",
}
m["awd-mrt"] = {
"Mariaté",
16910017,
"awd-nwk",
"Latn",
}
m["awd-nwk-pro"] = {
"Proto-Nawiki",
116773234,
"awd-nwk",
"Latn",
type = "reconstructed",
}
m["awd-pai"] = {
"Paikoneka",
128807835,
"awd",
"Latn",
}
m["awd-pas"] = {
"Pasé",
7143168,
"awd-nwk",
"Latn",
}
m["awd-pro"] = {
"Proto-Arawak",
97573478,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-she"] = {
"Shebayo",
7492248,
"awd",
"Latn",
}
m["awd-taa-pro"] = {
"Proto-Ta-Arawak",
116773282,
"awd-taa",
"Latn",
type = "reconstructed",
}
m["awd-wai"] = {
"Wainumá",
16910017,
"awd-nwk",
"Latn",
}
m["awd-yum"] = {
"Yumana",
8061062,
"awd-nwk",
"Latn",
}
m["azc-caz"] = {
"Cazcan",
5055514,
"azc",
"Latn",
}
m["azc-cup-pro"] = {
"Proto-Cupan",
116773738,
"azc-cup",
"Latn",
type = "reconstructed",
}
m["azc-ktn"] = {
"Kitanemuk",
3197558,
"azc-tak",
"Latn",
}
m["azc-nah-pro"] = {
"นาวันดั้งเดิม",
7251860,
"azc-nah",
"Latn",
type = "reconstructed",
}
m["azc-num-pro"] = {
"Proto-Numic",
116773247,
"azc-num",
"Latn",
type = "reconstructed",
}
m["azc-pro"] = {
"ยูโต-แอซเทกันดั้งเดิม",
96400333,
"azc",
"Latn",
type = "reconstructed",
}
m["azc-tak-pro"] = {
"Proto-Takic",
116773283,
"azc-tak",
"Latn",
type = "reconstructed",
}
m["azc-tat"] = {
"Tataviam",
743736,
"azc",
"Latn",
}
m["ber-pro"] = {
"เบอร์เบอร์ดั้งเดิม",
2855698,
"ber",
"Latn",
type = "reconstructed",
}
m["ber-fog"] = {
"Fogaha",
107610173,
"ber",
"Latn",
}
m["ber-zuw"] = {
"Zuwara",
4117169,
"ber",
"Latn",
}
m["bnt-bal"] = {
"Balong",
93935237,
"bnt-bbo",
"Latn",
}
m["bnt-bon"] = {
"Boma Nkuu",
nil,
"bnt",
"Latn",
}
m["bnt-boy"] = {
"Boma Yumu",
nil,
"bnt",
"Latn",
}
m["bnt-bwa"] = {
"Bwala",
128810345,
"bnt-tek",
"Latn",
}
m["bnt-cmw"] = {
"Chimwiini",
4958328,
"bnt-swh",
"Latn",
}
m["bnt-ind"] = {
"Indanga",
51412803,
"bnt",
"Latn",
}
m["bnt-lal"] = {
"Lala (South Africa)",
6480154,
"bnt-ngu",
"Latn",
}
m["bnt-mpi"] = {
"Mpiin",
93937013,
"bnt-bdz",
"Latn",
}
m["bnt-mpu"] = {
"Mpuono", -- not to be confused with Mbuun zmp
36056,
"bnt",
"Latn",
}
m["bnt-ngu-pro"] = {
"งูนีดั้งเดิม",
961559,
"bnt-ngu",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron},
}
m["bnt-phu"] = {
"Phuthi",
33796,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute},
}
m["bnt-pro"] = {
"แบนทูดั้งเดิม",
3408025,
"bnt",
"Latn",
type = "reconstructed",
sort_key = "bnt-pro-sortkey",
}
m["bnt-sab-pro"] = {
"Proto-Sabaki",
nil, -- Q2209395 is the code for the Sabaki family
"bnt-sab",
"Latn",
type = "reconstructed",
}
m["bnt-sbo"] = {
"South Boma",
nil,
"bnt",
"Latn",
}
m["bnt-sts-pro"] = {
"Proto-Sotho-Tswana",
116773278,
"bnt-sts",
"Latn",
type = "reconstructed",
}
m["btk-pro"] = {
"Proto-Batak",
116773191,
"btk",
"Latn",
type = "reconstructed",
}
m["cau-abz-pro"] = {
"Proto-Abkhaz-Abaza",
7251831,
"cau-abz",
"Latn",
type = "reconstructed",
}
m["cau-and-pro"] = {
"Proto-Andian",
nil,
"cau-and",
"Latn",
type = "reconstructed",
}
m["cau-ava-pro"] = {
"Proto-Avaro-Andian",
116773187,
"cau-ava",
"Latn",
type = "reconstructed",
}
m["cau-cir-pro"] = {
"Proto-Circassian",
7251838,
"cau-cir",
"Latn",
type = "reconstructed",
}
m["cau-drg-pro"] = {
"Proto-Dargwa",
116773205,
"cau-drg",
"Latn",
type = "reconstructed",
}
m["cau-lzg-pro"] = {
"Proto-Lezghian",
116773223,
"cau-lzg",
"Latn",
type = "reconstructed",
}
m["cau-nec-pro"] = {
"คอเคเซียนตะวันออกเฉียงเหนือดั้งเดิม",
116773244,
"cau-nec",
"Latn",
type = "reconstructed",
}
m["cau-nkh-pro"] = {
"นัคดั้งเดิม",
108032840,
"cau-nkh",
"Latn",
type = "reconstructed",
}
m["cau-nwc-pro"] = {
"คอเคเซียนตะวันตกเฉียงเหนือดั้งเดิม",
7251861,
"cau-nwc",
"Latn",
type = "reconstructed",
}
m["cau-tsz-pro"] = {
"Proto-Tsezian",
116773287,
"cau-tsz",
"Latn",
type = "reconstructed",
}
m["cba-ata"] = {
"Atanques",
4812783,
"cba",
"Latn",
}
m["cba-cat"] = {
"Catío Chibcha",
7083619,
"cba",
"Latn",
}
m["cba-dor"] = {
"Dorasque",
5297532,
"cba",
"Latn",
}
m["cba-dui"] = {
"Duit",
3041061,
"cba",
"Latn",
}
m["cba-hue"] = {
"Huetar",
35514,
"cba",
"Latn",
}
m["cba-nut"] = {
"Nutabe",
7070405,
"cba",
"Latn",
}
m["cba-pro"] = {
"Proto-Chibchan",
116773203,
"cba",
"Latn",
type = "reconstructed",
}
m["ccs-pro"] = {
	"คาร์ทเวเลียนดั้งเดิม",
	2608203,
	"ccs",
	"Latn",
	type = "reconstructed",
	strip_diacritics = {
		from = {"q̣", "p̣", "ʓ", "ċ"},
		to = {"q̇", "ṗ", "ʒ", "c̣"}
	},
}
m["ccs-gzn-pro"] = {
	"จอร์เจียน-แซนดั้งเดิม",
	23808119,
	"ccs-gzn",
	"Latn",
	type = "reconstructed",
	strip_diacritics = {
		from = {"q̣", "p̣", "ʓ", "ċ"},
		to = {"q̇", "ṗ", "ʒ", "c̣"}
	},
}
m["cdc-cbm-pro"] = {
"ชาดิกตอนกลางดั้งเดิม",
116773197,
"cdc-cbm",
"Latn",
type = "reconstructed",
}
m["cdc-mas-pro"] = {
"Proto-Masa",
116773789,
"cdc-mas",
"Latn",
type = "reconstructed",
}
m["cdc-pro"] = {
"ชาดิกดั้งเดิม",
116773201,
"cdc",
"Latn",
type = "reconstructed",
}
m["cdd-pro"] = {
"Proto-Caddoan",
116773725,
"cdd",
"Latn",
type = "reconstructed",
}
m["cel-bry-pro"] = {
	"บริทอนิกดั้งเดิม",
	1248800,
	"cel-bry",
	"Latn, Polyt",
	sort_key = {
		Latn = "cel-bry-pro-sortkey",
	},
	-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cel-gal"] = {
	"Gallaecian",
	3094789,
	"cel-his",
}
m["cel-gau"] = {
	"กอล",
	29977,
	"cel",
	"Latn, Polyt, Ital",
	strip_diacritics = {
		Latn = {remove_diacritics = c.macron .. c.breve .. c.diaer},
	},
	sort_key = {
		Latn = "cel-bry-pro-sortkey",
	},
	-- Ital translit in [[Module:scripts/data]]
	-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cel-pro"] = {
"เคลติกดั้งเดิม",
653649,
"cel",
"Latn",
type = "reconstructed",
sort_key = "cel-pro-sortkey",
}
m["chi-pro"] = {
"Proto-Chimakuan",
116773734,
"chi",
"Latn",
type = "reconstructed",
}
m["chm-pro"] = {
"Proto-Mari",
116773788,
"chm",
"Latn",
type = "reconstructed",
}
m["cmc-pro"] = {
"จามิกดั้งเดิม",
114793834,
"cmc",
"Latn",
type = "reconstructed",
}
m["crp-bip"] = {
"Basque-Icelandic Pidgin",
810378,
"crp",
"Latn",
ancestors = "eu",
}
m["crp-gep"] = {
"West Greenlandic Pidgin",
17036301,
"crp",
"Latn",
ancestors = "kl",
}
m["crp-kia"] = {
"Kiautschou German Pidgin",
108314615,
"crp",
"Latn",
ancestors = "de",
}
m["crp-mar"] = {
"Maroon Spirit Language",
1093206,
"crp",
"Latn",
ancestors = "en",
}
m["crp-mpp"] = {
"Macau Pidgin Portuguese",
128804537,
"crp",
"Hant, Latn",
ancestors = "pt",
sort_key = {Hant = "Hani-sortkey"},
}
m["crp-rsn"] = {
"Russenorsk",
505125,
"crp",
"Cyrl, Latn",
ancestors = "nn, ru",
translit = {Cyrl = "ru-translit"},
}
m["crp-spp"] = {
"Samoan Plantation Pidgin",
7409948,
"crp",
"Latn",
ancestors = "en",
}
m["crp-slb"] = {
"Solombala English",
7558525,
"crp",
"Cyrl, Latn",
ancestors = "en, ru",
translit = {Cyrl = "ru-translit"},
}
m["crp-tpr"] = {
"Taimyr Pidgin Russian",
16930506,
"crp",
"Cyrl",
ancestors = "ru",
translit = "ru-translit",
}
m["csu-bba-pro"] = {
"Proto-Bongo-Bagirmi",
116773722,
"csu-bba",
"Latn",
type = "reconstructed",
}
m["csu-maa-pro"] = {
"Proto-Mangbetu",
116773786,
"csu-maa",
"Latn",
type = "reconstructed",
}
m["csu-pro"] = {
"Proto-Central Sudanic",
116773730,
"csu",
"Latn",
type = "reconstructed",
}
m["csu-sar-pro"] = {
"Proto-Sara",
116773809,
"csu-sar",
"Latn",
type = "reconstructed",
}
m["cus-ash"] = {
"Ashraaf",
4805855,
"cus-som",
"Latn",
}
m["cus-hec-pro"] = {
"Proto-Highland East Cushitic",
116773761,
"cus-hec",
"Latn",
type = "reconstructed",
}
m["cus-som-pro"] = {
"โซมาลอยด์ดั้งเดิม",
nil,
"cus-som",
"Latn",
type = "reconstructed",
}
m["cus-sou-pro"] = {
"Proto-South Cushitic",
126081567,
"cus-sou",
"Latn",
type = "reconstructed",
}
m["cus-pro"] = {
"Proto-Cushitic",
116773204,
"cus",
"Latn",
type = "reconstructed",
}
m["dmn-dam"] = {
"Dama (Sierra Leone)",
19601574,
"dmn",
"Latn",
}
m["dra-bry"] = {
"Beary",
1089116,
"qfa-mix",
"Mlym, Knda",
ancestors = "ml, tcy",
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
}
m["dra-cen-pro"] = {
"ดราวิเดียนตอนกลางดั้งเดิม",
nil,
"dra-cen",
"Latn",
type = "reconstructed",
}
m["dra-mkn"] = {
"Middle Kannada",
128810572,
"dra-kan",
"Knda",
-- Knda translit in [[Module:scripts/data]]
}
m["dra-nor-pro"] = {
"ดราวิเดียนเหนือดั้งเดิม",
124433593,
"dra-nor",
"Latn",
type = "reconstructed",
}
m["dra-okn"] = {
"Old Kannada",
15723156,
"dra-kan",
"Knda",
-- Knda translit in [[Module:scripts/data]]
}
m["dra-ote"] = {
"Old Telugu",
126720868,
"dra-tel",
"Telu",
translit = "Telu-translit",
}
m["dra-pro"] = {
"ดราวิเดียนดั้งเดิม",
1702853,
"dra",
"Latn",
type = "reconstructed",
}
m["dra-sdo-pro"] = {
"ดราวิเดียนใต้ที่หนึ่งดั้งเดิม",
104847952, -- Wikipedia's "Proto-South Dravidian" is Proto-South Dravidian I in this scheme.
"dra-sdo",
"Latn",
type = "reconstructed",
}
m["dra-sdt-pro"] = {
"ดราวิเดียนใต้ที่สองดั้งเดิม",
128885257,
"dra-sdt",
"Latn",
type = "reconstructed",
}
m["dra-sou-pro"] = {
"ดราวิเดียนใต้ดั้งเดิม",
128886121,
"dra-sou",
"Latn",
type = "reconstructed",
}
m["egx-dem"] = {
"Demotic Egyptian",
36765,
"egx",
"Latn, Egyd, Polyt",
sort_key = {
Latn = {
remove_diacritics = "'%-%s",
from = {"ꜣ", "j", "e", "ꜥ", "y", "w", "b", "p", "f", "m", "n", "r", "l", "ḥ", "ḫ", "h̭", "ẖ", "h", "š", "s", "q", "k", "g", "ṱ", "ṯ", "t", "ḏ", "%.", "⸗"},
to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[12], p[13], p[15], p[16], p[16], p[17], p[14], p[19], p[18], p[20], p[21], p[22], p[23], p[24], p[23], p[25], p[26], p[26]}
},
},
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["dmn-pro"] = {
"Proto-Mande",
116773785,
"dmn",
"Latn",
type = "reconstructed",
}
m["dmn-mdw-pro"] = {
"Proto-Western Mande",
116773822,
"dmn-mdw",
"Latn",
type = "reconstructed",
}
m["dru-pro"] = {
"Proto-Rukai",
116773807,
"map",
"Latn",
type = "reconstructed",
}
m["ero-gsz"] = {
"Geshiza",
nil,
"ero",
"Latn",
}
m["ero-nya"] = {
"Nyagrong Minyag",
nil,
"ero",
"Latn",
}
m["ero-tau"] = {
"Stau",
nil,
"ero",
"Latn",
}
m["esx-esk-pro"] = {
"เอสกิโมดั้งเดิม",
7251842,
"esx-esk",
"Latn",
type = "reconstructed",
}
m["esx-ink"] = {
"Inuktun",
1671647,
"esx-inu",
"Latn",
}
m["esx-inq"] = {
"Inuinnaqtun",
28070,
"esx-inu",
"Latn",
}
m["esx-inu-pro"] = {
"อินุอิตดั้งเดิม",
60785588,
"esx-inu",
"Latn",
type = "reconstructed",
}
m["esx-pro"] = {
"Proto-Eskimo-Aleut",
7251843,
"esx",
"Latn",
type = "reconstructed",
}
m["esx-tut"] = {
"Tunumiisut",
15665389,
"esx-inu",
"Latn",
}
m["euq-pro"] = {
"บาสก์ดั้งเดิม",
938011,
"euq",
"Latn",
type = "reconstructed",
}
m["gba-pro"] = {
"Proto-Gbaya",
nil,
"gba",
"Latn",
type = "reconstructed",
}
m["gem-pro"] = {
"เจอร์แมนิกดั้งเดิม",
669623,
"gem",
"Latn",
type = "reconstructed",
sort_key = "gem-pro-sortkey",
}
m["gme-bur"] = {
"Burgundian",
47625,
"gme",
"Latn",
}
m["gme-cgo"] = {
"กอทแบบไครเมีย",
36211,
"gme",
"Latn",
}
m["gmq-gut"] = {
"Gutnish",
1256646,
"gmq",
"Latn",
ancestors = "gmq-ogt",
}
m["gmq-jmk"] = {
"Jamtish",
35512,
"gmq-eas",
"Latn",
}
m["gmq-mno"] = {
"นอร์เวย์กลาง",
3417070,
"gmq-wes",
"Latn",
}
m["gmq-oda"] = {
"เดนมาร์กเก่า",
12330003,
"gmq-eas",
"Latn, Runr",
strip_diacritics = {remove_diacritics = c.macron},
}
m["gmq-ogt"] = {
"Old Gutnish",
1133488,
"gmq",
"Latn, Runr",
ancestors = "non",
}
m["gmq-osw"] = {
"สวีเดนเก่า",
2417210,
"gmq-eas",
"Latn, Runr",
strip_diacritics = {remove_diacritics = c.macron},
}
m["gmq-pro"] = {
"นอร์สดั้งเดิม",
1671294,
"gmq",
"Runr",
translit = "Runr-translit",
}
m["gmq-scy"] = {
"Scanian",
768017,
"gmq-eas",
"Latn",
}
m["gmw-bgh"] = {
"Bergish",
329030,
"gmw-frk",
"Latn",
}
m["gmw-cfr"] = {
"ภาษาฟรังโกเนียตอนกลาง",
572197,
"gmw-hgm",
"Latn",
ancestors = "gmh",
wikimedia_codes = "ksh",
}
m["gmw-ecg"] = {
"เยอรมันตอนกลางตะวันออก",
499344, -- subsumes Q699284, Q152965
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gmw-fin"] = {
"Fingallian",
3072588,
"gmw-ian",
"Latn",
}
m["gmw-gts"] = {
"Gottscheerish",
533109,
"gmw-hgm",
"Latn",
ancestors = "bar",
}
m["gmw-jdt"] = {
"Jersey Dutch",
1687911,
"gmw-frk",
"Latn",
ancestors = "nl",
}
m["gmw-msc"] = {
"Middle Scots",
3327000,
"gmw-ang",
"Latn",
ancestors = "enm-esc",
}
m["gmw-pro"] = {
"เจอร์แมนิกตะวันตกดั้งเดิม",
78079021,
"gmw",
"Latn, Runr",
-- type = "reconstructed",
-- Largely but not entirely reconstructed (like Proto-Norse), so the type is left unset; see the April 2024 BP discussion. Possibly set back to "reconstructed" if an 'anti-asterisk' is added.
sort_key = "gmw-pro-sortkey",
}
m["gmw-rfr"] = {
"ฟรังโกเนียแบบไรน์",
707007,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gmw-stm"] = {
"Sathmar Swabian",
2223059,
"gmw-hgm",
"Latn",
ancestors = "swg",
}
m["gmw-tsx"] = {
"Transylvanian Saxon",
260942,
"gmw-hgm",
"Latn",
ancestors = "gmw-cfr",
}
m["gmw-vog"] = {
"เยอรมันแบบว็อลกา",
312574,
"gmw-hgm",
"Latn",
ancestors = "gmw-rfr",
}
m["gmw-zps"] = {
"เยอรมันแบบซิพเซอร์",
205548,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gn-cls"] = {
"กัวรานีคลาสสิก",
17478065,
"gn",
"Latn",
}
m["grk-cal"] = {
"Calabrian Greek",
1146398,
"grk",
"Latn, Grek",
ancestors = "grk-ita",
translit = {
Grek = "el-translit",
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["grk-ita"] = {
"Italiot Greek",
19720507,
"grk",
"Latn, Grek",
ancestors = "gkm",
translit = {
Grek = "el-translit",
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["grk-mar"] = {
"Mariupol Greek",
4400023,
"grk",
"Cyrl, Latn, Grek",
ancestors = "gkm",
translit = {
Cyrl = "grk-mar-translit",
Grek = "grk-mar-translit",
},
override_translit = true,
strip_diacritics = {
Cyrl = {remove_diacritics = c.acute},
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["grk-pro"] = {
"เฮลเลนิกดั้งเดิม",
1231805,
"grk",
"Latn, Polyt",
type = "reconstructed",
sort_key = {Latn = {
from = {"ʰ", "ʷ"},
to = {"h", "w"},
remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. c.caron
}},
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: formerly no translit specified for Polyt; presumably an accidental omission; if not, set Polyt = false in
-- the translit section
}
m["hmn-pro"] = {
"ม้งดั้งเดิม",
116773210,
"hmn",
"Latn",
type = "reconstructed",
}
m["hmx-mie-pro"] = {
"เมี่ยนดั้งเดิม",
116773229,
"hmx-mie",
"Latn",
type = "reconstructed",
}
m["hmx-pro"] = {
"ม้ง-เมี่ยนดั้งเดิม",
7251846,
"hmx",
"Latn",
type = "reconstructed",
}
m["hyx-pro"] = {
"อาร์มีเนียนดั้งเดิม",
3848498,
"hyx",
"Latn",
type = "reconstructed",
}
m["iir-nur-pro"] = {
"Proto-Nuristani",
116773248,
"iir-nur",
"Latn",
type = "reconstructed",
}
m["iir-pro"] = {
"อินโด-อิเรเนียนดั้งเดิม",
966439,
"iir",
"Latn",
type = "reconstructed",
}
m["ijo-pro"] = {
"Proto-Ijoid",
116773766,
"ijo",
"Latn",
type = "reconstructed",
}
m["inc-apa"] = {
"Apabhramsa",
616419,
"inc-mid",
"Deva, Shrd, Sidd",
ancestors = "pra",
translit = {
Deva = "Deva-translit",
-- Shrd translit in [[Module:scripts/data]]
-- Sidd translit in [[Module:scripts/data]]
},
}
m["inc-ash"] = {
"ปรากฤตแบบอโศก",
104854379,
"inc-mid",
"Brah, Khar",
ancestors = "sa",
translit = {
-- Brah translit in [[Module:scripts/data]]
Khar = "Khar-translit",
},
}
m["inc-dng-pro"] = {
"Proto-Dangari",
nil,
"inc-dng",
"Latn",
type = "reconstructed",
}
m["inc-kam"] = {
"กามรูป",
6356097,
"inc-bas",
"Brah, Sidd",
-- Brah, Sidd translit in [[Module:scripts/data]]
}
m["inc-kho"] = {
"Kholosi",
24952008,
"inc-snd",
"Latn",
}
m["inc-krd-pro"] = {
"Proto-Kamta",
128816843,
"inc-bas",
"Latn",
ancestors = "inc-kam",
type = "reconstructed",
}
m["inc-mas"] = {
"อัสสัมกลาง",
128806836,
"inc-bas",
"as-Beng",
ancestors = "inc-oas",
translit = "Beng-translit",
}
m["inc-mbn"] = {
"เบงกอลกลาง",
113559927,
"inc-bas",
"Beng",
ancestors = "inc-obn",
translit = "Beng-translit",
}
m["inc-mgu"] = {
"คุชราตกลาง",
24907429,
"inc-wes",
"Deva",
ancestors = "inc-ogu",
translit = {
Deva = "Deva-translit",
},
}
m["inc-mor"] = {
"โอริยากลาง",
128810882,
"inc-eas",
"Orya",
ancestors = "inc-oor",
}
m["inc-oas"] = {
"อัสสัมช่วงต้น",
85758237,
"inc-bas",
"as-Beng",
ancestors = "inc-kam",
translit = "Beng-translit",
}
m["inc-oaw"] = {
"Old Awadhi",
nil,
"inc-hie",
"Deva, Kthi, ur-Arab",
strip_diacritics = {
from = {"هٔ", "ۂ"}, -- maps "هٔ" (U+0647 + U+0654) and "ۂ" (U+06C2) to "ہ" (U+06C1)
to = {"ہ", "ہ"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
["ur-Arab"] = "inc-ohi-translit",
},
}
m["inc-obn"] = {
"เบงกอลเก่า",
113559926,
"inc-bas",
"Beng",
translit = "Beng-translit",
}
m["inc-ogu"] = {
"คุชราตเก่า",
24907427,
"inc-wes",
"Deva",
translit = "Deva-translit",
}
m["inc-ohi"] = {
"ฮินดีเก่า",
48767781,
"inc-hiw",
"Deva, ur-Arab",
strip_diacritics = {
from = {"هٔ", "ۂ"}, -- maps "هٔ" (U+0647 + U+0654) and "ۂ" (U+06C2) to "ہ" (U+06C1)
to = {"ہ", "ہ"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
translit = {
Deva = "Deva-translit",
["ur-Arab"] = "ur-translit",
},
}
m["inc-oor"] = {
"โอริยาเก่า",
128807801,
"inc-eas",
"Orya",
}
m["inc-opa"] = {
"ปัญจาบเก่า",
115270971,
"inc-pan",
"Guru, pa-Arab",
translit = {
Guru = "Guru-translit",
["pa-Arab"] = "pa-Arab-translit",
},
strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun},
}
m["inc-pro"] = {
"อินโด-อารยันดั้งเดิม",
23808344,
"inc",
"Latn",
type = "reconstructed",
}
m["ine-ana-pro"] = {
"อานาโตเลียนดั้งเดิม",
7251833,
"ine-ana",
"Latn",
type = "reconstructed",
}
m["ine-bsl-pro"] = {
"บอลโต-สลาวิกดั้งเดิม",
1703347,
"ine-bsl",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", c.acute, c.macron, "ˀ"},
to = {"a", "e", "i", "o", "u"}
},
}
m["ine-kal"] = {
"Kalašma",
122770439,
"ine-ana",
"Xsux",
}
m["ine-pae"] = {
"Paeonian",
2705672,
"ine",
"Polyt",
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["ine-pro"] = {
"อินโด-ยูโรเปียนดั้งเดิม",
37178,
"ine",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ĺ", "ḿ", "ń", "ŕ", "ǵ", "ḱ", "ʰ", "ʷ", "₁", "₂", "₃", c.ringbelow, c.acute, c.macron},
to = {"a", "e", "i", "o", "u", "l", "m", "n", "r", "g'", "k'", "¯h", "¯w", "1", "2", "3"}
},
}
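--[==[
A minimal sketch of how a sort_key table with parallel from/to lists, like the one above, might be applied. It assumes that from[i] is replaced by to[i] in order and that a pattern without a matching to entry is simply deleted, which is why to can be shorter than from. The helper name apply_substitutions is hypothetical, and the spec is a small subset of the real lists. Only literal patterns are used, because plain string.gsub works on bytes; character classes such as "[áā]" need a Unicode-aware gsub such as mw.ustring.gsub.

```lua
-- Hypothetical helper: replace from[i] with to[i]; a missing to[i] deletes the match.
local function apply_substitutions(text, spec)
	for i, pattern in ipairs(spec.from) do
		text = text:gsub(pattern, spec.to[i] or "")
	end
	return text
end

local acute = "\204\129"  -- UTF-8 bytes of U+0301 COMBINING ACUTE ACCENT

-- Subset of the Proto-Indo-European sort_key lists, for illustration only.
local spec = {
	from = {"ḱ", "ʷ", "₂", acute},
	to = {"k'", "¯w", "2"},
}

print(apply_substitutions("ḱey", spec))   -- k'ey
print(apply_substitutions("h₂ew", spec))  -- h2ew
```

The fourth pattern (the combining acute) has no to entry, so the sketch strips it from the text.
]==]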
m["ine-toc-pro"] = {
"โทแคเรียนดั้งเดิม",
104841462,
"ine-toc",
"Latn",
type = "reconstructed",
}
m["xme-old"] = {
"Old Median",
36461,
"xme",
"Polyt, Latn",
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["xme-mid"] = {
"Middle Median",
12836150,
"xme",
"Latn",
}
m["xme-ker"] = {
"Kermanic",
129850,
"xme",
"fa-Arab, Latn, Hebr",
ancestors = "xme-mid",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["xme-taf"] = {
"Tafreshi",
nil,
"xme",
"fa-Arab, Latn",
ancestors = "xme-mid",
}
m["xme-ttc-pro"] = {
"Proto-Tatic",
122973870,
"xme-ttc",
"Latn",
ancestors = "xme-mid",
}
m["xme-kls"] = {
"Kalasuri",
nil,
"xme-ttc",
ancestors = "xme-ttc-nor",
}
m["xme-klt"] = {
"Kilit",
3612452,
"xme-ttc",
"Cyrl", -- and fa-Arab?
}
m["xme-ott"] = {
"Old Tati",
434697,
"xme-ttc",
"fa-Arab, Latn",
}
m["ira-kms-pro"] = {
"Proto-Komisenian",
116773777,
"ira-kms",
"Latn",
type = "reconstructed",
}
m["ira-mpr-pro"] = {
"Proto-Medo-Parthian",
116773227,
"ira-mpr",
"Latn",
type = "reconstructed",
}
m["ira-pat-pro"] = {
"ปาทานดั้งเดิม",
116773255,
"ira-pat",
"Latn",
type = "reconstructed",
}
m["ira-pro"] = {
"อิเรเนียนดั้งเดิม",
4167865,
"ira",
"Latn",
type = "reconstructed",
}
m["ira-zgr-pro"] = {
"Proto-Zaza-Gorani",
116775031,
"ira-zgr",
"Latn",
type = "reconstructed",
}
m["xsc-pro"] = {
"Proto-Scythian",
116773273,
"xsc",
"Latn",
type = "reconstructed",
}
m["xsc-sar-pro"] = {
"Proto-Sarmatian",
116773249,
"xsc-sar",
"Latn",
type = "reconstructed",
}
m["xsc-skw-pro"] = {
"Proto-Saka-Wakhi",
116773267,
"xsc-skw",
"Latn",
type = "reconstructed",
}
m["xsc-sak-pro"] = {
"Proto-Saka",
116773264,
"xsc-sak",
"Latn",
type = "reconstructed",
}
m["ira-sym-pro"] = {
"Proto-Shughni-Yazghulami-Munji",
116773813,
"ira-sym",
"Latn",
type = "reconstructed",
}
m["ira-sgi-pro"] = {
"Proto-Sanglechi-Ishkashimi",
116773808,
"ira-sgi",
"Latn",
type = "reconstructed",
}
m["ira-mny-pro"] = {
"Proto-Munji-Yidgha",
116773792,
"ira-mny",
"Latn",
type = "reconstructed",
}
m["ira-shy-pro"] = {
"Proto-Shughni-Yazghulami",
116773812,
"ira-shy",
"Latn",
type = "reconstructed",
}
m["ira-shr-pro"] = {
"Proto-Shughni-Roshani",
116773811,
"ira-shr",
"Latn",
type = "reconstructed",
}
m["ira-sgc-pro"] = {
"ซอกดิกดั้งเดิม",
116773276,
"ira-sgc",
"Latn",
type = "reconstructed",
}
m["ira-wnj"] = {
"Vanji",
3398419,
"ira-shy",
"Latn",
}
m["iro-ere"] = {
"Erie",
5388365,
"iro-nor",
"Latn",
}
m["iro-min"] = {
"Mingo",
128531,
"iro-nor",
"Latn",
ietf_subtag = "i-mingo", -- grandfathered IETF tag
}
m["iro-nor-pro"] = {
"Proto-North Iroquoian",
116773242,
"iro-nor",
"Latn",
type = "reconstructed",
}
m["iro-pro"] = {
"Proto-Iroquoian",
7251852,
"iro",
"Latn",
type = "reconstructed",
}
m["itc-pro"] = {
"อิตาลิกดั้งเดิม",
17102720,
"itc",
"Latn",
type = "reconstructed",
}
m["itc-psa"] = {
"Pre-Samnite",
7239186,
"itc-sbl",
"Ital, Polyt, Latn",
-- Ital translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission)
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["jpx-hcj"] = {
"Hachijō",
5637049,
"jpx",
"Jpan",
ancestors = "ojp-eas",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["jpx-pro"] = {
"แจพอนิกดั้งเดิม",
3924309,
"jpx",
"Latn",
type = "reconstructed",
}
m["jpx-ryu-pro"] = {
"Proto-Ryukyuan",
56349069,
"jpx-ryu",
"Latn",
type = "reconstructed",
}
m["kar-pro"] = {
"กะเหรี่ยงดั้งเดิม",
85794783,
"kar",
"Latn",
type = "reconstructed",
}
m["kca-eas"] = {
"Eastern Khanty",
30304622,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
-- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side)
sort_key = { Cyrl = { from = {""}, to = {""} } },
}
m["kca-nor"] = {
"Northern Khanty",
30304527,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
-- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side)
sort_key = { Cyrl = { from = {""}, to = {""} } },
}
m["kca-pro"] = {
"Proto-Khanty",
127505171,
"kca",
"Latn",
type = "reconstructed",
}
m["kca-sou"] = {
"Southern Khanty",
30304618,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
}
m["khi-kho-pro"] = {
"Proto-Khoe",
116773218,
"khi-kho",
"Latn",
type = "reconstructed",
}
m["khi-kun"] = {
"ǃKung",
32904,
"khi-kxa",
"Latn",
}
m["ko-ear"] = {
"เกาหลีใหม่ช่วงต้น",
756014,
"qfa-kor",
"Kore",
ancestors = "okm",
translit = "okm-translit",
-- Kore strip_diacritics in [[Module:scripts/data]]
}
m["kro-pro"] = {
"Proto-Kru",
116773778,
"kro",
"Latn",
type = "reconstructed",
}
m["ku-pro"] = {
"เคอร์ดิชดั้งเดิม",
116773221,
"ku",
"Latn",
type = "reconstructed",
}
m["map-ata-pro"] = {
"Proto-Atayalic",
116773151,
"map-ata",
"Latn",
type = "reconstructed",
}
m["map-bms"] = {
"Banyumasan",
33219,
"map",
"Latn, Java",
}
m["map-pro"] = {
"ออสโตรนีเซียนดั้งเดิม",
49230,
"map",
"Latn",
type = "reconstructed",
}
m["mis-hkl"] = {
"Kelantan Peranakan Hokkien",
108794818,
"qfa-mix",
ancestors = "nan-hbl, sou, mfa",
}
m["mis-idn"] = {
"Idiom Neutral",
35847,
"art",
"Latn",
type = "appendix-constructed",
}
m["mis-isa"] = {
"Isaurian",
16956868,
nil,
-- "Xsux, Hluw, Latn",
}
m["mis-jie"] = {
"Jie",
124424186,
nil,
"Hani",
sort_key = "Hani-sortkey",
}
m["mis-jzh"] = {
"Jizhao",
45242758,
"qfa-bej",
"Latn",
}
m["mis-kas"] = {
"Kassite",
35612,
nil,
"Xsux",
}
m["mis-mmd"] = {
"Mimi of Decorse",
6862206,
nil,
"Latn",
}
m["mis-mmn"] = {
"Mimi of Nachtigal",
6862207,
nil,
"Latn",
}
m["mis-phi"] = {
"Philistine",
2230924,
nil,
"Phnx",
-- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["mis-rou"] = {
"Rouran",
48816637,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-tdl"] = {
"Turdulian",
133176492,
}
m["mis-tdt"] = {
"Turdetanian",
133176461,
}
m["mis-tnw"] = {
"Tangwang",
7683179,
"qfa-mix",
"Latn",
ancestors = "cmn, sce",
}
m["mis-tuh"] = {
"Tuyuhun",
48816625,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-tuo"] = {
"Tuoba",
48816629,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-wuh"] = {
"Wuhuan",
118976867,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-xbi"] = {
"Xianbei",
4448647,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-xnu"] = {
"Xiongnu",
10901674,
nil,
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mjg-mgl"] = {
"Mongghul",
53765528,
"mjg",
"Latn", -- also Mong, Cyrl ?
}
m["mjg-mgr"] = {
"Mangghuer",
56285392,
"mjg",
"Latn", -- also Mong, Cyrl ?
}
m["mkh-asl-pro"] = {
"Proto-Aslian",
55630680,
"mkh-asl",
"Latn",
type = "reconstructed",
}
m["mkh-ban-pro"] = {
"Proto-Bahnaric",
116773189,
"mkh-ban",
"Latn",
type = "reconstructed",
}
m["mkh-kat-pro"] = {
"Proto-Katuic",
116773772,
"mkh-kat",
"Latn",
type = "reconstructed",
}
m["mkh-khm-pro"] = {
"ขมุอิกดั้งเดิม",
116773774,
"mkh-khm",
"Latn",
type = "reconstructed",
}
m["mkh-kmr-pro"] = {
"เขมรดั้งเดิม",
55630684,
"mkh-kmr",
"Latn",
type = "reconstructed",
}
m["mkh-mmn"] = {
"มอญกลาง",
121337926,
"mkh-mnc",
"Latn, Mymr", -- and also Pallava
ancestors = "omx",
}
m["mkh-mnc-pro"] = {
"มอญดั้งเดิม",
116773231,
"mkh-mnc",
"Latn",
type = "reconstructed",
}
m["mkh-mvi"] = {
"เวียดนามกลาง",
9199,
"mkh-vie",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mkh-pal-pro"] = {
"ปะหล่องดั้งเดิม",
104847372,
"mkh-pal",
"Latn",
type = "reconstructed",
}
m["mkh-pea-pro"] = {
"Proto-Pearic",
116773804,
"mkh-pea",
"Latn",
type = "reconstructed",
}
m["mkh-pkn-pro"] = {
"Proto-Pakanic",
116773803,
"mkh-pkn",
"Latn",
type = "reconstructed",
}
m["mkh-pro"] = { -- This will be merged into 2015 aav-pro.
"มอญ-เขมรดั้งเดิม",
7251859,
"mkh",
"Latn",
type = "reconstructed",
}
m["mnw-tha"] = { -- To be removed.
"มอญแบบไทย",
nil,
"mkh-mnc",
"Mymr, Thai",
ancestors = "mkh-mmn",
sort_key = {
from = {"[%p]", "ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ", "[็-๎]", "([เแโใไ])([ก-ฮ])ฺ?"},
to = {"", "္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ", "", "%2%1"}
},
}
m["mkh-vie-pro"] = {
"เวียตติกดั้งเดิม",
109432616,
"mkh-vie",
"Latn",
type = "reconstructed",
}
m["mns-cen"] = {
"Central Mansi",
128810384,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mns-nor"] = {
"Northern Mansi",
30304537,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mns-pro"] = {
"Proto-Mansi",
128883093,
"mns",
"Latn",
type = "reconstructed",
}
m["mns-sou"] = {
"Southern Mansi",
30304629,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mun-pro"] = {
"มุนดาดั้งเดิม",
105102373,
"mun",
"Latn",
type = "reconstructed",
}
m["myn-chl"] = { -- the stage after ''emy''
"Ch'olti'",
873995,
"myn",
"Latn",
}
m["myn-pro"] = {
"มายันดั้งเดิม",
3321532,
"myn",
"Latn",
type = "reconstructed",
}
m["nai-ala"] = {
"Alazapa",
128810233,
nil,
"Latn",
}
m["nai-bay"] = {
"Bayogoula",
1563704,
nil,
"Latn",
}
m["nai-cal"] = {
"Calusa",
51782,
nil,
"Latn",
}
m["nai-chi"] = {
"Chiquimulilla",
25339627,
"nai-xin",
"Latn",
}
m["nai-chu-pro"] = {
"Proto-Chumash",
116773736,
"nai-chu",
"Latn",
type = "reconstructed",
}
m["nai-cig"] = {
"Ciguayo",
20741700,
nil,
"Latn",
}
m["nai-ckn-pro"] = {
"Proto-Chinookan",
116773735,
"nai-ckn",
"Latn",
type = "reconstructed",
}
m["nai-guz"] = {
"Guazacapán",
19572028,
"nai-xin",
"Latn",
}
m["nai-hit"] = {
"Hitchiti",
1542882,
"nai-mus",
"Latn",
}
m["nai-ipa"] = {
"Ipai",
3027474,
"nai-yuc",
"Latn",
}
m["nai-jtp"] = {
"Jutiapa",
nil,
"nai-xin",
"Latn",
}
m["nai-jum"] = {
"Jumaytepeque",
25339626,
"nai-xin",
"Latn",
}
m["nai-kat"] = {
"Kathlamet",
6376639,
"nai-ckn",
"Latn",
}
m["nai-klp-pro"] = {
"Proto-Kalapuyan",
116773771,
"nai-klp",
"Latn",
type = "reconstructed",
}
m["nai-knm"] = {
"Konomihu",
3198734,
"nai-shs",
"Latn",
}
m["nai-kum"] = {
"Kumeyaay",
4910139,
"nai-yuc",
"Latn",
}
m["nai-mac"] = {
"Macoris",
21070851,
nil,
"Latn",
}
m["nai-mdu-pro"] = {
"Proto-Maidun",
116773784,
"nai-mdu",
"Latn",
type = "reconstructed",
}
m["nai-miz-pro"] = {
"Proto-Mixe-Zoque",
7251858,
"nai-miz",
"Latn",
type = "reconstructed",
}
m["nai-mus-pro"] = {
"Proto-Muskogean",
116775368,
"nai-mus",
"Latn",
type = "reconstructed",
}
m["nai-nao"] = {
"Naolan",
6964594,
nil,
"Latn",
}
m["nai-nrs"] = {
"New River Shasta",
7011254,
"nai-shs",
"Latn",
}
m["nai-okw"] = {
"Okwanuchu",
3350126,
"nai-shs",
"Latn",
}
m["nai-per"] = {
"Pericú",
3375369,
nil,
"Latn",
}
m["nai-pic"] = {
"Picuris",
7191257,
"nai-kta",
"Latn",
}
m["nai-plp-pro"] = {
"Proto-Plateau Penutian",
116773806,
"nai-plp",
"Latn",
type = "reconstructed",
}
m["nai-pom-pro"] = {
"Proto-Pomo",
116773262,
"nai-pom",
"Latn",
type = "reconstructed",
}
m["nai-qng"] = {
"Quinigua",
36360,
nil,
"Latn",
}
m["nai-sca-pro"] = { -- NB: not the same as 'sio-pro' ("Proto-Siouan"), which is Proto-Western Siouan
"Proto-Siouan-Catawban",
116773275,
"nai-sca",
"Latn",
type = "reconstructed",
}
m["nai-sin"] = {
"Sinacantán",
24190249,
"nai-xin",
"Latn",
}
m["nai-sln"] = {
"Salvadoran Lenca",
3229434,
"nai-len",
"Latn",
}
m["nai-spt"] = {
"Sahaptin",
3833015,
"nai-shp",
"Latn",
}
m["nai-tap"] = {
"Tapachultec",
7684401,
"nai-miz",
"Latn",
}
m["nai-taw"] = {
"Tawasa",
7689233,
nil,
"Latn",
}
m["nai-teq"] = {
"Tequistlatec",
2964454,
"nai-tqn",
"Latn",
}
m["nai-tip"] = {
"Tipai",
3027471,
"nai-yuc",
"Latn",
}
m["nai-tot-pro"] = {
"Proto-Totozoquean",
116773285,
"nai-tot",
"Latn",
type = "reconstructed",
}
m["nai-tsi-pro"] = {
"Proto-Tsimshianic",
nil,
"nai-tsi",
"Latn",
type = "reconstructed",
}
m["nai-utn-pro"] = {
"Proto-Utian",
116773290,
"nai-utn",
"Latn",
type = "reconstructed",
}
m["nai-wai"] = {
"Waikuri",
3118702,
nil,
"Latn",
}
m["nai-wji"] = {
"Western Jicaque",
3178610,
"nai-jcq",
"Latn",
}
m["nai-yup"] = {
"Yupiltepeque",
25339628,
"nai-xin",
"Latn",
}
m["nan-dat"] = {
"Datian Min",
19855572,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-hbl"] = {
"ฮกเกี้ยน",
1624231,
"zhx-nan",
"Hants, Latn, Bopo, Kana",
wikimedia_codes = "zh-min-nan",
generate_forms = "zh-generateforms",
sort_key = {
Hani = "Hani-sortkey",
Kana = "Kana-sortkey"
},
}
m["nan-hlh"] = {
"Hailufeng Min",
120755728,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-lnx"] = {
"Longyan Min",
6674568,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-tws"] = {
"แต้จิ๋ว",
36759,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["nan-zhe"] = {
"Zhenan Min",
3846710,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-zsh"] = {
"Sanxiang Min",
7420769,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nds-de"] = {
"เยอรมันต่ำแบบเยอรมนี",
25433,
"gmw-lgm",
"Latn",
ancestors = "nds",
ietf_subtag = "nds-DE", -- should we make this the actual code?
wikimedia_codes = "nds",
}
m["nds-nl"] = {
"Dutch Low Saxon",
516137,
"gmw-lgm",
"Latn",
ancestors = "nds",
ietf_subtag = "nds-NL", -- should we make this the actual code?
wikimedia_codes = "nds-nl",
}
m["ngf-pro"] = {
"Proto-Trans-New Guinea",
85794785,
"ngf",
"Latn",
type = "reconstructed",
}
m["nic-bco-pro"] = {
"เบนูเอ-คองโกดั้งเดิม",
116773194,
"nic-bco",
"Latn",
type = "reconstructed",
}
m["nic-bod-pro"] = {
"แบนทอยด์ดั้งเดิม",
116773190,
"nic-bod",
"Latn",
type = "reconstructed",
}
m["nic-eov-pro"] = {
"Proto-Eastern Oti-Volta",
116773753,
"nic-eov",
"Latn",
type = "reconstructed",
}
m["nic-gns-pro"] = {
"Proto-Gurunsi",
116773759,
"nic-gns",
"Latn",
type = "reconstructed",
}
m["nic-grf-pro"] = {
"Proto-Grassfields",
116773755,
"nic-grf",
"Latn",
type = "reconstructed",
}
m["nic-gur-pro"] = {
"กูร์ดั้งเดิม",
116773758,
"nic-gur",
"Latn",
type = "reconstructed",
}
m["nic-jkn-pro"] = {
"Proto-Jukunoid",
116773769,
"nic-jkn",
"Latn",
type = "reconstructed",
}
m["nic-lcr-pro"] = {
"Proto-Lower Cross River",
116773782,
"nic-lcr",
"Latn",
type = "reconstructed",
}
m["nic-ogo-pro"] = {
"Proto-Ogoni",
116773799,
"nic-ogo",
"Latn",
type = "reconstructed",
}
m["nic-ovo-pro"] = {
"Proto-Oti-Volta",
116773802,
"nic-ovo",
"Latn",
type = "reconstructed",
}
m["nic-plt-pro"] = {
"Proto-Plateau",
116773805,
"nic-plt",
"Latn",
type = "reconstructed",
}
m["nic-pro"] = {
"ไนเจอร์-คองโกดั้งเดิม",
108000748,
"nic",
"Latn",
type = "reconstructed",
}
m["nic-ubg-pro"] = {
"Proto-Ubangian",
116773818,
"nic-ubg",
"Latn",
type = "reconstructed",
}
m["nic-ucr-pro"] = {
"Proto-Upper Cross River",
116773819,
"nic-ucr",
"Latn",
type = "reconstructed",
}
m["nic-vco-pro"] = {
"วอลตา-คองโกดั้งเดิม",
116773293,
"nic-vco",
"Latn",
type = "reconstructed",
}
m["nub-har"] = {
"Haraza",
19572059,
"nub",
"Arab, Latn",
}
m["nub-pro"] = {
"นูเบียนดั้งเดิม",
116773246,
"nub",
"Latn",
type = "reconstructed",
}
m["omq-cha-pro"] = {
"Proto-Chatino",
116773202,
"omq-cha",
"Latn",
type = "reconstructed",
}
m["omq-maz-pro"] = {
"Proto-Mazatec",
116773790,
"omq-maz",
"Latn",
type = "reconstructed",
}
m["omq-mix-pro"] = {
"Proto-Mixtecan",
21573423,
"omq-mix",
"Latn",
type = "reconstructed",
}
m["omq-mxt-pro"] = {
"Proto-Mixtec",
21573424,
"omq-mxt",
"Latn",
type = "reconstructed",
}
m["omq-otp-pro"] = {
"Proto-Oto-Pamean",
116773251,
"omq-otp",
"Latn",
type = "reconstructed",
}
m["omq-pro"] = {
"Proto-Oto-Manguean",
33669,
"omq",
"Latn",
type = "reconstructed",
}
m["omq-sjq"] = {
"San Juan Quiahije Chatino",
17003130,
"omq-cha",
"Latn",
}
m["omq-tel"] = {
"Teposcolula Mixtec",
nil,
"omq-mxt",
"Latn",
}
m["omq-teo"] = {
"Teojomulco Chatino",
25340451,
"omq-cha",
"Latn",
}
m["omq-tri-pro"] = {
"Proto-Triqui",
116773817,
"omq-tri",
"Latn",
type = "reconstructed",
}
m["omq-zap-pro"] = {
"Proto-Zapotecan",
116773297,
"omq-zap",
"Latn",
type = "reconstructed",
}
m["omq-zpc-pro"] = {
"Proto-Zapotec",
116773296,
"omq-zpc",
"Latn",
type = "reconstructed",
}
m["omv-aro-pro"] = {
"Proto-Aroid",
116773721,
"omv-aro",
"Latn",
type = "reconstructed",
}
m["omv-diz-pro"] = {
"Proto-Dizoid",
116773750,
"omv-diz",
"Latn",
type = "reconstructed",
}
m["omv-pro"] = {
"Proto-Omotic",
116773800,
"omv",
"Latn",
type = "reconstructed",
}
m["oto-otm-pro"] = {
"Proto-Otomi",
5908710,
"oto-otm",
"Latn",
type = "reconstructed",
}
m["oto-pro"] = {
"Proto-Otomian",
116773252,
"oto",
"Latn",
type = "reconstructed",
}
m["paa-bin-pro"] = {
"Proto-Binanderean",
137881672,
"paa-bin",
"Latn",
type = "reconstructed",
}
m["paa-kom"] = {
"Kómnzo",
18344310,
"paa-yam",
"Latn",
}
m["paa-kwn"] = {
"Kuwani",
6449056,
"qfa-unc", -- poorly attested, possibly the same as or related to Kalabra
"Latn",
}
m["paa-nha-pro"] = {
"Proto-North Halmahera",
116773241,
"paa-nha",
"Latn",
type = "reconstructed",
}
m["paa-nun"] = {
"Nungon",
128807788,
"ngf-fin",
"Latn",
}
m["phi-din"] = {
"Dinapigue Agta",
16945774,
"phi",
"Latn",
}
m["phi-kal-pro"] = {
"คาลาเมียนดั้งเดิม",
116773213,
"phi-kal",
"Latn",
type = "reconstructed",
}
m["phi-nag"] = {
"Nagtipunan Agta",
16966111,
"phi",
"Latn",
}
m["phi-pro"] = {
"ฟิลิปปินส์ดั้งเดิม",
18204898,
"phi",
"Latn",
type = "reconstructed",
}
m["poz-abi"] = {
"Abai",
19570729,
"poz-san",
"Latn",
}
m["poz-bal"] = {
"Baliledo",
4850912,
"poz",
"Latn",
}
m["poz-btk-pro"] = {
"Proto-Bungku-Tolaki",
116773724,
"poz-btk",
"Latn",
type = "reconstructed",
}
m["poz-cet-pro"] = {
"มาลาโย-พอลินีเชียนตอนกลาง-ตะวันออกดั้งเดิม",
2269883,
"poz-cet",
"Latn",
type = "reconstructed",
}
m["poz-hce-pro"] = {
"Proto-Halmahera-Cenderawasih",
116773209,
"poz-hce",
"Latn",
type = "reconstructed",
}
m["poz-lgx-pro"] = {
"ลัมปุงกิกดั้งเดิม",
116773222,
"poz-lgx",
"Latn",
type = "reconstructed",
}
m["poz-mcm-pro"] = {
"มาลาโย-จามิกดั้งเดิม",
116773225,
"poz-mcm",
"Latn",
type = "reconstructed",
}
m["poz-mic-pro"] = {
"ไมโครนีเซียนดั้งเดิม",
111939079,
"poz-mic",
"Latn",
type = "reconstructed",
}
m["poz-mly-pro"] = {
"มาเลย์อิกดั้งเดิม",
98057728,
"poz-mly",
"Latn",
type = "reconstructed",
}
m["poz-msa-pro"] = {
"มาลาโย-ซุมบาวันดั้งเดิม",
116773226,
"poz-msa",
"Latn",
type = "reconstructed",
}
m["poz-oce-pro"] = {
"โอเชียนิกดั้งเดิม",
141741,
"poz-oce",
"Latn",
type = "reconstructed",
}
m["poz-pep-pro"] = {
"พอลินีเชียนตะวันออกดั้งเดิม",
113988745,
"poz-pep",
"Latn",
type = "reconstructed",
}
m["poz-pnp-pro"] = {
"นิวเคลียร์พอลินีเชียนดั้งเดิม",
113988746,
"poz-pnp",
"Latn",
type = "reconstructed",
}
m["poz-pol-pro"] = {
"พอลินีเชียนดั้งเดิม",
1658709,
"poz-pol",
"Latn",
type = "reconstructed",
}
m["poz-pro"] = {
"มาลาโย-พอลินีเชียนดั้งเดิม",
3832960,
"poz",
"Latn",
type = "reconstructed",
}
m["poz-sml"] = {
"Sarawak Malay",
4251702,
"poz-mly",
"Latn, ms-Arab",
}
m["poz-ssw-pro"] = {
"ซูลาเวซีใต้ดั้งเดิม",
116773279,
"poz-ssw",
"Latn",
type = "reconstructed",
}
m["poz-swa-pro"] = {
"ซาราวักเหนือดั้งเดิม",
116773243,
"poz-swa",
"Latn",
type = "reconstructed",
}
m["poz-ter"] = {
"มลายูแบบตรังกานู",
4207412,
"poz-mly",
"Latn, ms-Arab",
}
m["pqe-pro"] = {
"มาลาโย-พอลินีเชียนตะวันออกดั้งเดิม",
2269883,
"pqe",
"Latn",
type = "reconstructed",
}
m["pra-niy"] = {
"Niya Prakrit",
11991601,
"inc-mid",
"Khar",
ancestors = "inc-ash",
translit = "Khar-translit",
}
m["qfa-adm-pro"] = {
"เกรตอันดามันนีสดั้งเดิม",
116773756,
"qfa-adm",
"Latn",
type = "reconstructed",
}
m["qfa-bet-pro"] = {
"เบ-ไทดั้งเดิม",
116773193,
"qfa-bet",
"Latn",
type = "reconstructed",
}
m["qfa-cka-pro"] = {
"Proto-Chukotko-Kamchatkan",
7251837,
"qfa-cka",
"Latn",
type = "reconstructed",
}
m["qfa-hur-pro"] = {
"Proto-Hurro-Urartian",
116773211,
"qfa-hur",
"Latn",
type = "reconstructed",
}
m["qfa-kad-pro"] = {
"Proto-Kadu",
116773770,
"qfa-kad",
"Latn",
type = "reconstructed",
}
m["qfa-kms-pro"] = {
"Proto-Kam-Sui",
55630682,
"qfa-kms",
"Latn",
type = "reconstructed",
}
m["qfa-kor-pro"] = {
"เกาหลีดั้งเดิม",
467883,
"qfa-kor",
"Latn",
type = "reconstructed",
}
m["qfa-kra-pro"] = {
"ขร้าดั้งเดิม",
7251854,
"qfa-kra",
"Latn",
type = "reconstructed",
}
m["qfa-lic-pro"] = {
"ไหลดั้งเดิม",
7251845,
"qfa-lic",
"Latn",
type = "reconstructed",
}
m["qfa-onb-pro"] = {
"เบดั้งเดิม",
116773192,
"qfa-onb",
"Latn",
type = "reconstructed",
}
m["qfa-ong-pro"] = {
"Proto-Ongan",
116773801,
"qfa-ong",
"Latn",
type = "reconstructed",
}
m["qfa-tak-pro"] = {
"ขร้า-ไทดั้งเดิม",
104901616,
"qfa-tak",
"Latn",
type = "reconstructed",
}
m["qfa-yen-pro"] = {
"Proto-Yeniseian",
27639,
"qfa-yen",
"Latn",
type = "reconstructed",
}
m["qfa-yuk-pro"] = {
"Proto-Yukaghir",
116773294,
"qfa-yuk",
"Latn",
type = "reconstructed",
}
m["qwe-kch"] = {
"Kichwa",
1740805,
"qwe",
"Latn",
ancestors = "qu",
}
m["qwe-pro"] = {
"เกชวนดั้งเดิม",
5575757,
"qwe",
"Latn",
type = "reconstructed",
}
m["roa-ang"] = {
"Angevin",
56782,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-bbn"] = {
"บูร์บอแน-แบรีชง",
2899128,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-brg"] = {
"Bourguignon",
508332,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-can"] = {
"Cantabrian",
917021,
"roa-asl",
"Latn",
}
m["roa-cha"] = {
"Champenois",
430018,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-fcm"] = {
"Franc-Comtois",
510561,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-gal"] = {
"Gallo",
37300,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-gib"] = {
"Gallo-Italic of Basilicata",
3094838,
"roa-git",
"Latn",
}
m["roa-gis"] = {
"Gallo-Italic of Sicily",
2629019,
"roa-git",
"Latn",
}
m["roa-leo"] = {
"เลออน",
34108,
"roa-asl",
"Latn",
}
m["roa-lor"] = {
"Lorrain",
671198,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-oca"] = {
"กาตาลาเก่า",
15478520,
"roa-ocr",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla .. "·"},
}
m["roa-ole"] = {
"เลออนเก่า",
125977465,
"roa-asl",
"Latn",
}
m["roa-ona"] = {
"Old Navarro-Aragonese",
2736184,
"roa-nar",
"Latn",
}
m["roa-opt"] = {
"กาลิเซีย-โปรตุเกสเก่า",
1072111,
"roa-gap",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ},
}
m["roa-orl"] = {
"Orléanais",
28497058,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-poi"] = {
"Poitevin-Saintongeais",
514123,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-tar"] = {
"Tarantino",
695526,
"roa-itr",
"Latn",
wikimedia_codes = "roa-tara",
}
m["sai-all"] = {
"Allentiac",
19570789,
"sai-hrp",
"Latn",
}
m["sai-and"] = { -- not to be confused with 'cbc' or 'ano'
"Andoquero",
16828359,
"sai-wit",
"Latn",
}
m["sai-ayo"] = {
"Ayomán",
16937754,
"sai-jir",
"Latn",
}
m["sai-bae"] = {
"Baenan",
3401998,
"qfa-unc", -- extinct, poorly attested; only known through 9 words
"Latn",
}
m["sai-bag"] = {
"Bagua",
5390321,
"qfa-unc", -- extinct, poorly attested; possibly Cariban
"Latn",
}
m["sai-bet"] = {
"Betoi",
926551,
"qfa-iso",
"Latn",
}
m["sai-bor-pro"] = {
"Proto-Boran",
nil,
"sai-bor",
"Latn",
}
m["sai-cac"] = {
"Cacán",
945482,
"qfa-unc", -- extinct, poorly attested; no consensus on classification
"Latn",
}
m["sai-caq"] = {
"Caranqui",
2937753,
"sai-bar",
"Latn",
}
m["sai-car-pro"] = {
"Proto-Cariban",
116773196,
"sai-car",
"Latn",
type = "reconstructed",
}
m["sai-cat"] = {
"Catacao",
5051136,
"sai-ctc",
"Latn",
}
m["sai-cer-pro"] = {
"Proto-Cerrado",
116773200,
"sai-cer",
"Latn",
type = "reconstructed",
}
m["sai-chi"] = {
"Chirino",
5390321,
"qfa-unc", -- extinct, only four words known; possibly related to Candoshi-Shapra (cbu)
"Latn",
}
m["sai-chn"] = {
"Chaná",
5072718,
"sai-crn",
"Latn",
}
m["sai-chp"] = {
"Chapacura",
5072884,
"sai-cpc",
"Latn",
}
m["sai-chr"] = {
"Charrua",
5086680,
"sai-crn",
"Latn",
}
m["sai-chu"] = {
"Churuya",
5118339,
"sai-guh",
"Latn",
}
m["sai-cje-pro"] = {
"Proto-Central Jê",
116773198,
"sai-cje",
"Latn",
type = "reconstructed",
}
m["sai-cmg"] = {
"Comechingon",
6644203,
"qfa-unc", -- extinct, poorly attested; no consensus on classification
"Latn",
}
m["sai-cno"] = {
"Chono",
5104704,
"qfa-unc", -- extinct, poorly attested; no consensus on classification, possibly spurious
"Latn",
}
m["sai-cnr"] = {
"Cañari",
5055572,
"qfa-unc", -- extinct, poorly attested; possibly Chimuan or Barbacoan
"Latn",
}
m["sai-coe"] = {
"Coeruna",
6425639,
"sai-wit",
"Latn",
}
m["sai-col"] = {
"Colán",
5141893,
"sai-ctc",
"Latn",
}
m["sai-cop"] = {
"Copallén",
5390321,
"qfa-unc", -- extinct, only four words attested; possibly Cholonan
"Latn",
}
m["sai-crd"] = {
"Coroado Puri",
24191321,
"sai-mje",
"Latn",
}
m["sai-ctq"] = {
"Catuquinaru",
16858455,
"qfa-unc", -- extinct, poorly attested; vocabulary does not resemble other languages
"Latn",
}
m["sai-cul"] = {
"Culli",
2879660,
"qfa-unc", -- extinct, poorly attested; often considered an isolate
"Latn",
}
m["sai-cva"] = {
"Cueva",
5192644,
"qfa-unc", -- extinct, poorly attested; possibly Chocoan
"Latn",
}
m["sai-esm"] = {
"Esmeralda",
3058083,
"qfa-unc", -- extinct, poorly attested; possibly related to Yaruro
"Latn",
}
m["sai-ewa"] = {
"Ewarhuyana",
16898104,
nil,
"Latn",
}
m["sai-gam"] = {
"Gamela",
5403661,
"qfa-unc", -- extinct, poorly attested; possibly an isolate
"Latn",
}
m["sai-gay"] = {
"Gayón",
5528902,
"sai-jir",
"Latn",
}
m["sai-gmo"] = {
"Guamo",
5613495,
"qfa-unc", -- extinct; "Kaufman (1990) finds a connection with the Chapacuran languages convincing." [Wikipedia] Considered an isolate by Campbell (2024).
"Latn",
}
m["sai-gua"] = {
"Guachí",
5613172,
"sai-guc",
"Latn",
}
m["sai-gue"] = {
"Güenoa",
5626799,
"sai-crn",
"Latn",
}
m["sai-hau"] = {
"Haush",
3128376,
"sai-cho",
"Latn",
}
m["sai-jee-pro"] = {
"Proto-Jê",
116773212,
"sai-jee",
"Latn",
type = "reconstructed",
}
m["sai-jko"] = {
"Jeikó",
6176527,
"sai-mje",
"Latn",
}
m["sai-jrj"] = {
"Jirajara",
6202966,
"sai-jir",
"Latn",
}
m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc
"Katembri",
6375925,
"qfa-unc", -- extinct, poorly attested; "Kaufman (1990) has linked it with the nearly extinct Taruma, although this has not been accepted by other scholars." [Wikipedia]
"Latn",
}
m["sai-mal"] = {
"Malalí",
6741212,
"sai-mje", -- considered the most divergent Maxakalían language (a subdivision of Macro-Jê), for which we have no entry
"Latn",
}
m["sai-mar"] = {
"Maratino",
6755055,
"qfa-unc", -- extinct, poorly attested; possibly Uto-Aztecan
"Latn",
}
m["sai-mat"] = {
"Matanawi",
6786047,
"qfa-unc", -- extinct; either an isolate or distantly related to the Muran languages; Campbell (2024) lists it as an isolate, Glottolog gives it as unclassified
"Latn",
}
m["sai-mcn"] = {
"Mocana",
3402048,
"qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade)
"Latn",
}
m["sai-men"] = {
"Menien",
16890110,
"sai-mje",
"Latn",
}
m["sai-mil"] = {
"Millcayac",
19573012,
"sai-hrp",
"Latn",
}
m["sai-mlb"] = {
"Malibu",
3402048,
"qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade)
"Latn",
}
m["sai-msk"] = {
"Masakará",
6782426,
"sai-mje",
"Latn",
}
m["sai-muc"] = {
"Mucuchí",
6931290,
nil, -- generally considered Timotean, for which we have no entry
"Latn",
}
m["sai-mue"] = {
"Muellama",
16886936,
"sai-bar",
"Latn",
}
m["sai-muz"] = {
"Muzo",
6644203,
"qfa-unc", -- extinct language of Colombia, poorly attested; may be Pijao (Cariban)
"Latn",
}
m["sai-mys"] = {
"Maynas",
16919393,
"sai-cah", -- per Campbell (2024); formerly considered unclassified
"Latn",
}
m["sai-nat"] = {
"Natú",
9006749,
"qfa-unc", -- extinct, poorly attested; "only Greenberg dares to classify [it]".[Wikipedia, quoting Moseley, Christopher; Asher, R. E.; Tait, Mary (1994), Atlas of the world's languages]
"Latn",
}
m["sai-nje-pro"] = {
"Proto-Northern Jê",
116773245,
"sai-nje",
"Latn",
type = "reconstructed",
}
m["sai-opo"] = {
"Opón",
7099152,
"sai-car",
"Latn",
}
m["sai-oto"] = {
"Otomaco",
16879234,
"sai-otm",
"Latn",
}
m["sai-pal"] = {
"Palta",
3042978,
"qfa-unc", -- extinct, unclassified; possibly Chicham
"Latn",
}
m["sai-pam"] = {
"Pamigua",
5908689,
"sai-otm",
"Latn",
}
m["sai-par"] = {
"Paratió",
16890038,
"qfa-unc", -- extinct, poorly attested; possibly Xukuruan
"Latn",
}
m["sai-peb"] = {
"Peba",
3373890,
"sai-pey",
"Latn",
}
m["sai-pnz"] = {
"Panzaleo",
3123275,
"qfa-unc", -- extinct, unclassified; possibly Paezan
"Latn",
}
m["sai-prh"] = {
"Puruhá",
3410994,
"qfa-unc", -- extinct, poorly attested; possibly in a famil with Cañari
"Latn",
}
m["sai-ptg"] = {
"Patagón",
128807870,
"sai-tar", -- extinct, only known from 4 words, which suggest Cariban lineage (Campbell 2024)
"Latn",
}
m["sai-pur"] = {
"Purukotó",
7261622,
"sai-pem",
"Latn",
}
m["sai-pyg"] = {
"Payaguá",
7156643,
"sai-guc",
"Latn",
}
m["sai-pyk"] = {
"Pykobjê",
98113977,
"sai-nje",
"Latn",
}
m["sai-qmb"] = {
"Quimbaya",
7272043,
"qfa-unc", -- extinct, might not exist; few known words
"Latn",
}
m["sai-qtm"] = {
"Quitemo",
7272651,
"sai-cpc",
"Latn",
}
m["sai-rab"] = {
"Rabona",
6644203,
"qfa-unc", -- extinct, poorly attested, mostly plant names; possibly Candoshi-Shapra
"Latn",
}
m["sai-ram"] = {
"Ramanos",
16902824,
"qfa-unc", -- extinct, poorly attested, possibly an isolate; per Glottolog: "the minuscule wordlist ... shows no convincing resemblances to surrounding languages"
"Latn",
}
m["sai-sac"] = {
"Sácata",
5390321,
"qfa-unc", -- extinct, only 3 words known; possibly Candoshí or Arawakan
"Latn",
}
m["sai-san"] = {
"Sanaviron",
16895999,
"qfa-unc", -- extinct, unclassified; no consensus on classification
"Latn",
}
m["sai-sap"] = {
"Sapará",
7420922,
"sai-car",
"Latn",
}
m["sai-sec"] = {
"Sechura",
7442912,
"qfa-unc", -- extinct, poorly attested; possibly Catacaoan
"Latn",
}
m["sai-sin"] = {
"Sinúfana",
7525275,
"qfa-unc", -- moribund, poorly attested; possibly Chocoan
"Latn",
}
m["sai-sje-pro"] = {
"Proto-Southern Jê",
116773814,
"sai-sje",
"Latn",
type = "reconstructed",
}
m["sai-tab"] = {
"Tabancale",
5390321,
"qfa-unc", -- extinct, only 5 words known; no obvious connections, might be an isolate
"Latn",
}
m["sai-tal"] = {
"Tallán",
16910468,
"qfa-unc", -- extinct, poorly attested; might be Catacaoan
"Latn",
}
m["sai-tap"] = {
"Tapayuna",
30719984,
"sai-nje",
"Latn",
}
m["sai-tar-pro"] = {
"Proto-Taranoan",
116773816,
"sai-tar",
"Latn",
type = "reconstructed",
}
m["sai-teu"] = {
"Teushen",
3519243,
"qfa-unc", -- probably extinct by the 1950's; possibly Chonan
"Latn",
}
m["sai-tim"] = {
"Timote",
7806995,
nil, -- possibly in a small Timotean family
"Latn",
}
m["sai-tpr"] = {
"Taparita",
7684460,
"sai-otm",
"Latn",
}
m["sai-trr"] = {
"Tarairiú",
7685313,
"qfa-unc", -- extinct, too poorly attested to classify
"Latn",
}
m["sai-wai"] = {
"Waitaká",
16918610,
"qfa-unc", -- extinct, possibly Purian
"Latn",
}
m["sai-way"] = {
"Wayumara",
7960726,
"sai-car",
"Latn",
}
m["sai-wit-pro"] = {
"Proto-Witotoan",
116773823,
"sai-wit",
"Latn",
type = "reconstructed",
}
m["sai-wnm"] = {
"Wanham",
16879440,
"sai-cpc",
"Latn",
}
m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat
"Xocó",
12953620,
"qfa-unc", -- extinct and poorly attested; not clear if one or three languages
"Latn",
}
m["sai-yao"] = {
"Yao (South America)",
16979655,
"sai-ven",
"Latn",
}
m["sai-yar"] = { -- not the same family as 'suy'
"Yarumá",
3505859,
"sai-pek",
"Latn",
}
m["sai-yri"] = {
"Yuri",
2669157,
"sai-tyu",
"Latn",
}
m["sai-yup"] = {
"Yupua",
8061430,
"sai-tuc",
"Latn",
}
m["sai-yur"] = {
"Yurumanguí",
1281291,
"qfa-unc", -- extinct, too poorly attested to classify
"Latn",
}
m["sal-pro"] = {
"Proto-Salish",
116773269,
"sal",
"Latn",
type = "reconstructed",
}
m["sdv-daj-pro"] = {
"Proto-Daju",
116773739,
"sdv-daj",
"Latn",
type = "reconstructed",
}
m["sdv-eje-pro"] = {
"Proto-Eastern Jebel",
116773751,
"sdv-eje",
"Latn",
type = "reconstructed",
}
m["sdv-nil-pro"] = {
"Proto-Nilotic",
116773794,
"sdv-nil",
"Latn",
type = "reconstructed",
}
m["sdv-nyi-pro"] = {
"Proto-Nyima",
116773796,
"sdv-nyi",
"Latn",
type = "reconstructed",
}
m["sdv-tmn-pro"] = {
"Proto-Taman",
116773815,
"sdv-tmn",
"Latn",
type = "reconstructed",
}
m["sel-nor"] = {
"Northern Selkup",
30304565,
"sel",
"Cyrl",
translit = "sel-nor-translit",
}
m["sel-pro"] = {
"Proto-Selkup",
128884235,
"sel",
"Latn",
type = "reconstructed",
}
m["sel-sou"] = {
"Southern Selkup",
30304639,
"sel",
"Cyrl",
translit = "sel-sou-translit",
}
m["sem-amm"] = {
"อัมโมน",
279181,
"sem-can",
"Phnx",
-- Phnx translit in [[Module:scripts/data]]
}
m["sem-amo"] = {
"Amorite",
35941,
"sem-nwe",
"Xsux, Latn",
}
m["sem-cha"] = {
"Chaha",
35543,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["sem-dad"] = {
"Dadanitic",
21838040,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-dum"] = {
"Dumaitic",
128810397,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-has"] = {
"Hasaitic",
3541433,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-his"] = {
"Hismaic",
22948260,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-mhr"] = {
"Muher",
33743,
"sem-eth",
"Latn",
}
m["sem-pro"] = {
"เซมิติกดั้งเดิม",
1658554,
"sem",
"Latn",
type = "reconstructed",
}
m["sem-saf"] = {
"Safaitic",
472586,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-sam"] = {
"Samalian",
85847147,
"sem-nwe",
"Phnx",
-- Phnx translit in [[Module:scripts/data]]
}
m["sem-srb"] = {
"Old South Arabian",
35025,
"sem-osa",
"Sarb",
-- Sarb translit in [[Module:scripts/data]]
}
m["sem-tay"] = {
"Taymanitic",
24912301,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-tha"] = {
"Thamudic",
843030,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-wes-pro"] = {
"เซมิติกตะวันตกดั้งเดิม",
98021726,
"sem-wes",
"Latn",
type = "reconstructed",
}
m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro'
"Proto-Siouan",
34181,
"sio",
"Latn",
type = "reconstructed",
}
m["sit-aao-pro"] = {
"Proto-Central Naga",
nil,
"sit-aao",
"Latn",
type = "reconstructed",
}
m["sit-bai-pro"] = {
"Proto-Bai",
nil,
"sit-bai",
"Latn",
type = "reconstructed",
}
m["sit-ban"] = {
"Bangru",
56071779,
"sit-hrs",
"Latn",
}
m["sit-bdi-pro"] = {
"Proto-Bodish",
nil,
"sit-bdi",
"Latn",
type = "reconstructed",
}
m["sit-bok"] = {
"Bokar",
4938727,
"sit-tan",
"Latn, Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["sit-cai"] = {
"Caijia",
5017528,
"sit-cln",
"Latn"
}
m["sit-cha"] = {
"Chairel",
5068066,
"sit-luu",
"Latn",
}
m["sit-ers-pro"] = {
"Proto-Ersuic",
nil,
"sit-ers",
"Latn",
type = "reconstructed",
}
m["sit-hrs-pro"] = {
"Proto-Hrusish",
116773762,
"sit-hrs",
"Latn",
type = "reconstructed",
}
m["sit-jap"] = {
"Japhug",
3162245,
"sit-egy",
"Latn",
}
m["sit-kha-pro"] = {
"Proto-Kham",
116773773,
"sit-kha",
"Latn",
type = "reconstructed",
}
m["sit-khb-pro"] = {
"Proto-Kho-Bwa",
nil,
"sit-khb",
"Latn",
type = "reconstructed",
}
m["sit-khp-pro"] = {
"Proto-Puroik",
nil,
"sit-khb",
"Latn",
type = "reconstructed",
}
m["sit-khw-pro"] = {
"Proto-Western Kho-Bwa",
nil,
"sit-khw",
"Latn",
type = "reconstructed",
}
m["sit-kon-pro"] = {
"Proto-Northern Naga",
nil,
"sit-kon",
"Latn",
type = "reconstructed",
}
m["sit-liz"] = {
"Lizu",
6660653,
"sit-ers",
"Latn", -- and Ersu Shaba
}
m["sit-lnj"] = {
"Longjia",
17096251,
"sit-cln",
"Latn"
}
m["sit-lrn"] = {
"Luren",
16946370,
"sit-cln",
"Latn"
}
m["sit-luu-pro"] = {
"ลูอิชดั้งเดิม",
116773783,
"sit-luu",
"Latn",
type = "reconstructed",
}
m["sit-nas-pro"] = {
"Proto-Naish",
nil,
"sit-nas",
"Latn",
type = "reconstructed",
}
m["sit-prn"] = {
"Puiron",
7259048,
"sit-zem",
}
m["sit-pro"] = {
"ซีโน-ทิเบตันดั้งเดิม",
24839178,
"sit",
"Latn",
type = "reconstructed",
}
m["sit-sit"] = {
"Situ",
19840830,
"sit-egy",
"Latn",
}
m["sit-tam-pro"] = {
"Proto-Tamangic",
117469295,
"sit-tam",
"Latn",
type = "reconstructed",
}
m["sit-tan-pro"] = {
"Proto-Tani",
116773284,
"sit-tan",
"Latn", -- needs verification
type = "reconstructed",
}
m["sit-tgm"] = {
"ตางัม",
17041370,
"sit-tan",
"Latn",
}
m["sit-tng-pro"] = {
"Proto-Tangkhulic",
nil,
"sit-tng",
"Latn",
type = "reconstructed",
}
m["sit-tos"] = {
"Tosu",
7827899,
"sit-ers",
"Latn", -- also Ersu Shaba
}
m["sit-tsh"] = {
"Tshobdun",
19840950,
"sit-egy",
"Latn",
}
m["sit-zbu"] = {
"Zbu",
19841106,
"sit-egy",
"Latn",
}
m["sla-pro"] = {
"สลาวิกดั้งเดิม",
747537,
"sla",
"Latn",
type = "reconstructed",
strip_diacritics = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {'ś'},
},
sort_key = {
from = {"č", "ď", "ě", "ę", "ь", "ľ", "ň", "ǫ", "ř", "š", "ś", "ť", "ъ", "ž"},
to = {"c²", "d²", "e²", "e³", "i²", "l²", "nj", "o²", "r²", "s²", "s³", "t²", "u²", "z²"},
},
}
m["smi-pro"] = {
"ซามิกดั้งเดิม",
7251862,
"smi",
"Latn",
type = "reconstructed",
sort_key = {
from = {"ā", "č", "δ", "[ëē]", "ŋ", "ń", "ō", "š", "θ", "%([^()]+%)"},
to = {"a", "c²", "d", "e", "n²", "n³", "o", "s²", "t²"},
},
}
m["son-pro"] = {
"Proto-Songhay",
116773277,
"son",
"Latn",
type = "reconstructed",
}
m["sqj-pro"] = {
"แอลเบเนียนดั้งเดิม",
18210846,
"sqj",
"Latn",
type = "reconstructed",
}
m["ssa-klk-pro"] = {
"Proto-Kuliak",
116773779,
"ssa-klk",
"Latn",
type = "reconstructed",
}
m["ssa-kom-pro"] = {
"Proto-Koman",
116773775,
"ssa-kom",
"Latn",
type = "reconstructed",
}
m["ssa-pro"] = {
"Proto-Nilo-Saharan",
116773236,
"ssa",
"Latn",
type = "reconstructed",
}
m["syd-pro"] = {
"Proto-Samoyedic",
7251863,
"syd",
"Latn",
type = "reconstructed",
}
m["tai-pro"] = {
"ไทดั้งเดิม",
6583709,
"tai",
"Latn",
type = "reconstructed",
}
m["tai-swe-pro"] = {
"ไทตะวันตกเฉียงใต้ดั้งเดิม",
116773280,
"tai-swe",
"Latn",
type = "reconstructed",
}
m["tbq-bdg-pro"] = {
"โบโด-กาโรดั้งเดิม",
116773195,
"tbq-bdg",
"Latn",
type = "reconstructed",
}
m["tbq-blg"] = {
"Bailang",
2879843,
"tbq-lob",
"Hani",
sort_key = "Hani-sortkey",
}
m["tbq-brm-pro"] = {
"Proto-Burmish",
nil,
"tbq-brm",
"Latn",
type = "reconstructed",
}
m["tbq-gkh"] = {
"Gokhy",
5578069,
"tbq-sil",
"Latn",
}
m["tbq-kuk-pro"] = {
"Proto-Kuki-Chin",
116773220,
"tbq-kuk",
"Latn",
type = "reconstructed",
}
m["tbq-lal-pro"] = {
"Proto-Lalo",
116773781,
"tbq-lal",
"Latn",
type = "reconstructed",
}
m["tbq-laz"] = {
"Laze",
17007626,
"sit-nas",
"Latn",
}
m["tbq-lob-pro"] = {
"โลโล-เบอร์มีซดั้งเดิม",
116773224,
"tbq-lob",
"Latn",
type = "reconstructed",
}
m["tbq-lol-pro"] = {
"โลโลอิชดั้งเดิม",
7251855,
"tbq-lol",
"Latn",
type = "reconstructed",
}
m["tbq-mil"] = {
"Milang",
6850761,
"sit-gsi",
"Deva, Latn",
translit = {
Deva = "Deva-translit",
},
}
m["tbq-mor"] = {
"Moran",
6909216,
"tbq-bdg",
"Latn",
}
m["tbq-ngo"] = {
"Ngochang",
56582,
"tbq-brm",
"Latn",
}
-- tbq-pro is now etymology-only
m["trk-dkh"] = {
"Dukhan",
12809273,
"trk-ssb",
"Latn, Cyrl, Mong",
-- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]]
}
-- As described in Mahmud al-Kashgari's 11th century ''Dīwān Lughāt al-Turk''.
m["trk-eog"] = {
"Early Old Oghuz",
nil,
"trk-ogz",
"ota-Arab",
strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"},
}
m["trk-oat"] = {
"ตุรกีแบบอานาโตเลียเก่า",
7083390,
"trk-ogz",
"ota-Arab",
strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"},
ancestors = "trk-eog",
}
m["trk-pro"] = {
"เตอร์กิกดั้งเดิม",
3657773,
"trk",
"Latn",
type = "reconstructed",
standard_chars = {
Latn = " ()-abdegiklmnoprstuxyzïöüāčēīĺŋōŕšūǖȫẹ" .. c.macron,
},
}
m["tup-gua-pro"] = {
"ตูปี-กัวรานีดั้งเดิม",
116773288,
"tup-gua",
"Latn",
type = "reconstructed",
}
m["tup-kab"] = {
"Kabishiana",
15302988,
"tup",
"Latn",
}
m["tup-pro"] = {
"ตูเปียนดั้งเดิม",
10354700,
"tup",
"Latn",
type = "reconstructed",
}
m["tuw-alk"] = {
"Alchuka",
113553616,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-bal"] = {
"Bala",
86730632,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-kkl"] = {
"Kyakala",
118875708,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-kli"] = {
"Kili",
6406892,
"tuw-ewe",
"Cyrl",
}
m["tuw-pro"] = {
"Proto-Tungusic",
85872335,
"tuw",
"Latn",
type = "reconstructed",
}
m["tuw-sol"] = {
"Solon",
30004,
"tuw-ewe",
}
m["urj-fin-pro"] = {
"ฟินนิกดั้งเดิม",
11883720,
"urj-fin",
"Latn",
type = "reconstructed",
}
m["urj-koo"] = {
"Old Komi",
86679962,
"kv",
"Perm, Cyrs",
translit = "urj-koo-translit",
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]; previously, Cyrs strip_diacritics not present
}
m["urj-kuk"] = {
"Kukkuzi",
107410460,
"urj-fin",
"Latn",
ancestors = "vot",
}
m["urj-kya"] = {
"Komi-Yazva",
2365210,
"kv",
"Cyrl",
translit = "kv-translit",
override_translit = true,
strip_diacritics = {remove_diacritics = c.acute},
}
m["urj-mdv-pro"] = {
"Proto-Mordvinic",
116773232,
"urj-mdv",
"Latn",
type = "reconstructed",
}
m["urj-prm-pro"] = {
"เปอร์มิกดั้งเดิม",
116773257,
"urj-prm",
"Latn",
type = "reconstructed",
}
m["urj-pro"] = {
"ยูราลิกดั้งเดิม",
288765,
"urj",
"Latn",
type = "reconstructed",
}
m["urj-ugr-pro"] = {
"ยูกริกดั้งเดิม",
156631,
"urj-ugr",
"Latn",
type = "reconstructed",
}
m["xnd-pro"] = {
"Proto-Na-Dene",
116773233,
"xnd",
"Latn",
type = "reconstructed",
}
m["xgn-pro"] = {
"มองโกลิกดั้งเดิม",
2493677,
"xgn",
"Latn",
type = "reconstructed",
sort_key = {
from = {"č", "i", "ï", "ǰ", "ŋ", "ö", "š", "ü"},
to = {"c", "i" .. p[1], "i", "j", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1]},
},
}
m["yok-bvy"] = {
"Buena Vista Yokuts",
4985474,
"yok",
"Latn",
}
m["yok-dly"] = {
"Delta Yokuts",
70923266,
"yok",
"Latn",
}
m["yok-gsy"] = {
"Gashowu Yokuts",
3098708,
"yok",
"Latn",
}
m["yok-kry"] = {
"Kings River Yokuts",
6413014,
"yok",
"Latn",
}
m["yok-nvy"] = {
"Northern Valley Yokuts",
85789777,
"yok",
"Latn",
}
m["yok-ply"] = {
"Palewyami Yokuts",
2387391,
"yok",
"Latn",
}
m["yok-svy"] = {
"Southern Valley Yokuts",
12642473,
"yok",
"Latn",
}
m["yok-tky"] = {
"Tule-Kaweah Yokuts",
7851988,
"yok",
"Latn",
}
m["ypk-pro"] = {
"Proto-Yupik",
116773295,
"ypk",
"Latn",
type = "reconstructed",
}
m["yrk-for"] = {
"Forest Nenets",
1295107,
"yrk",
"Cyrl",
translit = "yrk-for-translit",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. c.dotabove},
}
m["yrk-tun"] = {
"Tundra Nenets",
36452,
"yrk",
"Cyrl",
strip_diacritics = {
from = {"ӑ", "а̄", "э̇", "ӣ", "ы̄", "ӯ", "ю̄", "я̆", "я̄"},
to = {"а", "а", "э", "и", "ы", "у", "ю", "я", "я"},
},
translit = "yrk-tun-translit",
}
m["zhx-min-pro"] = {
"หมิ่นดั้งเดิม",
19646347,
"zhx-min",
"Latn",
type = "reconstructed",
}
m["zhx-sht"] = {
"Shaozhou Tuhua",
1920769,
"zhx",
"Nshu, Hants",
generate_forms = "zh-generateforms",
sort_key = {Hani = "Hani-sortkey"},
}
m["zhx-sic"] = {
"เสฉวน",
2278732,
"zhx-man",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["zhx-tai"] = {
"ห่อยซัน",
2208940,
"zhx-yue",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["zle-ono"] = {
"Old Novgorodian",
162013,
"zle",
"Cyrs, Glag",
translit = {Cyrs = "Cyrs-translit", Glag = "Glag-translit"},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["zle-ort"] = {
"รูซินเก่า",
13211,
"zle",
"Arab, Cyrs, Latn",
ancestors = "orv",
translit = {
Cyrs = "zle-ort-translit",
Arab = "zle-ort-Arab-translit",
},
strip_diacritics = {
Cyrs = {
remove_diacritics = m_langdata.chars_substitutions["Cyrs_remove_diacritics"],
remove_exceptions = {"Ї", "ї"},
},
Arab = "ar-stripdiacritics",
},
-- Cyrs sort_key in [[Module:scripts/data]]
}
m["zls-chs"] = {
"Church Slavonic",
33251,
"zls",
"Cyrs, Glag, Latn",
ancestors = "cu",
translit = {
Cyrs = "Cyrs-translit",
Glag = "Glag-translit",
},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["zlw-ocs"] = {
"เช็กเก่า",
593096,
"zlw",
"Latn",
}
m["zlw-opl"] = {
"โปแลนด์เก่า",
149838,
"zlw-lch",
"Latn",
strip_diacritics = {remove_diacritics = c.ringabove},
}
m["zlw-osk"] = {
"สโลวักเก่า",
12776676,
"zlw",
"Latn",
}
m["zlw-slv"] = {
"สโลวินช์",
36822,
"zlw-pom",
"Latn",
strip_diacritics = {remove_diacritics = c.macron .. c.breve},
}
return require("Module:languages").finalizeData(m, "language")
c7n6allndmel74lnv21o7pg4zdh1o88
มอดูล:languages/data/3/w
828
36364
5720768
5684169
2026-04-21T07:01:19Z
OctraBot
3198
Bot: automated text replacement (-standardChars +standard_chars)
5720768
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["waa"] = {
"Walla Walla",
12953960,
"nai-shp",
"Latn",
ancestors = "nai-spt",
}
m["wab"] = {
"Wab",
11222271,
"poz-ocw",
"Latn",
}
m["wac"] = {
"Wasco-Wishram",
12645081,
"nai-ckn",
"Latn",
}
m["wad"] = {
"Wandamen",
2806128,
"poz-hce",
"Latn",
}
m["waf"] = {
"Wakoná",
7961205,
}
m["wag"] = {
"Wa'ema",
12953264,
"poz-ocw",
"Latn",
}
m["wah"] = {
"Watubela",
7975070,
"poz-cma",
"Latn",
}
m["waj"] = {
"Waffa",
3565058,
"ngf-kag",
"Latn",
}
m["wal"] = {
"Wolaytta",
36943,
"omv-nom",
"Latn, Ethi",
}
m["wam"] = {
"Massachusett",
56519,
"alg-eas",
"Latn",
}
m["wan"] = {
"Wan",
3913272,
"dmn-nbe",
}
m["wao"] = {
"Wappo",
56530,
"nai-ykn",
"Latn",
}
m["wap"] = {
"Wapishana",
3450493,
"awd",
"Latn",
}
m["waq"] = {
"Wageman",
3436843,
"aus-gun",
"Latn",
}
m["war"] = {
"วาไร",
34279,
"phi",
"Latn",
strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {
Latn = "tl-sortkey",
},
}
m["was"] = {
"Washo",
34198,
"qfa-iso",
"Latn",
}
m["wat"] = {
"Kaninuwa",
12952565,
"poz-ocw",
"Latn",
}
m["wau"] = {
"Wauja",
3450522,
"awd",
"Latn",
}
m["wav"] = {
"Waka",
3913394,
"alv-mye",
}
m["waw"] = {
"Waiwai",
56632,
"sai-prk",
"Latn",
}
m["wax"] = {
"Watam",
3566597,
"paa-ram",
"Latn",
}
m["way"] = {
"Wayana",
5908753,
"sai-gui",
"Latn",
}
m["waz"] = {
"Wampur",
7966957,
"poz-ocw",
"Latn",
}
m["wba"] = {
"วาราโอ",
36946,
"qfa-iso",
"Latn",
}
m["wbb"] = {
"Wabo",
7958701,
"poz-hce",
"Latn",
}
m["wbe"] = {
"Waritai",
7969453,
"paa-lkp",
"Latn",
}
m["wbf"] = {
"Wara",
3914052,
"alv-wan",
}
m["wbh"] = {
"Wanda",
7967153,
"bnt-mwi",
}
m["wbi"] = {
"Wanji",
3376818,
"bnt-bki",
"Latn",
}
m["wbj"] = {
"Alagwa",
56621,
"cus-sou",
"Latn",
}
m["wbk"] = {
"Waigali",
34196,
"nur-sou",
"Latn",
}
m["wbl"] = {
"Wakhi",
34208,
"xsc-skw",
"Cyrl, Latn, Arab",
translit = {Cyrl = "tg-translit"},
}
m["wbm"] = {
"Wa",
12644869,
"mkh-pal",
}
m["wbp"] = {
"Warlpiri",
1639998,
"aus-pam",
"Latn",
}
m["wbq"] = {
"Waddar",
6708569,
"dra-tel",
}
m["wbr"] = {
"Wagdi",
7959490,
"inc-bhi",
"Deva",
}
m["wbt"] = {
"Wanman",
7967989,
}
m["wbv"] = {
"Wajarri",
3913856,
"aus-psw",
"Latn",
}
m["wbw"] = {
"Woi",
8029092,
"poz-hce",
"Latn",
}
m["wca"] = {
"Yanomam",
7960056,
"sai-ynm",
"Latn",
}
m["wci"] = {
"Waci Gbe",
36987,
"alv-gbe",
}
m["wdd"] = {
"Wandji",
36976,
"bnt-nze",
}
m["wdg"] = {
"Wadaginam",
7958930,
"ngf-sad",
"Latn",
}
m["wdj"] = {
"Wadjiginy",
7959489,
}
m["wdt"] = {
"Wendat",
3567223,
"iro-nor",
"Latn",
ancestors = "iro-ohu",
}
m["wdu"] = {
"Wadjigu",
10719025,
}
m["wdy"] = {
"Wadjabangayi",
63313681,
}
m["wea"] = {
"Wewaw",
15895870,
}
m["wec"] = {
"Wè Western",
11159067,
"kro-wee",
}
m["wed"] = {
"Wedau",
12953294,
"poz-ocw",
"Latn",
}
m["weh"] = {
"Weh",
7979690,
"nic-rnw",
}
m["wei"] = {
"Kiunum",
7983230,
"paa-ani",
"Latn",
}
m["wem"] = {
"Weme Gbe",
18379970,
"alv-gbe",
}
m["weo"] = {
"Wemale",
7982165,
"poz-cma",
}
m["wer"] = {
"Weri",
11732752,
"paa-kun",
"Latn",
}
m["wes"] = {
"Cameroon Pidgin",
35541,
"crp",
"Latn",
ancestors = "en",
}
m["wet"] = {
"Perai",
12953035,
"poz-tim",
}
m["weu"] = {
"Welaung",
7980503,
"tbq-kuk",
}
m["wew"] = {
"Weyewa",
4314526,
"poz-cet",
"Latn",
}
m["wfg"] = {
"Yafi",
8074520,
"paa-pau",
}
m["wga"] = {
"Wagaya",
7959487,
"aus-pam",
}
m["wgb"] = {
"Wagawaga",
7959485,
}
m["wgg"] = {
"Wangganguru",
7967859,
"aus-kar",
"Latn",
}
m["wgi"] = {
"Wahgi", -- not to be confused with North Wahgi
3565122,
"ngf-chw",
"Latn",
}
m["wgo"] = {
"Waigeo",
7959937,
"poz-hce",
"Latn",
}
m["wgu"] = {
"Wirangu",
2092286,
"aus-pam",
"Latn",
}
m["wgy"] = {
"Warrgamay",
3915942,
"aus-pam",
"Latn",
}
m["wha"] = {
"Manusela",
3287127,
"poz-cma",
"Latn",
}
m["whg"] = {
"North Wahgi",
12953273,
"ngf-chw",
"Latn",
}
m["whk"] = {
"Wahau Kenyah",
7959737,
"poz-swa",
"Latn",
}
m["whu"] = {
"Wahau Kayan",
12473397,
}
m["wib"] = {
"Southern Toussian",
11158982,
"alv-sav",
}
m["wic"] = {
"Wichita",
56513,
"cdd",
"Latn",
}
m["wie"] = {
"Wik-Epa",
10720035,
"aus-pmn",
}
m["wif"] = {
"Wik-Keyangan",
10720037,
"aus-pmn",
}
m["wig"] = {
"Wik-Ngathana",
3915695,
"aus-pmn",
}
m["wih"] = {
"Wik-Me'anha",
10720039,
"aus-pmn",
}
m["wii"] = {
"Minidien",
6865237,
"paa-tor",
"Latn",
}
m["wij"] = {
"Wik-Iiyanh",
10720036,
"aus-pmn",
}
m["wik"] = {
"Wikalkan",
7999800,
"aus-pmn",
}
m["wil"] = {
"Wilawila",
10720050,
"aus-wor",
}
m["wim"] = {
"Wik-Mungkan",
2092246,
"aus-pmn",
"Latn",
}
m["win"] = {
"วินเนอเบโก",
1957108,
"sio-msv",
"Latn",
}
m["wir"] = {
"Wiraféd",
12953970,
"tup-gua",
"Latn",
}
m["wiu"] = {
"Wiru",
8027044,
"qfa-dis", -- Papuan; isolate in Glottolog; grouped with Teberan by Usher (2020)
"Latn",
}
m["wiv"] = {
"Muduapa",
3121040,
"poz-ocw",
"Latn",
}
m["wiy"] = {
"Wiyot",
36937,
"aql",
"Latn",
}
m["wja"] = {
"Waja",
3914415,
"alv-wjk",
}
m["wji"] = {
"Warji",
3440381,
"cdc-wst",
"Latn",
}
m["wka"] = {
"Kw'adza",
3807652,
"cus-sou",
}
m["wkb"] = {
"Kumbaran",
16878146,
"dra-sdo",
}
m["wkd"] = {
"Mo",
7960881,
"poz-ocw",
"Latn",
}
m["wkl"] = {
"Kalanadi",
6350515,
"dra-mal",
}
m["wku"] = {
"Kunduvadi",
6444383,
"dra-mal",
}
m["wkw"] = {
"Wakawaka",
10719110,
"aus-pam",
}
m["wky"] = {
"Wangkayutyuru",
33060533,
"aus-kar",
}
m["wla"] = {
"Walio",
7961958,
"paa-wal",
"Latn",
}
m["wlc"] = {
"Mwali Comorian",
3319155,
"bnt-com",
"Latn",
sort_key = "bnt-com-sortkey",
}
m["wle"] = {
"Wolane",
12645275,
"sem-eth",
"Ethi",
}
m["wlg"] = {
"Kunbarlang",
5618523,
"aus-gun",
"Latn",
}
m["wli"] = {
"Waioli",
7960241,
"paa-nha",
"Latn",
}
m["wlk"] = {
"Wailaki",
20832,
"ath-pco",
"Latn",
}
m["wll"] = {
"Wali (Sudan)",
30597440,
"nub-hil",
}
m["wlm"] = {
"เวลส์กลาง",
2487263,
"cel-brw",
"Latn",
ancestors = "owl",
strip_diacritics = {
from = {"Ð", "ð"},
to = {"D", "d"},
},
sort_key = "wlm-sortkey",
}
m["wlo"] = {
"Wolio",
1185114,
"poz-wot",
"Latn, Arab",
}
m["wlr"] = {
"Wailapa",
7960062,
"poz-vnn",
"Latn",
}
m["wls"] = {
"Wallisian",
36979,
"poz-pnp",
"Latn",
}
m["wlu"] = {
"Wuliwuli",
8039208,
}
m["wlv"] = {
"Wichí Lhamtés Vejoz",
13526867,
"sai-wic",
"Latn",
}
m["wlw"] = {
"Walak",
7961258,
"ngf-dan",
"Latn",
}
m["wlx"] = {
"Wali (Ghana)",
36895,
"nic-mre",
"Latn",
}
m["wly"] = {
"Waling",
7961957,
"sit-kic",
ancestors = "bap",
}
m["wmb"] = {
"Wambaya",
2083197,
"aus-mir",
"Latn",
}
m["wmc"] = {
"Wamas",
7966909,
"ngf-nwh",
"Latn",
}
m["wmd"] = {
"Mamaindé",
3284890,
"sai-nmk",
"Latn",
}
m["wme"] = {
"Wambule",
56785,
"sit-kiw",
"Latn",
}
m["wmh"] = {
"Waima'a",
7960132,
"poz-tim",
"Latn",
}
m["wmi"] = {
"Wamin",
7966934,
}
m["wmm"] = {
"Maiwa (Indonesia)",
6737226,
"poz",
"Latn",
}
m["wmn"] = {
"Waamwang",
7958575,
"poz-cln",
"Latn",
}
m["wmo"] = {
"Wam",
8030620,
"paa-tor",
"Latn",
}
m["wms"] = {
"Wambon",
7966922,
"ngf-gaw",
"Latn",
}
m["wmt"] = {
"Walmajarri",
2232696,
"aus-pam",
"Latn",
}
m["wmw"] = {
"มวานี",
3042206,
"bnt-swh",
"Latn",
}
m["wmx"] = {
"Womo",
8031646,
"paa-msk",
"Latn",
}
m["wnb"] = {
"Wanambre",
7967057,
"ngf-tib",
"Latn",
}
m["wnc"] = {
"Wantoat",
7968184,
"ngf-fin",
"Latn",
}
m["wnd"] = {
"Wandarang",
3913767,
"aus-arn",
"Latn",
}
m["wne"] = {
"Waneci",
7967334,
"ira-pat",
"ps-Arab",
}
m["wng"] = {
"Wanggom",
11732736,
"ngf-gaw",
"Latn",
}
m["wni"] = {
"Ndzwani Comorian",
2850262,
"bnt-com",
"Latn",
sort_key = "bnt-com-sortkey",
}
m["wnk"] = {
"Wanukaka",
2370136,
"poz",
"Latn",
}
m["wnm"] = {
"Wanggamala",
7967860,
"aus-kar",
"Latn",
}
m["wno"] = {
"Wano",
3566166,
"ngf-dan",
"Latn",
}
m["wnp"] = {
"Wanap",
7967060,
"paa-tor",
"Latn",
}
m["wnu"] = {
"Usan",
7901709,
"ngf-num",
"Latn",
}
m["wnw"] = {
"Wintu",
56754,
"nai-wtq",
"Latn",
}
m["wny"] = {
"Wanyi",
7968201,
"aus-gar",
"Latn",
}
m["woa"] = {
"Tyaraity",
10706951,
}
m["wob"] = {
"Wobé",
3915363,
"kro-wee",
}
m["woc"] = {
"Wogeo",
8029061,
"poz-ocw",
"Latn",
}
m["wod"] = {
"Wolani",
8029704,
"ngf-pan",
"Latn",
}
m["woe"] = {
"Woleaian",
34037,
"poz-mic",
"Latn, Wole",
}
m["wog"] = {
"Wogamusin",
56991,
"paa-spk",
"Latn",
}
m["woi"] = {
"Kamang",
8029096,
"paa-tap",
"Latn",
}
m["wok"] = {
"Longto",
35795,
"alv-dur",
"Latn",
}
m["wom"] = {
"Perema",
3913378,
"alv-lek",
"Latn",
}
m["won"] = {
"Wongo",
8032058,
"bnt-bsh",
"Latn",
}
m["woo"] = {
"Manombai",
6751253,
"poz",
"Latn",
}
m["wor"] = {
"Woria",
8034514,
"paa-egb",
"Latn",
}
m["wos"] = {
"Hanga Hundi",
6450232,
"paa-ndu",
"Latn",
}
m["wow"] = {
"Wawonii",
3566780,
"poz-btk",
"Latn",
}
m["woy"] = {
"Weyto",
3915918,
"qfa-unc", -- speculated to have been Agaw
}
m["wpc"] = {
"Wirö",
12953684,
nil,
"Latn",
}
m["wra"] = {
"Warapu",
56739,
"paa-msk",
"Latn",
}
m["wrb"] = {
"Warluwara",
3913761,
"aus-pam",
"Latn",
}
m["wrg"] = {
"Warungu",
7970854,
"aus-pam",
"Latn",
}
m["wrh"] = {
"Wiradjuri",
3913840,
"aus-cww",
"Latn",
}
m["wri"] = {
"Wariyangga",
10719289,
"aus-psw",
"Latn",
}
m["wrk"] = {
"Garawa",
2524022,
"aus-gar",
"Latn",
}
m["wrl"] = {
"Warlmanpa",
3913823,
"aus-pam",
}
m["wrm"] = {
"Warumungu",
1764544,
"aus-pam",
"Latn",
}
m["wrn"] = {
"Warnang",
36971,
"alv-hei",
}
m["wro"] = {
"Worora",
3504106,
"aus-wor",
"Latn",
}
m["wrp"] = {
"Waropen",
7969851,
"poz-hce",
"Latn",
}
m["wrr"] = {
"Wardaman",
3913842,
"aus-yng",
}
m["wrs"] = {
"Waris",
3502610,
"paa-brd",
"Latn",
}
m["wru"] = {
"Waru",
3566463,
}
m["wrv"] = {
"Waruna",
7971078,
"ngf-gsu",
"Latn",
}
m["wrw"] = {
"Gugu Warra",
5615286,
}
m["wrx"] = {
"Wae Rana",
7959375,
}
m["wrz"] = {
"Warray",
7969971,
"aus-gun",
}
m["wsa"] = {
"Warembori",
56459,
}
m["wsi"] = {
"Wusi",
8039349,
"poz-vnn",
"Latn",
}
m["wsk"] = {
"Waskia",
7972683,
"ngf-kow",
"Latn",
}
m["wsr"] = {
"Owenia",
7114727,
"ngf-kag",
"Latn",
}
-- "wss" Wasa is treated as "ak" Akan, see [[WT:LT]]
m["wsu"] = {
"Wasu",
7972892,
}
m["wsv"] = {
"Wotapuri-Katarqalai",
3877569,
"inc-koh",
}
m["wtf"] = {
"Watiwa",
35316,
"ngf-eva",
"Latn",
}
m["wth"] = {
"Wathaurong",
7974656,
"aus-pam",
"Latn",
}
m["wti"] = {
"Berta",
33178,
"qfa-iso", -- may be ssa
"Latn",
}
m["wtk"] = {
"Watakataui",
7972975,
"paa-spk",
"Latn",
}
m["wtm"] = {
"Mewati",
2605943,
"raj",
"Deva",
translit = "Deva-translit",
}
m["wtw"] = {
"Wotu",
12473488,
}
m["wua"] = {
"Wikngenchera",
10720045,
"aus-pmn",
}
m["wub"] = {
"Wunambal",
3913805,
"aus-wor",
}
m["wud"] = {
"Wudu",
36972,
"alv-gbe",
"Latn",
}
m["wuh"] = {
"Wutunhua",
1012917,
"qfa-mix",
"Latn",
ancestors = "cmn, bo, peh",
}
m["wul"] = {
"Silimo",
11732514,
"ngf-dan",
"Latn",
}
m["wum"] = {
"Wumbvu",
36891,
"bnt-kel",
"Latn",
}
m["wun"] = {
"Bungu",
4997686,
"bnt-mby",
"Latn",
}
m["wur"] = {
"Wurrugu",
8039305,
"aus-wdj",
}
m["wut"] = {
"Wutung",
56743,
"paa-msk",
"Latn",
}
m["wuu"] = {
"อู๋",
34290,
"zhx",
"Hants",
ancestors = "ltc",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["wuv"] = {
"Wuvulu-Aua",
3062746,
"poz-aay",
"Latn",
}
m["wux"] = {
"Wulna",
13591670,
}
m["wuy"] = {
"Wauyai",
12953295,
"poz-hce",
"Latn",
}
m["wwa"] = {
"Waama",
7958576,
"nic-eov",
"Latn",
}
m["wwo"] = {
"Dorig",
3037047,
"poz-vnn",
"Latn",
}
m["wwr"] = {
"Warrwa",
7970852,
}
m["www"] = {
"Wawa",
36889,
"nic-mmb",
"Latn",
}
m["wxa"] = {
"Waxiang",
2252191,
"zhx",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["wxw"] = {
"Wardandi",
61999705,
}
m["wya"] = {
"Wyandot",
1185119,
"iro-nor",
"Latn",
}
m["wyb"] = {
"Ngiyambaa",
3913825,
"aus-cww",
"Latn",
}
m["wyi"] = {
"Woiwurrung",
8029099,
"aus-pam",
"Latn",
}
m["wym"] = {
"วีลามอวิตแซ",
56485,
"gmw-hgm",
"Latn",
ancestors = "gmh",
strip_diacritics = {remove_diacritics = c.dotabove},
}
m["wyr"] = {
"Wayoró",
2875044,
"tup",
}
m["wyy"] = {
"Western Fijian",
3062751,
"poz-pcc",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
r4syvgbv90anygy45o3ulhprz6ijuvx
มอดูล:languages/data/3/v
828
36365
5720767
5683855
2026-04-21T07:01:18Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720767
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["vaa"] = {
"Vaagri Booli",
7907798,
"inc",
"Taml",
translit = {
Taml = "Taml-translit",
},
}
m["vae"] = {
"Vale",
3450194,
"csu-val",
}
m["vag"] = {
"Vagla",
36637,
"nic-gnw",
}
m["vah"] = {
"Varhadi",
155645,
"inc-sou",
"Deva, Modi",
ancestors = "omr",
translit = {
Deva = "Deva-translit",
Modi = "Modi-translit",
},
}
m["vai"] = {
"ไว",
36939,
"dmn-vak",
"Vaii",
translit = "Vaii-translit",
}
m["vaj"] = {
"Sekele",
56528,
}
m["val"] = {
"Vehes",
7918407,
}
m["vam"] = {
"Vanimo",
3327415,
"paa-msk",
"Latn",
}
m["van"] = {
"Valman",
7912479,
"paa-tor",
}
m["vao"] = {
"Vao",
2160405,
"poz-vnc",
"Latn",
}
m["vap"] = {
"Vaiphei",
56368,
"tbq-kuk",
}
m["var"] = {
"Huarijio",
10974017,
"azc-trc",
"Latn",
}
m["vas"] = {
"Vasavi",
765418,
}
m["vau"] = {
"Vanuma",
7915259,
"bnt-nya",
}
m["vav"] = {
"Varli",
7915983,
"inc-sou",
"Deva, Gujr",
translit = {
Deva = "Deva-translit",
Gujr = "Gujr-translit",
},
}
m["vay"] = {
"Vayu",
7917585,
"sit-kiw",
}
m["vbb"] = {
"Southeast Babar",
12952247,
"poz-tim",
}
m["vbk"] = {
"Southwestern Bontoc",
63313677,
"phi",
"Latn",
}
m["vec"] = {
"เวเนโต",
32724,
"roa-iwr",
"Latn, Armn",
-- Armn translit in [[Module:scripts/data]]
}
m["ved"] = {
"Veddah",
2567934,
"crp",
"Sinh",
}
m["vem"] = {
"Vemgo-Mabas",
56268,
}
m["veo"] = {
"Ventureño",
56712,
"nai-chu",
"Latn",
}
m["vep"] = {
"เวปส์",
32747,
"urj-fin",
"Latn",
display_text = {
from = {"'"},
to = {"ʹ"}
},
strip_diacritics = {
from = {"'"},
to = {"ʹ"}
},
sort_key = {
from = {
"č", "š", "ž", "ü", "ä", "ö", -- 2 chars
"z", "ʹ" -- 1 char
},
to = {
"c" .. p[1], "s" .. p[1], "s" .. p[3], "y" .. p[1], "y" .. p[2], "y" .. p[3],
"s" .. p[2], "y" .. p[4],
}
},
}
m["ver"] = {
"Mom Jango",
35862,
"alv-dur",
}
m["vgr"] = {
"Vaghri",
7908480,
"inc-bhi",
"Gujr",
translit = "Gujr-translit",
}
m["vgt"] = {
"Flemish Sign Language",
2107617,
"sgn",
}
m["vic"] = {
"Virgin Islands Creole",
7933935,
"crp",
"Latn",
ancestors = "en",
}
m["vid"] = {
"Vidunda",
7928151,
"bnt-ruv",
}
m["vif"] = {
"Vili",
3558409,
"bnt-kng",
"Latn",
}
m["vig"] = {
"Viemo",
36912,
"alv-sav",
}
m["vil"] = {
"Vilela",
3409297,
}
m["vis"] = {
"Vishavan",
14916908,
"dra-mal",
}
m["vit"] = {
"Viti",
11011055,
"nic-grf",
}
m["viv"] = {
"Iduna",
5989839,
"poz-ocw",
"Latn",
}
m["vjk"] = {
"Bajjika",
3636001,
"inc-bih",
"Deva, Kthi",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
},
}
m["vka"] = {
"Kariyarra",
13586632,
"aus-nga",
"Latn",
}
m["vki"] = {
"Ija-Zuba",
11011389,
"nic-pls",
ancestors = "uji",
}
m["vkj"] = {
"Kujarge",
33448,
"qfa-unc", -- Chadic, Cushitic or an isolate; still living but only 200 words known
}
m["vkk"] = {
"Kaur",
6378867,
"poz-mly", -- per Wikipedia
}
m["vkl"] = {
"Kulisusu",
3200326,
"poz-btk",
"Latn",
}
m["vkm"] = {
"Kamakan",
3192316,
"sai-mje",
"Latn",
}
m["vko"] = {
"Kodeoha",
3198209,
"poz-btk", -- per Wikipedia
}
m["vkp"] = {
"Korlai Creole Portuguese",
3915520,
"crp",
"Latn",
ancestors = "idb",
}
m["vkt"] = {
"Tenggarong Kutai Malay",
12683226,
"poz-mly", -- per Wikipedia
}
m["vku"] = {
"Kurrama",
3915684,
"aus-nga",
"Latn",
}
m["vlp"] = {
"Valpei",
7912582,
"poz-vnn",
"Latn",
}
m["vls"] = {
"West Flemish",
100103,
"gmw-frk",
"Latn",
ancestors = "dum",
}
m["vma"] = {
"Martuthunira",
975399,
"aus-nga",
"Latn",
}
m["vmb"] = {
"Mbabaram",
3303475,
"aus-pam",
"Latn",
}
m["vmc"] = {
"Juxtlahuaca Mixtec",
25559582,
"omq-mxt",
"Latn",
}
m["vmd"] = {
"Mudu Koraga",
12952656,
"dra-kor",
"Knda",
-- Knda translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["vme"] = {
"East Masela",
18487451,
"poz-tim",
"Latn",
}
m["vmf"] = {
"East Franconian",
497345,
"gmw-hgm",
"Latn",
ancestors = "gmh",
sort_key = "vmf-sortkey",
}
m["vmg"] = {
"Minigir",
17053237,
"poz-ocw",
"Latn",
}
m["vmh"] = {
"Maraghei",
36220,
"xme-ttc",
ancestors = "xme-ttc-eas",
}
m["vmi"] = {
"Miwa",
10586712,
"aus-wor",
}
m["vmj"] = {
"Ixtayutla Mixtec",
6101163,
"omq-mxt",
"Latn",
}
m["vmk"] = {
"Makhuwa-Shirima",
2963909,
"bnt-mak",
"Latn",
ancestors = "vmw",
}
m["vml"] = {
"Malgana",
6743201,
"aus-psw",
"Latn",
}
m["vmm"] = {
"Mitlatongo Mixtec",
6881813,
"omq-mxt",
"Latn",
}
m["vmp"] = {
"Soyaltepec Mazatec",
7572000,
"omq-maz", -- per Wikipedia
"Latn",
}
m["vmq"] = {
"Soyaltepec Mixtec",
7572001,
"omq-mxt",
"Latn",
}
m["vmr"] = {
"Marenje",
11128833,
"bnt-mak",
ancestors = "vmw",
}
-- vms "Moskela" is extinct and unattested; see Wikipedia
m["vmu"] = {
"Muluridyi",
10590149,
"aus-pam", -- Yalanjic but we don't have that family
}
m["vmv"] = {
"Valley Maidu",
5096458,
"nai-mdu",
"Latn",
}
m["vmw"] = {
"Makhuwa",
33882,
"bnt-mak",
"Latn",
strip_diacritics = { remove_diacritics = c.acute },
}
m["vmx"] = {
"Tamazola Mixtec",
12953734,
"omq-mxt",
"Latn",
}
m["vmy"] = {
"Ayautla Mazatec",
14916912,
"omq-maz",
"Latn",
}
m["vmz"] = {
"Mazatlán Mazatec",
12953706,
"omq-maz",
"Latn",
}
m["vnk"] = {
"Lovono",
3211090,
"poz-tem",
"Latn",
}
m["vnm"] = {
"Neve'ei",
2157431,
"poz-vnc",
"Latn",
}
m["vnp"] = {
"Vunapu",
7943647,
"poz-vnn",
"Latn",
}
m["vor"] = {
"Voro",
3914407,
"alv-yun",
"Latn",
}
m["vot"] = {
"โวต",
32858,
"urj-fin",
"Latn",
display_text = {
from = {"'"},
to = {"ʹ"}
},
strip_diacritics = {
from = {"'"},
to = {"ʹ"}
},
sort_key = {
from = {
"tš",
"š", "ž", "õ", "ä", "ö", "ü",
"z",
"ʹ"
},
to = {
"t" .. p[1],
"s" .. p[1], "s" .. p[3], "w" .. p[1], "w" .. p[2], "w" .. p[3], "w" .. p[4],
"s" .. p[2],
""
}
},
standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsŠšZzŽžTtUuVvÕõÄäÖöÜüʹ" .. c.punc,
}
m["vra"] = {
"Vera'a",
3555689,
"poz-vnn",
"Latn",
}
m["vro"] = {
"เวอโร",
32762,
"urj-fin",
"Latn",
wikimedia_codes = "fiu-vro",
strip_diacritics = { remove_diacritics = c.caronbelow },
sort_key = {
from = {
"ǵ", "ḿ", "ń", "ṕ", "ŕ", "ś", "v́"
},
to = {
"g'", "m'", "n'", "p'", "r'", "s'", "v'"
}
},
}
m["vrs"] = {
"Varisi",
3554807,
"poz-ocw",
}
m["vrt"] = {
"Burmbar",
2928522,
"poz-vnc",
"Latn",
}
m["vsi"] = {
"Moldova Sign Language",
12953478,
"sgn",
}
m["vsl"] = {
"Venezuelan Sign Language",
3322064,
"sgn",
}
m["vsv"] = {
"Valencian Sign Language",
32663,
"sgn",
}
m["vto"] = {
"Vitou",
7937210,
"paa-tkw",
}
m["vum"] = {
"Vumbu",
36629,
"bnt-sir",
}
m["vun"] = {
"Vunjo",
12953261,
"bnt-chg",
"Latn",
}
m["vut"] = {
"Vute",
36897,
"nic-mmb",
"Latn",
}
m["vwa"] = {
"Awa (China)",
2874642,
"mkh-pal",
"Latn, Mymr",
}
return require("Module:languages").finalizeData(m, "language")
4ysixpmc3nwqt77xaxchmknzwdl9m6r
มอดูล:languages/data/3/t
828
36367
5720766
5688007
2026-04-21T07:01:15Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720766
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["taa"] = {
"Lower Tanana",
28565,
"ath-nor",
"Latn",
}
m["tab"] = {
"ทาบาซารัน",
34079,
"cau-esm",
"Cyrl, Latn, Arab",
translit = {
Cyrl = "tab-translit",
},
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = "tab-sortkey",
}
}
m["tac"] = {
"Lowland Tarahumara",
15616384,
"azc-trc",
"Latn",
}
m["tad"] = {
"Tause",
2356440,
"paa-lkp",
"Latn",
}
m["tae"] = {
"Tariana",
732726,
"awd-nwk",
"Latn",
}
m["taf"] = {
"Tapirapé",
7684673,
"tup-gua",
"Latn",
}
m["tag"] = {
"Tagoi",
36537,
"nic-ras",
"Latn",
}
m["taj"] = {
"Eastern Tamang",
12953177,
"sit-tam",
"sit-tam-Tibt, Deva",
translit = {
Deva = "Deva-translit",
},
-- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake.
}
m["tak"] = {
"Tala",
3914494,
"cdc-wst",
"Latn",
}
m["tal"] = {
"Tal",
3440387,
"cdc-wst",
"Latn",
}
m["tan"] = {
"Tangale",
529921,
"cdc-wst",
"Latn",
}
m["tao"] = {
"ยามี",
715760,
"phi",
"Latn",
}
m["tap"] = {
"Taabwa",
7673650,
"bnt-sbi",
"Latn",
}
m["tar"] = {
"Central Tarahumara",
20090009,
"azc-trc",
"Latn",
sort_key = {remove_diacritics = c.acute .. "ꞌ"},
}
m["tas"] = {
"Tây Bồi",
2233794,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["tau"] = {
"Upper Tanana",
28281,
"ath-nor",
"Latn",
}
m["tav"] = {
"Tatuyo",
2524007,
"sai-tuc",
"Latn",
}
m["taw"] = {
"Tai",
7675861,
"ngf-kak",
"Latn",
}
m["tax"] = {
"Tamki",
3449082,
"cdc-est",
"Latn",
}
m["tay"] = {
"Atayal",
715766,
"map-ata",
"Latn",
}
m["taz"] = {
"Tocho",
36680,
"alv-tal",
"Latn",
}
m["tba"] = {
"Aikanã",
3409307,
"qfa-iso",
"Latn",
}
m["tbb"] = {
"Tapeba",
12953908,
}
m["tbc"] = {
"Takia",
3514336,
"poz-oce",
"Latn",
}
m["tbd"] = {
"Kaki Ae",
6349417,
"qfa-iso", -- isolate in Glottolog and Pawley and Hammarström (2018); tentatively in Eleman family by Ross (2005)
-- (and Usher?), but they don't address counterarguments of Clifton 1997
"Latn",
}
m["tbe"] = {
"Tanimbili",
3515188,
"poz-tem",
"Latn",
}
m["tbf"] = {
"Mandara",
3285424,
"poz-ocw",
"Latn",
}
m["tbg"] = {
"North Tairora",
20210398,
"ngf-kag",
"Latn",
}
m["tbh"] = {
"Thurawal",
3537135,
"aus-yuk",
}
m["tbi"] = {
"Gaam",
35338,
"sdv-eje",
"Latn",
}
m["tbj"] = {
"Tiang",
3528020,
"poz-ocw",
"Latn",
}
m["tbk"] = {
"Calamian Tagbanwa",
3915487,
"phi-kal",
"Tagb, Latn",
}
m["tbl"] = {
"ตโบลี",
7690594,
"phi",
"Latn",
}
m["tbm"] = {
"Tagbu",
7675188,
"nic-ser",
}
m["tbn"] = {
"Barro Negro Tunebo",
12953943,
"cba",
}
m["tbo"] = {
"Tawala",
7689206,
"poz-ocw",
"Latn",
}
m["tbp"] = {
"Taworta",
7689337,
"paa-lkp",
"Latn",
}
m["tbr"] = {
"Tumtum",
3407029,
"qfa-kad",
}
m["tbs"] = {
"Tanguat",
7683166,
"paa-ram",
"Latn",
}
m["tbt"] = {
"Kitembo",
13123561,
"bnt-shh",
"Latn",
}
m["tbu"] = {
"Tubar",
56730,
"azc-trc",
"Latn",
}
m["tbv"] = {
-- considered a dialect of Kulungtfu-Yuanggeng-Tobo [kgf] by Glottolog
"Tobo",
7811712,
"ngf-huo",
"Latn",
}
m["tbw"] = {
"Aborlan Tagbanwa",
3915475,
"phi",
"Latn",
}
m["tbx"] = {
"Kapin",
6366665,
"poz-ocw",
"Latn",
}
m["tby"] = {
"Tabaru",
11732670,
"paa-nha",
"Latn",
}
m["tbz"] = {
"Ditammari",
35186,
"nic-eov",
}
m["tca"] = {
"Ticuna",
1815205,
"sai-tyu",
"Latn",
}
m["tcb"] = {
"Tanacross",
28268,
"ath-nor",
"Latn",
}
m["tcc"] = {
"Datooga",
35327,
"sdv-nis",
"Latn",
}
m["tcd"] = {
"Tafi",
36545,
"alv-ktg",
}
m["tce"] = {
"Southern Tutchone",
31091048,
"ath-nor",
"Latn",
}
m["tcf"] = {
"Malinaltepec Tlapanec",
25559732,
"omq",
"Latn",
}
m["tcg"] = {
"Tamagario",
7680531,
"paa-kay",
"Latn",
}
m["tch"] = {
"Turks and Caicos Creole English",
7855478,
"crp",
"Latn",
ancestors = "en",
}
m["tci"] = {
"Wára",
20825638,
"paa-yam",
"Latn",
}
m["tck"] = {
"Tchitchege",
36595,
"bnt-tek",
}
m["tcl"] = {
"Taman (Myanmar)",
15616518,
"sit-jnp",
"Latn",
}
m["tcm"] = {
"Tanahmerah",
3514927,
"qfa-dis", -- Papuan; isolate per Glottolog and Palmer (2018), considered an independent branch of TNG by Usher
-- (2020); seems based only on some pronoun correspondences
"Latn",
}
m["tco"] = {
"Taungyo",
12953186,
"tbq-brm",
ancestors = "obr",
}
m["tcp"] = {
"Tawr Chin",
7689338,
"tbq-kuk",
}
m["tcq"] = {
"Kaiy",
6348709,
"paa-lkp",
"Latn",
}
m["tcs"] = {
"Torres Strait Creole",
36648,
"crp",
"Latn",
ancestors = "en",
}
m["tct"] = {
"T'en",
3442330,
"qfa-kms",
}
m["tcu"] = {
"Southeastern Tarahumara",
36807,
"azc-trc",
"Latn",
}
m["tcw"] = {
"Tecpatlán Totonac",
7692795,
"nai-ttn",
"Latn",
}
m["tcx"] = {
"โตทา",
34042,
"dra-tkt",
"Taml",
translit = {Taml = "Taml-translit"},
}
m["tcy"] = {
"ตูลู",
34251,
"dra-tlk",
"Tutg, Mlym, Knda", -- Mlym is nearer than Knda but both lack ɛ/ɛː.
translit = {
Tutg = "tcy-Tutg-translit",
},
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
}
m["tcz"] = {
"Thado Chin",
6583558,
"tbq-kuk",
}
m["tda"] = {
"Tagdal",
36570,
"son",
}
m["tdb"] = {
"Panchpargania",
21946879,
"inc-eas",
"Deva, as-Beng, Orya, Chis",
translit = {
Deva = "Deva-translit",
["as-Beng"] = "Beng-translit",
Orya = "Orya-translit",
},
ancestors = "bh",
}
m["tdc"] = {
"Emberá-Tadó",
3052041,
"sai-chc",
"Latn",
}
m["tdd"] = {
"ไทใต้คง",
36556,
"tai-swe",
"Tale",
translit = "Tale-translit",
strip_diacritics = {remove_diacritics = c.ZWNJ .. c.ZWJ},
}
m["tde"] = {
"Tiranige Diga Dogon",
5313387,
"nic-dgw",
}
m["tdf"] = {
"Talieng",
37525108,
"mkh-ban",
}
m["tdg"] = {
"Western Tamang",
12953178,
"sit-tam",
"sit-tam-Tibt, Deva",
translit = {
Deva = "Deva-translit",
},
-- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake.
}
m["tdh"] = {
"Thulung",
56553,
"sit-kiw",
}
m["tdi"] = {
"Tomadino",
7818197,
"poz-btk",
"Latn",
}
m["tdj"] = {
"Tajio",
7676870,
"poz",
"Latn",
}
m["tdk"] = {
"Tambas",
3440392,
"cdc-wst",
}
m["tdl"] = {
"Sur",
3914453,
"nic-tar",
}
m["tdm"] = {
"Taruma",
5559094,
}
m["tdn"] = {
"Tondano",
3531514,
"phi",
"Latn",
}
m["tdo"] = {
"Teme",
3913994,
"alv-mye",
}
m["tdq"] = {
"Tita",
3914899,
"nic-bco",
}
m["tdr"] = {
"Todrah",
7812881,
"mkh",
}
m["tds"] = {
"Doutai",
5302331,
"paa-lkp",
"Latn",
}
m["tdt"] = {
"Tetun Dili",
12643484,
"poz-tim",
"Latn",
}
m["tdu"] = {
"Tempasuk Dusun",
3529155,
"poz-san",
}
m["tdv"] = {
"Toro",
3438367,
"nic-alu",
"Latn",
}
m["tdy"] = {
"Tadyawan",
7674700,
"phi",
"Latn",
}
m["tea"] = {
"เตอเมียร์",
3914693,
"mkh-asl",
"Latn",
}
m["teb"] = {
"Tetete",
7706087,
"sai-tuc",
"Latn",
}
m["tec"] = {
"Terik",
3518379,
"sdv-nma",
}
m["ted"] = {
"Tepo Krumen",
11152243,
"kro-grb",
}
m["tee"] = {
"Huehuetla Tepehua",
56455,
"nai-ttn",
"Latn",
}
m["tef"] = {
"Teressa",
3518362,
"aav-nic",
}
m["teg"] = {
"Teke-Tege",
36478,
"bnt-tek",
}
m["teh"] = {
"Tehuelche",
33930,
"sai-cho",
"Latn",
}
m["tei"] = {
"Torricelli",
3450788,
"paa-tor",
"Latn",
}
m["tek"] = {
"Ibali Teke",
2802914,
"bnt-tek",
}
m["tem"] = {
"Temne",
36613,
"alv-mel",
"Latn",
}
m["ten"] = {
"Tama (Colombia)",
3832969,
"sai-tuc",
"Latn",
}
m["teo"] = {
"Ateso",
29474,
"sdv-ttu",
"Latn",
}
m["tep"] = {
"Tepecano",
3915525,
"azc-pim",
"Latn",
}
m["teq"] = {
"Temein",
7698064,
"sdv",
}
m["ter"] = {
"Tereno",
3314742,
"awd",
"Latn",
}
m["tes"] = {
"Tengger",
12473479,
"poz",
"Latn, Java",
}
m["tet"] = {
"เตตุน",
34125,
"poz-tim",
"Latn",
}
m["teu"] = {
"Soo",
3437607,
"ssa-klk",
}
m["tev"] = {
"Teor",
12953198,
"poz-cma",
}
m["tew"] = {
"Tewa",
56492,
"nai-kta",
"Latn",
}
m["tex"] = {
"Tennet",
56346,
"sdv",
}
m["tey"] = {
"Tulishi",
12911106,
"qfa-kad",
"Latn",
}
m["tez"] = {
"Tetserret",
7706841,
"ber",
"Latn",
}
m["tfi"] = {
"Tofin Gbe",
3530330,
"alv-pph",
}
m["tfn"] = {
"Dena'ina",
27785,
"ath-nor",
"Latn",
}
m["tfo"] = {
"Tefaro",
7694618,
"paa-egb",
"Latn",
}
m["tfr"] = {
"Teribe",
36533,
"cba",
"Latn",
}
m["tft"] = {
"เตอร์นาเต",
3518492,
"paa-nha",
"Latn, Arab",
}
m["tga"] = {
"Sagalla",
12953082,
"bnt-cht",
}
m["tgb"] = {
"Tobilung",
12953913,
"poz-san",
"Latn",
}
m["tgc"] = {
"Tigak",
3528276,
"poz-ocw",
"Latn",
}
m["tgd"] = {
"Ciwogai",
3438799,
"cdc-wst",
"Latn",
}
m["tge"] = {
"Eastern Gorkha Tamang",
12953175,
"sit-tam",
"sit-tam-Tibt, Deva",
translit = {
Deva = "Deva-translit",
},
-- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake.
}
m["tgf"] = {
"Chali",
3695197,
"sit-ebo",
"Tibt, Latn",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["tgh"] = {
"Tobagonian Creole English",
7811541,
"crp",
ancestors = "en",
}
m["tgi"] = {
"Lawunuia",
3219937,
"poz-ocw",
}
m["tgn"] = {
"ตันดากาโนน",
63311769,
"phi",
"Latn",
}
m["tgo"] = {
"Sudest",
7675351,
"poz-ocw",
}
m["tgp"] = {
"Tangoa",
2410276,
"poz-vnn",
"Latn",
}
m["tgq"] = {
"Tring",
7842360,
"poz-swa",
}
m["tgr"] = {
"Tareng",
25559541,
"mkh",
}
m["tgs"] = {
"Nume",
3346290,
"poz-vnn",
"Latn",
}
m["tgt"] = {
"Central Tagbanwa",
3915515,
"phi",
"Tagb",
}
m["tgu"] = {
"Tanggu",
7682930,
"paa-ram",
"Latn",
}
m["tgv"] = {
"Tingui-Boto",
7808195,
"sai-mje",
"Latn",
}
m["tgw"] = {
"Tagwana Senoufo",
36514,
"alv-tdj",
}
m["tgx"] = {
"Tagish",
28064,
"ath-nor",
"Latn",
}
m["tgy"] = {
"Togoyo",
36825,
"nic-ser",
}
m["thc"] = {
"Tai Hang Tong",
7675753,
"tai-nor",
}
m["thd"] = {
"Kuuk Thaayorre",
6448718,
"aus-pmn",
"Latn",
}
m["the"] = {
"Chitwania Tharu",
22083804,
"inc-tha",
"Deva",
}
m["thf"] = {
"Thangmi",
7710314,
"sit-new",
}
m["thh"] = {
"Northern Tarahumara",
15616395,
"azc-trc",
"Latn",
}
m["thi"] = {
"Tai Long",
25559562,
"tai-swe",
}
m["thk"] = {
"Tharaka",
15407179,
"bnt-kka",
}
m["thl"] = {
"Dangaura Tharu",
22083815,
"inc-tha",
"Deva",
}
m["thm"] = {
"ทะวืง",
34780,
"mkh-vie",
"Thai", -- Laoo is feasible but there is no evidence yet.
--sort_key = "Thai-sortkey",
}
m["thn"] = {
"Thachanadan",
7708880,
"dra-mal",
}
m["thp"] = {
"Thompson",
1755054,
"sal",
"Latn, Dupl",
}
m["thq"] = {
"Kochila Tharu",
22083826,
"inc-tha",
}
m["thr"] = {
"Rana Tharu",
12953920,
"inc-tha",
"Deva",
}
m["ths"] = {
"Thakali",
7709348,
"sit-tam",
}
m["tht"] = {
"Tahltan",
30125,
"ath-nor",
"Latn",
}
m["thu"] = {
"Thuri",
7799291,
"sdv-lon",
}
m["thy"] = {
"Tha",
3915849,
"alv-bwj",
}
m["tic"] = {
"Tira",
36677,
"alv-hei",
}
m["tif"] = {
"Tifal",
11732691,
"ngf-okk",
"Latn",
}
m["tig"] = {
"ทือเกร",
34129,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["tih"] = {
"Timugon Murut",
7807680,
"poz-san",
"Latn",
}
m["tii"] = {
"Tiene",
36469,
"bnt-tek",
}
m["tij"] = {
"Tilung",
7803037,
"sit-kiw",
}
m["tik"] = {
"Tikar",
36483,
"nic-bdn",
"Latn",
}
m["til"] = {
"Tillamook",
2109432,
"sal",
"Latn",
}
m["tim"] = {
"Timbe",
7804599,
"ngf-huo",
"Latn",
}
m["tin"] = {
"Tindi",
36860,
"cau-and",
"Cyrl",
display_text = s["cau-Cyrl-displaytext"],
strip_diacritics = s["cau-Cyrl-stripdiacritics"],
}
m["tio"] = {
"Teop",
3518239,
"poz-ocw",
"Latn",
}
m["tip"] = {
"Trimuris",
7842270,
"paa-tkw",
}
m["tiq"] = {
"Tiéfo",
3914874,
"alv-sav",
}
m["tis"] = {
"Masadiit Itneg",
18748769,
"phi",
}
m["tit"] = {
"Tinigua",
3029805,
}
m["tiu"] = {
"Adasen",
11214797,
"phi",
}
m["tiv"] = {
"Tiv",
34131,
"nic-tvc",
"Latn",
}
m["tiw"] = {
"Tiwi",
1656014,
"qfa-iso",
"Latn",
}
m["tix"] = {
"Southern Tiwa",
7570552,
"nai-kta",
"Latn",
}
m["tiy"] = {
"ตีรูไร",
7809425,
"phi",
"Latn",
}
m["tiz"] = {
"Tai Hongjin",
3915716,
"tai-swe",
}
m["tja"] = {
"Tajuasohn",
3915326,
"kro-wkr",
}
m["tjg"] = {
"Tunjung",
3542117,
"poz",
"Latn",
}
m["tji"] = {
"Northern Tujia",
12953229,
"sit-tja",
"Latn",
}
m["tjl"] = {
"ไทแหล่ง",
7675773,
"tai-swe",
"Mymr",
}
m["tjm"] = {
"Timucua",
638300,
"qfa-iso",
"Latn",
}
m["tjn"] = {
"Tonjon",
3913372,
"dmn-jje",
}
m["tjs"] = {
"Southern Tujia",
12633994,
"sit-tja",
"Latn",
}
m["tju"] = {
"Tjurruru",
3913834,
"aus-nga",
"Latn",
}
m["tjw"] = {
"Chaap Wuurong",
5285187,
"aus-pam",
"Latn",
}
m["tka"] = {
"Truká",
7847648,
}
m["tkb"] = {
"Buksa",
20983638,
"inc-eas",
"Deva",
}
m["tkd"] = {
"Tukudede",
36863,
"poz-tim",
"Latn",
}
m["tke"] = {
"Takwane",
11030092,
"bnt-mak",
"Latn",
ancestors = "vmw",
}
m["tkf"] = {
"Tukumanféd",
42330115,
"tup-gua",
"Latn",
}
m["tkl"] = {
"Tokelauan",
34097,
"poz-pnp",
"Latn",
}
m["tkm"] = {
"Takelma",
56710,
}
m["tkn"] = {
"โทกูโนชิมะ",
3530484,
"jpx-nry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["tkp"] = {
"Tikopia",
36682,
"poz-pnp",
"Latn",
}
m["tkq"] = {
"Tee",
3075144,
"nic-ogo",
"Latn",
}
m["tkr"] = {
"Tsakhur",
36853,
"cau-wsm",
"Cyrl, Latn, Arab",
translit = "tkr-translit",
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
}
m["tks"] = {
"Ramandi",
25261947,
"xme-ttc",
"Arab",
ancestors = "xme-ttc-sou",
}
m["tkt"] = {
"Kathoriya Tharu",
22083822,
"inc-tha",
}
m["tku"] = {
"Upper Necaxa Totonac",
56343,
"nai-ttn",
"Latn",
}
m["tkv"] = {
"Mur Pano",
16939373,
"poz-ocw",
"Latn",
}
m["tkw"] = {
"Teanu",
3516731,
"poz-tem",
"Latn",
}
m["tkx"] = {
"Tangko",
7682993,
"ngf-okk",
"Latn",
}
m["tkz"] = {
"Takua",
7678544,
"mkh",
}
m["tla"] = {
"Southwestern Tepehuan",
3518245,
"azc-pim",
"Latn",
}
m["tlb"] = {
"Tobelo",
1142333,
"paa-nha",
"Latn",
}
m["tlc"] = {
"Misantla Totonac",
56460,
"nai-ttn",
"Latn",
}
m["tld"] = {
"Talaud",
7678964,
"phi",
"Latn",
}
m["tlf"] = {
"Telefol",
7696150,
"ngf-okk",
"Latn",
}
m["tlg"] = {
"Tofanma",
4461493,
"paa-pau",
}
m["tlh"] = {
"Klingon",
10134,
"art",
"Latn",
type = "appendix-constructed",
}
m["tli"] = {
"Tlingit",
27792,
"xnd",
"Latn, Cyrl",
}
m["tlj"] = {
"Talinga-Bwisi",
7679530,
"bnt-haj",
}
m["tlk"] = {
"Taloki",
3514563,
"poz-btk",
}
m["tll"] = {
"Tetela",
2613465,
"bnt-tet",
}
m["tlm"] = {
"Tolomako",
3130514,
"poz-vnn",
"Latn",
}
m["tln"] = {
"Talondo'",
7680293,
"poz-ssw",
}
m["tlo"] = {
"Talodi",
36525,
"alv-tal",
}
m["tlp"] = {
"Filomena Mata-Coahuitlán Totonac",
5449202,
"nai-ttn",
"Latn",
}
m["tlq"] = {
"Tai Loi",
7675784,
"mkh-pal",
}
m["tlr"] = {
"Talise",
3514510,
"poz-sls",
"Latn",
}
m["tls"] = {
"Tambotalo",
7681065,
"poz-vnn",
"Latn",
}
m["tlt"] = {
"Teluti",
12953194,
"poz-cma",
}
m["tlu"] = {
"Tulehu",
7852006,
"poz-cma",
}
m["tlv"] = {
"Taliabu",
3514498,
"poz-cma",
"Latn",
}
m["tlx"] = {
"Khehek",
3196124,
"poz-aay",
}
m["tly"] = {
"Talysh",
34318,
"xme-ttc",
"Latn, Cyrl, fa-Arab",
}
m["tma"] = {
"Tama (Chad)",
57001,
"sdv-tmn",
}
m["tmb"] = {
"Avava",
2157461,
"poz-vnc",
"Latn",
}
m["tmc"] = {
"Tumak",
3121045,
"cdc-est",
}
m["tmd"] = {
"Haruai",
12632146,
"paa-pia",
"Latn",
}
m["tme"] = {
"Tremembé",
5246937,
}
m["tmf"] = {
"Toba-Maskoy",
3033544,
"sai-mas",
"Latn",
}
m["tmg"] = {
"Ternateño",
7232597,
}
m["tmh"] = {
"Tuareg",
34065,
"ber",
"Latn, Tfng, Arab",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ},
},
}
m["tmi"] = {
"Tutuba",
7857052,
"poz-vnn",
"Latn",
}
m["tmj"] = {
"Samarokena",
7408865,
"paa-tkw",
}
m["tmk"] = {
"Northwestern Tamang",
15616509,
"sit-tam",
"sit-tam-Tibt, Deva",
translit = {
Deva = "Deva-translit",
},
-- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake.
}
m["tml"] = {
"Tamnim Citak",
12643315,
"ngf-ask",
"Latn",
}
m["tmm"] = {
"Tai Thanh",
7675842,
"tai-swe",
}
m["tmn"] = {
"Taman (Indonesia)",
7680671,
"poz",
"Latn",
}
m["tmo"] = {
"Temoq",
7698205,
"mkh-asl",
}
m["tmq"] = {
"Tumleo",
7852641,
"poz-ocw",
}
m["tms"] = {
"Tima",
36684,
"nic-ktl",
}
m["tmt"] = {
"Tasmate",
7687571,
"poz-vnn",
"Latn",
}
m["tmu"] = {
"Iau",
56867,
"paa-lkp",
"Latn",
}
m["tmv"] = {
"Motembo",
11013108,
"bnt-bun",
}
m["tmy"] = {
"Tami",
3514812,
"poz-oce",
}
m["tmz"] = {
"Tamanaku",
3441435,
"sai-ven",
"Latn",
}
m["tna"] = {
"Tacana",
3182551,
"sai-tac",
"Latn",
}
m["tnb"] = {
"Western Tunebo",
3181238,
"cba",
}
m["tnc"] = {
"Tanimuca-Retuarã",
36535,
"sai-tuc",
"Latn",
}
m["tnd"] = {
"Angosturas Tunebo",
25559604,
"cba",
}
m["tne"] = {
"Tinoc Kallahan",
3192219,
}
m["tng"] = {
"Tobanga",
3440501,
"cdc-est",
}
m["tnh"] = {
"Maiani",
6735243,
"ngf-kau",
"Latn",
}
m["tni"] = {
"Tandia",
7682454,
"poz-hce",
"Latn",
}
m["tnk"] = {
"Kwamera",
3200806,
"poz-vns",
"Latn",
}
m["tnl"] = {
"Lenakel",
3229429,
"poz-vns",
"Latn",
}
m["tnm"] = {
"Tabla",
7673105,
"paa-sen",
"Latn",
}
m["tnn"] = {
"North Tanna",
957945,
"poz-vns",
"Latn",
}
m["tno"] = {
"Toromono",
510544,
"sai-tac",
"Latn",
}
m["tnp"] = {
"Whitesands",
3063761,
"poz-vns",
"Latn",
}
m["tnq"] = {
"ตาอีโน",
5232952,
"awd-taa",
"Latn",
}
m["tnr"] = {
"Bedik",
35096,
"alv-ten",
"Latn",
}
m["tns"] = {
"Tenis",
7699870,
"poz-stm",
"Latn",
}
m["tnt"] = {
"Tontemboan",
3531666,
"phi",
"Latn",
}
m["tnu"] = {
"Tay Khang",
6362363,
"tai",
}
m["tnv"] = {
"Tanchangya",
7682361,
"inc-bas",
"Cakm",
ancestors = "inc-obn",
}
m["tnw"] = {
"Tonsawang",
3531660,
"phi",
"Latn",
}
m["tnx"] = {
"Tanema",
2106984,
"poz-tem",
"Latn",
}
m["tny"] = {
"Tongwe",
7821200,
"bnt",
}
m["tnz"] = {
"Ten'edn",
3073453,
"mkh-asl",
"Latn",
}
m["tob"] = {
"Toba",
3113756,
"sai-guc",
"Latn",
}
m["toc"] = {
"Coyutla Totonac",
15615591,
"nai-ttn",
"Latn",
}
m["tod"] = {
"Toma",
11055484,
"dmn-msw",
"Latn, Loma",
}
m["tof"] = {
"Gizrra",
5565941,
"paa-etf",
"Latn",
}
m["tog"] = {
"Tonga (Malawi)",
3847648,
"bnt-nys",
"Latn",
}
m["toh"] = {
"Tonga (Mozambique)",
7820988,
"bnt-bso",
}
m["toi"] = {
"Tonga (Zambia)",
34101,
"bnt-bot",
"Latn",
}
m["toj"] = {
"Tojolabal",
36762,
"myn",
"Latn",
}
m["tok"] = {
"Toki Pona",
36846,
"art",
"Latn",
type = "appendix-constructed",
}
m["tol"] = {
"Tolowa",
20827,
"ath-pco",
"Latn",
}
m["tom"] = {
"Tombulu",
3531199,
"phi",
"Latn",
}
m["too"] = {
"Xicotepec de Juárez Totonac",
8044353,
"nai-ttn",
"Latn",
}
m["top"] = {
"Papantla Totonac",
56329,
"nai-ttn",
"Latn",
}
m["toq"] = {
"Toposa",
3033588,
"sdv-ttu",
}
m["tor"] = {
"Togbo-Vara Banda",
11002922,
"bad-cnt",
}
m["tos"] = {
"Highland Totonac",
13154149,
"nai-ttn",
"Latn",
}
m["tou"] = {
"Tho",
22694631,
"mkh-vie",
"Latn",
}
m["tov"] = {
"Upper Taromi",
12953183,
"xme-ttc",
ancestors = "xme-ttc-cen",
}
m["tow"] = {
"Jemez",
3912876,
"nai-kta",
"Latn",
}
m["tox"] = {
"Tobian",
34022,
"poz-mic",
}
m["toy"] = {
"Topoiyo",
7824977,
"poz-kal",
}
m["toz"] = {
"To",
7811216,
"alv-mbm",
}
m["tpa"] = {
"Taupota",
7688832,
"poz-ocw",
}
m["tpc"] = {
"Azoyú Me'phaa",
25559730,
"omq",
"Latn",
}
m["tpe"] = {
"Tippera",
16115423,
"tbq-bdg",
}
m["tpf"] = {
"Tarpia",
12953185,
"poz-ocw",
"Latn",
}
m["tpg"] = {
"Kula",
6442714,
"paa-tap",
"Latn",
}
m["tpi"] = {
"ตอกปีซิน",
34159,
"crp",
"Latn",
ancestors = "en",
}
m["tpj"] = {
"Tapieté",
3121063,
"gn",
"Latn",
}
m["tpk"] = {
"Tupinikin",
33924,
"tup-gua",
}
m["tpl"] = {
"Tlacoapa Me'phaa",
16115511,
"omq",
}
m["tpm"] = {
"Tampulma",
36590,
"nic-gnw",
}
m["tpn"] = {
"Tupinambá",
31528147,
"tup-gua",
"Latn",
}
m["tpo"] = {
"Tai Pao",
7675795,
"tai-nor",
}
m["tpp"] = {
"Pisaflores Tepehua",
56349,
"nai-ttn",
}
m["tpq"] = {
"Tukpa",
12953230,
"sit-las",
}
m["tpr"] = {
"Tuparí",
3542217,
"tup",
"Latn",
}
m["tpt"] = {
"Tlachichilco Tepehua",
56330,
"nai-ttn",
}
m["tpu"] = {
"Tampuan",
3514882,
"mkh-ban",
"Khmr",
}
m["tpv"] = {
"Tanapag",
3397371,
"poz-mic",
}
m["tpw"] = {
"ตูปีเก่า",
56944,
"tup-gua",
"Latn",
}
m["tpx"] = {
"Acatepec Me'phaa",
31157882,
"omq",
"Latn",
}
m["tpy"] = {
"Trumai",
12294279,
"qfa-iso",
}
m["tpz"] = {
"Tinputz",
3529205,
"poz-ocw",
}
m["tqb"] = {
"Tembé",
10322157,
"tup-gua",
"Latn",
}
m["tql"] = {
"Lehali",
3229119,
"poz-vnn",
"Latn",
}
m["tqm"] = {
"Turumsa",
7856508,
"paa-dot",
"Latn",
}
m["tqn"] = {
"Tenino",
15699255,
"nai-shp",
"Latn",
ancestors = "nai-spt",
}
m["tqo"] = {
"Toaripi",
7811403,
"paa-eel",
"Latn",
}
m["tqp"] = {
"Tomoip",
3531388,
"poz-ocw",
}
m["tqq"] = {
"Tunni",
3514343,
"cus-som",
}
m["tqr"] = {
"Torona",
36679,
"alv-tal",
}
m["tqt"] = {
"Western Totonac",
7116691,
"nai-ttn",
"Latn",
}
m["tqu"] = {
"Touo",
56750,
}
m["tqw"] = {
"Tonkawa",
2454881,
"qfa-iso",
"Latn",
}
m["tra"] = {
"Tirahi",
3812406,
"inc-koh",
}
m["trb"] = {
"Terebu",
7701797,
"poz-ocw",
}
m["trc"] = {
"Copala Triqui",
12953935,
"omq-tri",
"Latn",
}
m["trd"] = {
"Turi",
7854914,
"mun",
}
m["tre"] = {
"East Tarangan",
18609750,
"poz",
}
m["trf"] = {
"Trinidadian Creole English",
7842493,
"crp",
"Latn",
ancestors = "en",
}
m["trg"] = {
"Lishán Didán",
56473,
"sem-nna",
}
m["trh"] = {
"Turaka",
12953237,
"ngf-dag",
"Latn",
}
m["tri"] = {
"Trió",
56885,
"sai-tar",
"Latn",
}
m["trj"] = {
"Toram",
3441225,
"cdc-est",
}
m["trl"] = {
"Traveller Scottish",
3915671,
"qfa-mix",
"Latn",
ancestors = "rom, sco",
}
m["trm"] = {
"Tregami",
34081,
"nur-sou",
}
m["trn"] = {
"Trinitario",
3539279,
"awd",
}
m["tro"] = {
"Tarao",
3515603,
"tbq-kuk",
"Latn",
}
m["trp"] = {
"กอกบอรอก",
35947,
"tbq-bdg",
"Beng, Latn", -- Wikipedia lists two more scripts
}
m["trq"] = {
"San Martín Itunyoso Triqui",
12953934,
"omq-tri",
"Latn",
}
m["trr"] = {
"Taushiro",
1957508,
nil,
"Latn",
}
m["trs"] = {
"Chicahuaxtla Triqui",
3539587,
"omq-tri",
"Latn",
}
m["trt"] = {
"Tunggare",
615071,
"paa-egb",
"Latn",
}
m["tru"] = {
"Turoyo",
34040,
"sem-cna",
"Syrc, Latn",
translit = {
Syrc = "tru-translit",
},
strip_diacritics = {
Syrc = "Syrc-stripdiacritics",
},
}
m["trv"] = {
"Taroko",
716686,
"map-ata",
"Latn",
}
m["trw"] = {
"Torwali",
2665246,
"inc-koh",
"ur-Arab",
}
m["trx"] = {
"Tringgus",
7842365,
"day",
}
m["try"] = {
"ตุรุง",
7856514,
"tai-swe",
"as-Beng",
}
m["trz"] = {
"Torá",
7827518,
"sai-cpc",
}
m["tsa"] = {
"Tsaangi",
36675,
"bnt-nze",
}
m["tsb"] = {
"Tsamai",
2371358,
"cus-eas",
}
m["tsc"] = {
"Tswa",
2085051,
"bnt-tsr",
}
m["tsd"] = {
"Tsakonian",
220607,
"grk",
"Grek",
ancestors = "grc-dor",
translit = "el-translit",
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["tse"] = {
"Tunisian Sign Language",
7853191,
"sgn",
}
m["tsf"] = {
"Southwestern Tamang",
12953176,
"sit-tam",
}
m["tsg"] = {
"ซูก",
34142,
"phi",
"Latn, Arab",
}
m["tsh"] = {
"Tsuvan",
3502326,
"cdc-cbm",
}
m["tsi"] = {
"Tsimshian",
20085721,
"nai-tsi",
"Latn",
}
m["tsj"] = {
"Tshangla",
36840,
"sit-tsk",
"Tibt, Latn, Deva",
translit = {
Deva = "Deva-translit",
},
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["tsl"] = {
"Ts'ün-Lao",
3446675,
"tai",
}
m["tsm"] = {
"Turkish Sign Language",
36885,
"sgn",
}
m["tsp"] = {
"Northern Toussian",
11155635,
"alv-sav",
}
m["tsq"] = {
"มือไทย",
7709156,
"sgn",
"Sgnw",
}
m["tsr"] = {
"Akei",
2828964,
"poz-vnn",
"Latn",
}
m["tss"] = {
"Taiwan Sign Language",
34019,
"sgn-jsl",
}
m["tsu"] = {
"โจว",
716681,
"map",
"Latn",
}
m["tsv"] = {
"Tsogo",
36674,
"bnt-tso",
}
m["tsw"] = {
"Tsishingini",
13123571,
"nic-kam",
}
m["tsx"] = {
"Mubami",
6930815,
"paa-ani",
"Latn",
}
m["tsy"] = {
"Tebul Sign Language",
7692090,
"sgn",
}
m["tta"] = {
"Tutelo",
2311602,
"sio-ohv",
"Latn",
}
m["ttb"] = {
"Gaa",
3438361,
"nic-dak",
"Latn",
}
m["ttc"] = {
"Tektiteko",
36686,
"myn",
"Latn",
}
m["ttd"] = {
"Tauade",
7688634,
"qfa-dis", -- Papuan; isolate per Glottolog; Glottolog says "A Goilalan family uniting Kunimaipan, Tauade and Fuyug
-- is often posited based on the lexicostatistical figures reported in Tom E. Dutton 1975: 631-632" but
-- goes on to say the data "is clearly insufficient, as the lexical links so far proposed are few and
-- show irregular one-consonant correspondences".
"Latn",
}
m["tte"] = {
"Bwanabwana",
5003667,
"poz-ocw",
"Latn",
}
m["ttf"] = {
"Tuotomb",
7853459,
"nic-mbw",
"Latn",
}
m["ttg"] = {
"Tutong",
3507990,
"poz-swa",
"Latn",
}
m["tth"] = {
"Upper Ta'oih",
3512660,
"mkh-kat",
}
m["tti"] = {
"Tobati",
7811556,
"poz-ocw",
"Latn",
}
m["ttj"] = {
"Tooro",
7824218,
"bnt-nyg",
"Latn",
}
m["ttk"] = {
"Totoro",
3532756,
"sai-bar",
"Latn",
}
m["ttl"] = {
"Totela",
10962316,
"bnt-bot",
"Latn",
}
m["ttm"] = {
"Northern Tutchone",
20822,
"ath-nor",
"Latn",
}
m["ttn"] = {
"Towei",
7829606,
"paa-pau",
}
m["tto"] = {
"Lower Ta'oih",
25559539,
"mkh-kat",
}
m["ttp"] = {
"Tombelala",
6799663,
"poz-kal",
}
m["ttr"] = {
"Tera",
56267,
"cdc-cbm",
}
m["tts"] = {
"อีสาน",
33417,
"tai-swe",
"Thai", -- also Tai Noi/Lao Buhan script
--sort_key = "Thai-sortkey",
}
m["ttt"] = {
"Tat",
56489,
"ira-swi",
"Cyrl, Latn, Armn, fa-Arab",
-- Armn translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission)
ancestors = "fa",
}
m["ttu"] = {
"Torau",
3532208,
"poz-ocw",
}
m["ttv"] = {
"Titan",
3445811,
"poz-aay",
"Latn",
}
m["ttw"] = {
"Long Wat",
7856961,
"poz-swa",
}
m["tty"] = {
"Sikaritai",
7513600,
"paa-lkp",
"Latn",
}
m["ttz"] = {
"Tsum",
12953223,
"sit-kyk",
}
m["tua"] = {
"Wiarumus",
7998045,
"paa-tor",
"Latn",
}
m["tub"] = {
"Tübatulabal",
56704,
"azc",
"Latn",
}
m["tuc"] = {
"Mutu",
3331003,
"poz-ocw",
"Latn",
}
m["tud"] = {
"Tuxá",
7857217,
}
m["tue"] = {
"Tuyuca",
2520538,
"sai-tuc",
"Latn",
}
m["tuf"] = {
"Central Tunebo",
12953942,
"cba",
"Latn",
}
m["tug"] = {
"Tunia",
863721,
"alv-bua",
}
m["tuh"] = {
"Taulil",
3516141,
}
m["tui"] = {
"Tupuri",
36646,
"alv-mbm",
"Latn",
}
m["tuj"] = {
"Tugutil",
12953228,
"paa-nha",
"Latn",
}
m["tul"] = {
"Tula",
3914907,
"alv-wjk",
}
m["tum"] = {
"Tumbuka",
34138,
"bnt-nys",
"Latn",
}
m["tun"] = {
"Tunica",
56619,
"qfa-iso",
"Latn",
}
m["tuo"] = {
"Tucano",
3541834,
"sai-tuc",
"Latn",
}
m["tuq"] = {
"Tedaga",
36639,
"ssa-sah",
"Latn",
}
m["tus"] = {
"Tuscarora",
36944,
"iro-nor",
"Latn",
}
m["tuu"] = {
"Tututni",
20627,
"ath-pco",
"Latn",
}
m["tuv"] = {
"Turkana",
36958,
"sdv-ttu",
"Latn",
}
m["tux"] = {
"Tuxináwa",
7857204,
"sai-pan",
"Latn",
}
m["tuy"] = {
"Tugen",
3541935,
"sdv-nma",
}
m["tuz"] = {
"Turka",
36643,
"nic-gur",
"Latn",
}
m["tva"] = {
"Vaghua",
3553248,
"poz-ocw",
"Latn",
}
m["tvd"] = {
"Tsuvadi",
3914936,
"nic-kam",
}
m["tve"] = {
"Te'un",
7690709,
"poz-cet",
"Latn",
}
m["tvk"] = {
"Southeast Ambrym",
252411,
"poz-vnc",
"Latn",
}
m["tvl"] = {
"ตูวาลู",
34055,
"poz-pnp",
"Latn",
}
m["tvm"] = {
"Tela-Masbuar",
7695666,
"poz-tim",
}
m["tvn"] = {
"Tavoyan",
7689158,
"tbq-brm",
"Mymr",
ancestors = "obr",
}
m["tvo"] = {
"ตีโดเร",
3528199,
"paa-nha",
"Latn, Arab",
}
m["tvs"] = {
"Taveta",
15632387,
"bnt-par",
"Latn",
}
m["tvt"] = {
"Tutsa Naga",
7856987,
"sit-tno",
}
m["tvu"] = {
"Tunen",
36632,
"nic-mbw",
}
m["tvw"] = {
"Sedoa",
7445362,
"poz-kal",
}
m["tvx"] = {
"Taivoan",
1975271,
"map",
"Latn",
}
m["tvy"] = {
"Timor Pidgin",
4904029,
"crp",
ancestors = "pt",
}
m["twa"] = {
"Twana",
7857412,
"sal",
"Latn",
}
m["twb"] = {
"Western Tawbuid",
12953912,
"phi",
}
m["twc"] = {
"Teshenawa",
3436597,
"phi",
}
m["twe"] = {
"Teiwa",
3519302,
"paa-tap",
"Latn",
}
m["twf"] = {
"เทาส์",
7684320,
"nai-kta",
"Latn",
}
m["twg"] = {
"Tereweng",
12953200,
"paa-tap",
"Latn",
}
m["twh"] = {
"ไทขาว",
7675751,
"tai-swe",
"Tavt",
--translit = "Tavt-translit",
sort_key = {
from = {"[꪿ꫀ꫁ꫂ]", "([ꪵꪶꪹꪻꪼ])([ꪀ-ꪯ])"},
to = {"", "%2%1"}
},
}
m["twm"] = {
"Tawang Monpa",
36586,
"sit-ebo",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["twn"] = {
"Twendi",
7857682,
"nic-mmb",
}
m["two"] = {
"Tswapong",
3446241,
"bnt-sts",
}
m["twp"] = {
"Ere",
3056045,
"poz-aay",
"Latn",
}
m["twq"] = {
"Tasawaq",
36564,
"son",
}
m["twr"] = {
"Southwestern Tarahumara",
12953909,
"azc-trc",
"Latn",
}
m["twt"] = {
"Turiwára",
3542307,
"tup-gua",
"Latn",
}
m["twu"] = {
"Termanu",
7702572,
"poz-tim",
"Latn",
}
m["tww"] = {
"Tuwari",
7857159,
"paa-wal",
"Latn",
}
m["twy"] = {
"Tawoyan",
3513542,
"poz-bre",
"Latn",
}
m["txa"] = {
"Tombonuo",
7818692,
"poz-san",
"Latn",
}
m["txb"] = {
"โทแคเรียนบี",
3199353,
"ine-toc",
"Latn",
standard_chars = "AaÄäĀāCcEeIiKkLlMmṂṃNnṄṅÑñOoPpRrSsŚśṢṣTtUuWwYy" .. c.punc,
}
m["txc"] = {
"Tsetsaut",
20829,
"ath-nor",
"Latn",
}
m["txe"] = {
"Totoli",
7828387,
"poz-tot",
"Latn",
}
m["txg"] = {
"ตังกุต",
2727930,
"ero",
"Tang",
-- Tang translit in [[Module:scripts/data]]
}
m["txh"] = {
"Thracian",
36793,
"ine",
"Latn, Polyt",
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["txi"] = {
"Ikpeng",
9344891,
"sai-pek",
"Latn",
}
m["txj"] = {
"Tarjumo",
24906088,
"ssa-sah",
"Latn, Arab",
}
m["txm"] = {
"Tomini",
7818911,
"poz",
"Latn",
}
m["txn"] = {
"West Tarangan",
3515594,
"poz",
"Latn",
}
m["txo"] = {
"Toto",
36709,
"sit-dhi",
"Beng, Toto"
}
m["txq"] = {
"Tii",
7801784,
"poz-tim",
}
m["txr"] = {
"Tartessian",
36795,
"qfa-unc", -- extinct, no consensus on classification
}
m["txs"] = {
"Tonsea",
3531659,
"phi",
"Latn",
}
m["txt"] = {
"Citak",
3447279,
"ngf-ask",
"Latn",
}
m["txu"] = {
"Kayapó",
3101212,
"sai-nje",
"Latn",
}
m["txx"] = {
"Tatana",
18643518,
"poz-san",
"Latn",
}
m["tya"] = {
"Tauya",
7688978,
"ngf-rai",
"Latn",
}
m["tye"] = {
"Kyenga",
3913304,
"dmn-bbu",
"Latn",
}
m["tyh"] = {
"O'du",
3347428,
"mkh",
}
m["tyi"] = {
"Teke-Tsaayi",
33123613,
"bnt-nze",
}
m["tyj"] = {
"ไทแมน", -- Thai word for this
7675746,
"tai-nor", -- Chamberlain (1991), but Pittayaporn (2009) suggests tai-swe
"Latn, Tayo", -- Vietnam
}
m["tyl"] = {
"Thu Lao",
12953921,
"tai-cen",
}
m["tyn"] = {
"Kombai",
6428241,
"ngf-gaw",
"Latn",
}
m["typ"] = {
"Kuku-Thaypan",
3915693,
"aus-pmn",
"Latn",
}
m["tyr"] = {
"ไทแดง",
3915207,
"tai-swe",
"Tavt",
}
m["tys"] = {
"ซาปา",
3446668,
"tai-sap",
"Latn",
}
m["tyt"] = {
"Tày Tac",
7862029,
"tai-swe",
}
m["tyu"] = {
"Kua",
3832933,
"khi-kal",
}
m["tyv"] = {
"ตูวา",
34119,
"trk-ssb",
"Cyrl",
translit = "tyv-translit",
override_translit = true,
sort_key = "tyv-sortkey",
}
m["tyx"] = {
"Teke-Tyee",
36634,
"bnt-nze",
"Latn",
}
m["tyz"] = {
"ตั่ย", -- This does not mean its family "Tai" languages.
2511476,
"tai-tay",
"Latn, Hani",
sort_key = {
Hani = "Hani-sortkey"
},
}
m["tza"] = {
"Tanzanian Sign Language",
7684177,
"sgn",
}
m["tzh"] = {
"Tzeltal",
36808,
"myn",
"Latn",
}
m["tzj"] = {
"Tz'utujil",
36941,
"myn",
"Latn",
}
m["tzl"] = {
"Talossan",
1063911,
"art",
"Latn",
type = "appendix-constructed",
sort_key = "tzl-sortkey",
}
m["tzm"] = {
"Central Atlas Tamazight",
49741,
"ber",
"Tfng, Arab, Latn",
translit = {
Tfng = "Tfng-translit",
},
}
m["tzn"] = {
"Tugun",
12953225,
"poz-tim",
"Latn",
}
m["tzo"] = {
"โซตซิล",
36809,
"myn",
"Latn",
}
m["tzx"] = {
"Tabriak",
56872,
"paa-lsp",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
asseyub07jy456q835v2ykghmnlaha4
มอดูล:languages/data/3/s
828
36368
5720765
5719153
2026-04-21T07:01:14Z
OctraBot
3198
Bot: automated text replacement (-standardChars +standard_chars)
5720765
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["saa"] = {
"Saba",
3914885,
"cdc-est",
"Latn",
}
m["sab"] = {
"Buglere",
3368506,
"cba",
"Latn",
}
m["sac"] = {
"Fox",
12714767,
"alg-sfk",
"Latn",
}
m["sad"] = {
"Sandawe",
34016,
"qfa-iso",
"Latn",
}
m["sae"] = {
"Sabanê",
3460478,
"sai-nmk",
"Latn",
}
m["saf"] = {
"Safaliba",
36432,
"nic-mre",
"Latn",
}
m["sah"] = {
"ซาคา",
34299,
"trk-nsb",
"Cyrl",
translit = "sah-translit",
override_translit = true,
}
m["saj"] = {
"Sahu",
7399757,
"paa-nha",
"Latn",
}
m["sak"] = {
"Sake",
36425,
"bnt-kel",
"Latn",
}
m["sam"] = {
"Samaritan Aramaic",
56612,
"sem-arw",
"Samr",
translit = "Samr-translit",
-- Samr strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["sao"] = {
"Sause",
4409155,
"paa-tkw",
"Latn",
}
m["saq"] = {
"Samburu",
56536,
"sdv-lma",
}
m["sar"] = {
"Saraveca",
3450556,
"awd",
"Latn",
}
m["sas"] = {
"Sasak",
1294047,
"poz-bss",
"Latn, Bali, Java",
}
m["sat"] = {
"สันถาลี",
33965,
"mun",
"Olck",
translit = "Olck-translit",
override_translit = true,
}
m["sau"] = {
"Saleman",
7404262,
"poz-cet",
}
m["sav"] = {
"Saafi-Saafi",
36308,
"alv-cng",
"Arab, Latn",
}
m["saw"] = {
"Sawi",
677064,
"ngf-gaw",
"Latn",
}
m["sax"] = {
"Sa",
3460352,
"poz-vnn",
"Latn",
}
m["say"] = {
"Saya",
3914431,
"cdc-wst",
"Latn",
}
m["saz"] = {
"Saurashtra",
13292,
"inc-wes",
"Saur, Latn, Taml, Deva",
translit = {
Taml = "Taml-translit",
Deva = "Deva-translit",
},
ancestors = "inc-ogu",
}
m["sba"] = {
"Ngambay",
2372207,
"csu-sar",
"Latn",
}
m["sbb"] = {
"Simbo",
3484101,
"poz-ocw",
}
m["sbc"] = {
"Gele'",
3194847,
"poz-aay",
"Latn",
}
m["sbd"] = {
"Southern Samo",
33122730,
"dmn-sam",
"Latn",
}
m["sbe"] = {
"Saliba (New Guinea)",
3469737,
"poz-ocw",
}
m["sbf"] = {
"Shabo",
36342,
"ssa",
"Latn",
}
m["sbg"] = {
"Seget",
7446237,
"paa-wbh",
"Latn",
}
m["sbh"] = {
"Sori-Harengan",
36515,
"poz-aay",
"Latn",
}
m["sbi"] = {
"Seti",
7456682,
"paa-tor",
"Latn",
}
m["sbj"] = {
"Surbakhal",
759995,
}
m["sbk"] = {
"Safwa",
4121160,
"bnt-mby",
"Latn",
}
m["sbl"] = {
"Botolan Sambal",
4095195,
"phi",
"Latn",
}
m["sbm"] = {
"Sagala",
11732610,
"bnt-ruv",
"Latn",
}
m["sbn"] = {
"Sindhi Bhil",
25559289,
"inc-snd",
"Arab, Deva, Sind, Guru",
translit = {
Deva = "Deva-translit",
Sind = "Sind-translit",
Guru = "Guru-translit",
},
ancestors = "sd",
}
m["sbo"] = {
"Sabüm",
7396535,
"mkh-asl",
}
m["sbp"] = {
"Sangu (Tanzania)",
7418149,
"bnt-bki",
"Latn",
}
m["sbq"] = {
"Sileibi",
7514337,
"ngf-nso",
"Latn",
}
m["sbr"] = {
"Sembakung Murut",
7449148,
"poz-san",
}
m["sbs"] = {
"Subiya",
6442073,
"bnt-bot",
"Latn",
}
m["sbt"] = {
"Kimki",
6410160,
"paa-pau",
}
m["sbu"] = {
"Stod Bhoti",
15622700,
"sit-las",
}
m["sbv"] = {
"Sabine",
65455885,
"itc-sbl",
"Latn",
display_text = s["itc-Latn-displaytext"],
strip_diacritics = s["itc-Latn-stripdiacritics"],
sort_key = s["itc-Latn-sortkey"],
}
m["sbw"] = {
"Simba",
36430,
"bnt-tso",
"Latn",
}
m["sbx"] = {
"Seberuang",
12473470,
"poz-mly",
}
m["sby"] = {
"Soli",
7557754,
"bnt-bot",
"Latn",
}
m["sbz"] = {
"Sara Kaba",
25559318,
"csu-kab",
"Latn",
}
m["scb"] = {
"Chut",
2967709,
"mkh-vie",
"Latn",
}
m["sce"] = {
"ตงเซียง",
32947,
"xgn-shr",
"Arab, Latn",
}
m["scf"] = {
"San Miguel Creole French",
12953094,
"crp",
"Latn",
ancestors = "gcf",
sort_key = s["roa-oil-sortkey"],
}
m["scg"] = {
"Sanggau",
12473466,
"day",
}
m["sch"] = {
"Sakachep",
37054,
"tbq-kuk",
}
m["sci"] = {
"Sri Lankan Creole Malay",
1089151,
"crp",
"Latn",
ancestors = "ms",
}
m["sck"] = {
"Sadri",
765922,
"inc-bih",
"Deva, Kthi",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
},
}
m["scl"] = {
"Shina",
1353320,
"inc-shn",
"ur-Arab, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["scn"] = {
"ซิซิลี",
33973,
"roa-itr",
"Latn",
}
m["sco"] = {
"สกอต",
14549,
"gmw-ang",
"Latn",
ancestors = "gmw-msc",
}
m["scp"] = {
"Yolmo",
22662107,
"sit-kyk",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["scq"] = {
"สะโอจ",
6583617,
"mkh-pea",
}
m["scs"] = {
"North Slavey",
20628,
"den",
"Latn",
}
m["scu"] = {
"Shumcho",
22077739,
"sit-kin",
}
m["scv"] = {
"Sheni",
11015820,
"nic-jer",
"Latn",
ancestors = "zir",
}
m["scw"] = {
"Sha",
3438816,
"cdc-wst",
"Latn",
}
m["scx"] = {
"Sicel",
36667,
"itc",
"Polyt",
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["scz"] = {
"Shetland",
3069598,
"qfa-mix",
"Latn",
ancestors = "nrn, gmw-msc",
standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZzØøÖüÜü0123456789" .. c.punc,
}
m["sda"] = {
"Toraja-Sa'dan",
36673,
"poz-ssw",
"Latn",
}
m["sdb"] = {
"Shabak",
3289596,
"ira-zgr",
ancestors = "hac",
}
m["sdc"] = {
"ซัสซารี",
845441,
"roa-itr",
"Latn",
}
m["sde"] = {
"Surubu",
3913336,
"nic-kau",
"Latn",
}
m["sdf"] = {
"Sarli",
7424256,
"ira-zgr",
ancestors = "hac",
}
m["sdg"] = {
"Savi",
3474654,
"inc-dng",
}
m["sdh"] = {
"Southern Kurdish",
1496597,
"ku",
"ku-Arab",
translit = "sdh-translit",
strip_diacritics = {remove_diacritics = c.kasra .. c.sukun},
}
m["sdj"] = {
"Suundi",
7650407,
"bnt-kng",
"Latn",
}
m["sdk"] = {
"Sos Kundi",
7563811,
"paa-ndu",
"Latn",
}
m["sdl"] = {
"Saudi Arabian Sign Language",
3504160,
"sgn",
}
m["sdm"] = {
"Semandang",
7449012,
"day",
}
m["sdn"] = {
"Gallurese",
612220,
"roa-itr",
"Latn",
ancestors = "co",
}
m["sdo"] = {
"Bukar-Sadung Bidayuh",
2927799,
"day",
"Latn",
}
m["sdp"] = {
"Sherdukpen",
7494785,
"sit-khm",
}
m["sdr"] = {
"Oraon Sadri",
12953860,
"inc-bih",
}
m["sds"] = {
"เบอร์เบอร์แบบตูนีเซีย",
5329732,
"ber",
}
m["sdu"] = {
"Sarudu",
7424700,
"poz-cet",
}
m["sdx"] = {
"Sibu Melanau",
18642842,
"poz-bnn",
}
m["sea"] = {
"เซอไม",
3135426,
"mkh-asl",
"Latn",
}
-- seb is a duplicate code of spp
m["sec"] = {
"Sechelt",
7442898,
"sal",
"Latn",
}
m["sed"] = {
"Sedang",
56448,
"mkh-nbn",
"Latn",
}
m["see"] = {
"Seneca",
1185133,
"iro-nor",
"Latn",
}
m["sef"] = {
"Cebaara Senoufo",
10975121,
"alv-snr",
"Latn",
}
m["seg"] = {
"Segeju",
17584599,
"bnt-mij",
"Latn",
}
m["seh"] = {
"Sena",
2964008,
"bnt-sna",
"Latn",
}
m["sei"] = {
"Seri",
36583,
"qfa-iso",
"Latn",
}
m["sej"] = {
"Sene",
7450252,
"ngf-huo",
"Latn",
}
m["sek"] = {
"Sekani",
28562,
"ath-nor",
"Latn",
}
m["sen"] = {
"Nanerigé Sénoufo",
36002,
"alv-sma",
}
m["seo"] = {
"Suarmin",
7630513,
"qfa-dis", -- Papuan; isolate or unclassified in Glottolog; Sepik language in Foley (2018)
"Latn",
}
m["sep"] = {
"Sìcìté Sénoufo",
56787,
"alv-sma",
}
m["seq"] = {
"Senara Sénoufo",
35210,
"alv-snr",
}
m["ser"] = {
"Serrano",
3479942,
"azc-tak",
"Latn",
}
m["ses"] = {
"Koyraboro Senni",
35655,
"son",
"Latn",
}
m["set"] = {
"Sentani",
3441672,
"paa-sen",
"Latn",
}
m["seu"] = {
"Serui-Laut",
7455503,
"poz-hce",
"Latn",
}
m["sev"] = {
"Nyarafolo Senoufo",
36306,
"alv-snr",
}
m["sew"] = {
"Sewa Bay",
7458126,
"poz-ocw",
}
m["sey"] = {
"Secoya",
3477218,
"sai-tuc",
"Latn",
}
m["sez"] = {
"Senthang Chin",
7451223,
"tbq-kuk",
}
m["sfb"] = {
"French Belgian Sign Language",
3217332,
"sgn",
}
m["sfe"] = {
"Eastern Subanun",
63311321,
"phi",
"Latn",
}
m["sfm"] = {
"Small Flowery Miao",
7542773,
"hmn",
}
m["sfs"] = {
"South African Sign Language",
3322093,
"sgn",
}
m["sfw"] = {
"Sehwi",
36593,
"alv-ctn",
"Latn",
}
m["sga"] = {
"ไอริชเก่า",
35308,
"cel-gae",
"Latn, Ogam",
strip_diacritics = {remove_diacritics = c.dotabove .. c.diaer .. "·"},
sort_key = "sga-sortkey",
standard_chars = "AaÁáBbCcDdEeÉéFfGgHhIiÍíLlMmNnOoÓóPpRrSsTtUuÚú0123456789ᚁᚂᚃᚄᚅᚆᚇᚈᚉᚊᚋᚌᚍᚎᚏᚐᚑᚒᚓᚔ" .. c.punc,
}
m["sgb"] = {
"Mag-Anchi Ayta",
4356243,
"phi",
"Latn",
}
m["sgc"] = {
"Kipsigis",
56339,
"sdv-nma",
}
m["sgd"] = {
"ซูรีเกาโนน",
34140,
"phi",
"Latn",
}
m["sge"] = {
"Segai",
7446180,
}
m["sgg"] = {
"Swiss-German Sign Language",
35150,
"sgn",
}
m["sgh"] = {
"Shughni",
34053,
"ira-shr",
"Latn, Cyrl",
translit = "sgh-translit",
override_translit = true,
}
m["sgi"] = {
"Suga",
36475,
"nic-mmb",
"Latn",
}
m["sgk"] = {
"Sangkong",
2945610,
"tbq-bis",
}
m["sgm"] = {
"Singa",
7522797,
"bnt-lok",
"Latn",
}
m["sgp"] = {
"Singpho",
7524158,
"sit-jnp",
"Latn",
}
m["sgr"] = {
"Sangisari",
3394363,
"ira-kms",
"Arab",
}
m["sgs"] = {
"Samogitian",
213434,
"bat-eas",
"Latn",
wikimedia_codes = "bat-smg",
ancestors = "olt",
display_text = "lt-common",
strip_diacritics = "lt-common",
sort_key = "lt-common",
}
m["sgt"] = {
"Brokpake",
56603,
"sit-tib",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["sgu"] = {
"Salas",
7403694,
"poz-cma",
}
m["sgw"] = {
"Sebat Bet Gurage",
2707343,
"sem-eth",
"Ethi",
}
m["sgx"] = {
"Sierra Leone Sign Language",
7511448,
"sgn",
}
m["sgy"] = {
"Sanglechi",
3472220,
"ira-sgi",
}
m["sgz"] = {
"Sursurunga",
36511,
"poz-ocw",
"Latn",
}
m["sha"] = {
"Shall-Zwall",
3915355,
"nic-beo",
}
m["shb"] = {
"Ninam",
3436586,
"sai-ynm",
"Latn",
}
m["shc"] = {
"Sonde",
7560881,
"bnt-pen",
"Latn",
}
m["shd"] = {
"Kundal Shahi",
6444265,
"inc-shn",
"Arab",
}
m["she"] = {
"Sheko",
3183355,
"omv-diz",
}
m["shg"] = {
"Shua",
3501092,
"khi-kal",
"Latn",
}
m["shh"] = {
"Shoshone",
33811,
"azc-num",
"Latn",
}
m["shi"] = {
"Tashelhit",
34152,
"ber",
"Latn, Arab, Tfng, Hebr",
ancestors = "shi-med",
translit = {
Tfng = "Tfng-translit",
},
strip_diacritics = {
Arab = "ar-stripdiacritics",
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["shj"] = {
"Shatt",
56344,
"sdv-daj",
}
m["shk"] = {
"Shilluk",
36486,
"sdv-lon",
"Latn",
}
m["shl"] = {
"Shendu",
22074616,
"tbq-kuk",
}
m["shm"] = {
"Shahrudi",
7462280,
"xme-ttc",
"fa-Arab, Latn",
ancestors = "xme-ttc-cen",
}
m["shn"] = {
"ไทใหญ่",
56482,
"tai-swe",
"Mymr",
translit = "shn-translit",
sort_key = {
from = {"[ၢႃ]", "ဵ", "ႅ", "ႇ", "ႈ", "း", "ႉ", "ႊ"},
to = {"ာ", "ေ", "ႄ", "႒", "႓", "႔", "႕", "႖"}
},
}
m["sho"] = {
"Shanga",
3913931,
"dmn-bbu",
"Latn",
}
m["shp"] = {
"Shipibo-Conibo",
2671988,
"sai-pan",
"Latn",
}
m["shq"] = {
"Sala",
10961665,
"bnt-bot",
"Latn",
}
m["shr"] = {
"Shi",
3481999,
"bnt-shh",
"Latn",
}
m["shs"] = {
"Shuswap",
3482685,
"sal",
"Latn",
}
m["sht"] = {
"Shasta",
56396,
"nai-shs",
"Latn",
}
m["shu"] = {
"Chadian Arabic",
56497,
"sem-arb",
"Arab",
strip_diacritics = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {u(0x0671)},
to = {u(0x0627)}
},
}
m["shv"] = {
"Shehri",
33445,
"sem-sar",
"Arab, Latn",
}
m["shw"] = {
"Shwai",
36527,
"alv-hei",
}
m["shx"] = {
"She",
2605689,
"hmn",
}
m["shy"] = {
"Tachawit",
33274,
"ber",
"Tfng, Arab, Latn",
translit = "Tfng-translit",
}
m["shz"] = {
"Syenara Senoufo",
36316,
"alv-snr",
}
m["sia"] = {
"Akkala Sami",
35241,
"smi",
"Cyrl, Latn",
translit = "sia-translit",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = "'ˈ"},
}
m["sib"] = {
"Sebop",
7442799,
"poz-swa",
"Latn",
}
m["sid"] = {
"ซีดามา",
33786,
"cus-hec",
"Latn, Ethi",
}
m["sie"] = {
"Simaa",
7517329,
"bnt-kav",
"Latn",
}
m["sif"] = {
"Siamou",
36252,
}
m["sig"] = {
"Paasaal",
36426,
"nic-sis",
"Latn",
}
m["sih"] = {
"Sîshëë",
8072753,
"poz-cln",
"Latn",
}
m["sii"] = {
"Shom Peng",
1039346,
"aav",
}
m["sij"] = {
"Numbami",
3346277,
"poz-ocw",
"Latn",
}
m["sik"] = {
"Sikiana",
3443734,
"sai-prk",
"Latn",
}
m["sil"] = {
"Tumulung Sisaala",
25383006,
"nic-sis",
"Latn",
}
m["sim"] = {
"Seim",
7446815,
"paa-spk",
"Latn",
}
m["sip"] = {
"สิกขิม",
35285,
"sit-tib",
"Tibt",
ancestors = "xct",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["siq"] = {
"Sonia",
7561770,
"ngf-bos",
"Latn",
}
m["sir"] = {
"Siri",
3438729,
"cdc-wst",
"Latn",
}
m["sis"] = {
"Siuslaw",
2315424,
}
m["siu"] = {
"Sinagen",
7521655,
"paa-tor",
"Latn",
}
m["siv"] = {
"Sumariup",
7636966,
"paa-spk",
"Latn",
}
m["siw"] = {
"Siwai",
7532519,
"paa-sbo",
"Latn",
}
m["six"] = {
"Sumau",
7637021,
"ngf-pek",
"Latn",
}
m["siy"] = {
"Sivandi",
13269,
"xme",
"fa-Arab, Latn",
ancestors = "xme-mid",
}
m["siz"] = {
"Siwi",
36814,
"ber",
"Tfng, Arab, Latn",
}
m["sja"] = {
"Epena",
3055682,
"sai-chc",
"Latn",
}
m["sjb"] = {
"Sajau Basap",
4684353,
"poz-bnn",
}
m["sjc"] = {
"Shaojiang Min",
3431451,
"zhx-inm",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["sjd"] = {
"ซามีแบบกิลดิน",
33656,
"smi",
"Cyrl",
translit = "sjd-translit",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = "'ˈ"},
}
m["sje"] = {
"ซามีแบบปีเต",
56314,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = c.macron .. "'ˈ"},
sort_key = "sje-sortkey",
}
m["sjg"] = {
"Assangori",
3502255,
"sdv-tmn",
}
m["sjk"] = {
"Kemi Sami",
35871,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = "'ˈ"},
}
m["sjl"] = {
"Miji",
6845470,
"sit-hrs",
}
m["sjm"] = {
"Mapun",
3287253,
"poz-sbj",
"Latn",
}
m["sjn"] = {
"ซินดาริน",
56437,
"art",
"Latn, Teng",
type = "appendix-constructed",
}
m["sjo"] = {
"Xibe",
13223,
"tuw-jrc",
"sjo-Mong",
ancestors = "mnc",
}
m["sjp"] = {
"Surjapuri",
7645351,
"inc-krd",
"Deva, as-Beng, Kthi",
translit = {
Deva = "Deva-translit",
["as-Beng"] = "Beng-translit",
Kthi = "Kthi-translit",
},
}
m["sjr"] = {
"Siar-Lak",
3482907,
"poz-ocw",
}
m["sjs"] = {
"Senhaja de Srair",
56744,
"ber",
"Latn, Tfng, Arab",
strip_diacritics = {
Arab = "ar-stripdiacritics",
},
translit = {
Tfng = "Tfng-translit",
}
}
m["sjt"] = {
"Ter Sami",
36656,
"smi",
"Cyrl, Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = "'ˈ"},
translit = "sjt-translit",
}
m["sju"] = {
"Ume Sami",
56415,
"smi",
"Latn",
strip_diacritics = {remove_diacritics = c.macron .. "'ˈ"},
display_text = {
from = {"'"},
to = {"ˈ"}
},
sort_key = "sju-sortkey",
}
m["sjw"] = {
"Shawnee",
2669206,
"alg",
"Latn",
}
m["ska"] = {
"Skagit",
25559652,
"sal",
"Latn",
}
m["skb"] = {
"แสก",
36437,
"tai-nor",
"Thai",
--sort_key = "Thai-sortkey",
}
m["skc"] = {
"Ma Manda",
6720783,
"ngf-fin",
"Latn",
}
m["skd"] = {
"Southern Sierra Miwok",
3492334,
"nai-utn",
"Latn",
}
m["ske"] = {
"Ske",
7534244,
"poz-vnn",
"Latn",
}
m["skf"] = {
"Mekéns",
3304806,
"tup",
"Latn",
}
m["skh"] = {
"Sikule",
3121081,
"poz-nws",
}
m["ski"] = {
"Sika",
33960,
"poz-cet",
"Latn",
}
m["skj"] = { -- compare 'ths'
"Seke",
30226846,
"sit-tam",
}
m["skk"] = {
"Sok",
12953887,
"mkh-ban",
}
m["skm"] = {
"Sakam",
6448517,
"ngf-fin",
"Latn",
}
m["skn"] = {
"Kolibugan Subanon",
18755617,
"phi",
"Latn",
}
m["sko"] = {
"Seko Tengah",
15613270,
"poz",
}
m["skp"] = {
"Sekapan",
7447132,
"poz-bnn",
}
m["skq"] = {
"Sininkere",
3914896,
"dmn-man",
"Latn",
}
m["skr"] = {
"Saraiki",
33902,
"inc-pan",
"pa-Arab, Mult, Deva",
ancestors = "lah",
strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna},
translit = {
["pa-Arab"] = "pa-Arab-translit",
Deva = "Deva-translit",
Mult = "Mult-translit",
},
}
m["sks"] = {
"ไมยา",
12952760,
"ngf-kau",
"Latn",
}
m["skt"] = {
"Sakata",
36691,
"bnt-bnm",
"Latn",
}
m["sku"] = {
"Sakao",
3298421,
"poz-vnn",
"Latn",
}
m["skv"] = {
"Skou",
3915200,
"paa-msk",
"Latn",
}
m["skw"] = {
"Skepi Creole Dutch",
2522153,
"crp",
"Latn",
ancestors = "nl",
}
m["skx"] = {
"Seko Padang",
15613282,
"poz-ssw",
"Latn",
}
m["sky"] = {
"ซีกายานา",
7439242,
"poz-pnp",
"Latn",
}
m["skz"] = {
"Sekar",
7447136,
"poz-cet",
}
m["slc"] = {
"Saliba (Colombia)",
3441097,
nil,
"Latn",
}
m["sld"] = {
"Sisaala",
11020264,
"nic-sis",
"Latn",
}
m["sle"] = {
"Sholaga",
7500203,
"dra-kan",
"Knda",
-- Knda translit in [[Module:scripts/data]]
}
m["slf"] = {
"Swiss-Italian Sign Language",
12953479,
"sgn",
}
m["slg"] = {
"Selungai Murut",
7448844,
"poz-san",
}
m["slh"] = {
"Southern Puget Sound Salish",
12642471,
"sal",
"Latn",
}
-- "sli" "Silesian German" IS SUBSUMED INTO "gmw-ecg" "East Central German"
m["slj"] = {
"Salumá",
7406296,
"sai-prk",
"Latn",
}
m["sll"] = {
"Salt-Yui",
7405785,
"ngf-chw",
"Latn",
}
m["slm"] = {
"ซามาแบบปางูตารัน",
3362086,
"poz-sbj",
"Latn",
}
m["sln"] = {
"Salinan",
1568938,
"qfa-iso",
"Latn",
}
m["slp"] = {
"Lamaholot",
6480777,
"poz-cet",
"Latn",
}
m["slr"] = {
"ซาลาร์",
33963,
"trk-ogz",
"Arab, Latn",
ancestors = "trk-eog",
}
m["sls"] = {
"Singapore Sign Language",
7512563,
"sgn",
}
m["slt"] = {
"Sila",
7514021,
"tbq-sil",
}
m["slu"] = {
"Selaru",
7447500,
"poz-cet",
"Latn",
}
m["slw"] = {
"Sialum",
7506694,
"ngf-huo",
"Latn",
}
m["slx"] = {
"Salampasu",
7403607,
"bnt-lun",
"Latn",
}
m["sly"] = {
"Selayar",
7447520,
"poz-ssw",
}
m["slz"] = {
"Ma'ya",
2291492,
"poz-hce",
"Latn",
}
m["sma"] = {
"ซามีใต้",
13293,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = "'ˈ"},
sort_key = "sma-sortkey",
}
m["smb"] = {
"Simbari",
7517427,
"ngf-ang",
"Latn",
}
m["smc"] = {
"Som",
7559081,
"ngf-fin",
"Latn",
}
m["smd"] = {
"Sama",
6407456,
"bnt-kmb",
"Latn",
}
m["smf"] = {
"Auwe",
3502072,
"paa-brd",
"Latn",
ancestors = "dnd",
}
m["smg"] = {
"Simbali",
56692,
"paa-bng",
"Latn",
}
m["smh"] = {
"Samei",
7409269,
"tbq-axi",
}
m["smj"] = {
"ซามีแบบลูเล",
56322,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = c.macron .. "'ˈ"},
sort_key = "smj-sortkey",
}
m["smk"] = {
"โบลีเนา",
2669235,
"phi",
"Latn, Tglg",
}
m["sml"] = {
"ซามาตอนกลาง",
3470593,
"poz-sbj",
"Latn",
}
m["smm"] = {
"Musasa",
6940122,
"inc-eas",
ancestors = "bh",
}
m["smn"] = {
"ซามีแบบอีนารี",
33462,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = c.dotbelow .. "'ˈ"},
sort_key = "smn-sortkey",
}
m["smp"] = {
"Samaritan Hebrew",
56502,
"sem-can",
"Samr",
translit = "Samr-translit",
-- Samr strip_diacritics, sort_key in [[Module:scripts/data]]
ancestors = "hbo",
}
m["smq"] = {
"Samo",
7409884,
"ngf-est",
"Latn",
}
m["smr"] = {
"Simeulue",
2992833,
"poz-nws",
"Latn",
}
m["sms"] = {
"Skolt Sami",
13271,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = c.dotbelow .. "'ˈ"},
sort_key = "sms-sortkey",
}
m["smt"] = {
"Simte",
7521268,
"tbq-kuk",
}
m["smu"] = {
"สมราย",
6583612,
"mkh-pea",
}
m["smv"] = {
"Samvedi",
6345632,
"inc-sou",
}
m["smw"] = {
"Sumbawa",
3182585,
"poz-bss",
"Latn",
}
m["smx"] = {
"Samba",
11120157,
"bnt-pen",
"Latn",
}
m["smy"] = {
"Semnani",
14531212,
"xme",
"fa-Arab, Latn",
}
m["smz"] = {
"Simeku",
7517534,
"paa-sbo",
"Latn",
}
m["snb"] = {
"Sebuyau",
7442836,
"poz-mly",
"Latn",
}
m["snc"] = {
"Sinaugoro",
4170719,
"poz-ocw",
"Latn",
}
m["sne"] = {
"Bau Bidayuh",
2891938,
"day",
"Latn",
}
m["snf"] = {
"Noon",
36304,
"alv-cng",
"Latn",
}
m["sng"] = {
"Sanga (Congo)",
3438316,
"bnt-lub",
"Latn",
}
m["sni"] = {
"Sensi",
7451029,
"sai-pan",
"Latn",
}
m["snj"] = {
"Riverain Sango",
25559751,
"crp",
"Latn",
ancestors = "ngb",
}
m["snk"] = {
"Soninke",
36660,
"dmn-snb",
"Latn",
}
m["snl"] = {
"Sangil",
3472206,
"phi",
"Latn",
}
m["snm"] = {
"Southern Ma'di",
15637273,
"csu-mma",
}
m["snn"] = {
"Siona",
3485116,
"sai-tuc",
"Latn",
}
m["sno"] = {
"Snohomish",
25559662,
"sal",
"Latn",
}
m["snp"] = {
"Siane",
7506812,
"ngf-kag",
"Latn",
}
m["snq"] = {
"Sangu (Gabon)",
36609,
"bnt-sir",
"Latn",
}
m["snr"] = {
"Sihan",
7513400,
"ngf-gum",
"Latn",
}
m["sns"] = {
"Nahavaq",
2160435,
"poz-vnc",
"Latn",
}
m["snu"] = {
"Senggi",
7929052,
"paa-brd",
"Latn",
}
m["snv"] = {
"Sa'ban",
3474891,
"poz-swa",
"Latn",
}
m["snw"] = {
"Selee",
36272,
"alv-ntg",
"Latn",
}
m["snx"] = {
"Sam",
7408387,
"ngf-min",
"Latn",
}
m["sny"] = {
"Saniyo-Hiyewe",
7418302,
"paa-spk",
"Latn",
}
m["snz"] = {
"Kou",
7525035, -- also 4803639
"ngf-eva",
"Latn",
}
m["soa"] = {
"โซ่ง",
7709159,
"tai-swe",
"Tavt, Thai",
--translit = "Tavt-translit",
sort_key = {
from = {"([ꪵꪶꪹꪻꪼ])([ꪀ-ꪯ])", "([เแโใไ])([ก-ฮ])"},
to = {"%2%1", "%2%1"}
},
}
m["sob"] = {
"Sobei",
3121035,
"poz-ocw",
"Latn",
}
m["soc"] = {
"Soko",
7555138,
"bnt-ske",
"Latn",
}
m["sod"] = {
"Songoora",
7561296,
"bnt-lgb",
"Latn",
}
m["soe"] = {
"Songomeno",
5713543,
"bnt-bsh",
"Latn",
}
m["sog"] = {
"Sogdian",
205979,
"ira-sgc",
"Sogd, Mani, Syrc, Sogo",
translit = {
Sogd = "Sogd-translit",
-- Mani translit in [[Module:scripts/data]]
Sogo = "Sogo-translit",
},
}
m["soh"] = {
"Aka (Sudan)",
3450949,
"sdv-eje",
"Latn",
}
m["soi"] = {
"Sonha",
12953890,
"inc-eas",
}
m["sok"] = {
"Sokoro",
3441303,
"cdc-est",
"Latn",
}
m["sol"] = {
"Solos",
3489591,
"poz-ocw",
}
m["soo"] = {
"Nsong",
12953148,
"bnt-bdz",
"Latn",
}
m["sop"] = {
"Songe",
3130911,
"bnt-lbn",
"Latn",
}
m["soq"] = {
"Kanasi",
11732656,
"ngf-dag",
"Latn",
}
m["sor"] = {
"Somrai",
3123566,
"cdc-est",
"Latn",
}
m["sos"] = {
"Seenku",
36274,
"dmn-smg",
}
m["sou"] = {
"ปักษ์ใต้",
56508,
"tai-swe",
"Thai",
--sort_key = "Thai-sortkey",
}
m["sov"] = {
"Sonsorolese",
13281,
"poz-mic",
"Latn",
}
m["sow"] = {
"Sowanda",
7571845,
"paa-brd",
"Latn",
}
m["sox"] = {
"Swo",
36604,
"bnt-mka",
"Latn",
}
m["soy"] = {
"Miyobe",
35913,
"alv-sav",
"Latn",
}
m["soz"] = {
"Temi",
13278,
"bnt-kka",
"Latn",
}
m["spb"] = {
"Sepa (Indonesia)",
18603687,
"poz-cma",
"Latn",
}
m["spc"] = {
"Sapé",
2888158,
nil,
"Latn",
}
m["spd"] = {
"Saep",
7398312,
"ngf-yag",
"Latn",
}
m["spe"] = {
"Sepa (New Guinea)",
7451725,
"poz-ocw",
"Latn",
}
m["spg"] = {
"Sian",
7506806,
"poz-bnn",
}
m["spi"] = {
"Saponi",
3915418,
"paa-lkp",
"Latn",
}
m["spk"] = {
"Sengo",
7450584,
"paa-ndu",
"Latn",
}
m["spl"] = {
"Selepet",
7447917,
"ngf-huo",
"Latn",
}
m["spm"] = {
"Sepen",
4701931,
"paa-ram",
"Latn",
}
m["spn"] = {
"Sanapaná",
3033556,
"sai-mas",
"Latn",
}
m["spo"] = {
"Spokane",
3493704,
"sal",
}
m["spp"] = {
"Supyire",
56284,
"alv-sma",
"Latn",
}
m["spr"] = {
"Saparua",
7420921,
"poz-cma",
"Latn",
}
m["sps"] = {
"Saposa",
3473187,
"poz-ocw",
}
m["spt"] = {
"Spiti Bhoti",
22080879,
"sit-las",
}
m["spu"] = {
"Sapuan",
7421168,
"mkh-ban",
}
m["spv"] = {
"Sambalpuri",
6433240,
"inc-eas",
"Orya",
translit = "or-translit",
ancestors = "or",
}
m["spx"] = {
"South Picene",
36688,
"itc-sbl",
"Ital, Latn",
-- Ital translit in [[Module:scripts/data]]
display_text = {
Latn = s["itc-Latn-displaytext"]
},
strip_diacritics = {
Latn = s["itc-Latn-stripdiacritics"],
},
sort_key = {
Latn = s["itc-Latn-sortkey"],
},
}
m["spy"] = {
"Sabaot",
7395896,
"sdv-kln",
}
m["sqa"] = {
"Shama-Sambuga",
3914392,
"nic-kmk",
"Latn",
}
m["sqh"] = {
"Shau",
3913925,
"nic-jer",
"Latn",
}
m["sqk"] = {
"Albanian Sign Language",
4709168,
"sgn",
}
m["sqm"] = {
"Suma",
11008431,
"gba-wes",
}
m["sqn"] = {
"Susquehannock",
3505736,
"iro-nor",
}
m["sqo"] = {
"Sorkhei",
3491964,
"ira-kms",
}
m["sqq"] = {
"Sou",
16979751,
"mkh-ban",
}
m["sqr"] = {
"Siculo-Arabic",
1069489,
"sem-arb",
"Arab",
}
m["sqs"] = {
"Sri Lankan Sign Language",
3915466,
"sgn",
}
m["sqt"] = {
"Soqotri",
13283,
"sem-sar",
"Arab, Latn",
}
m["squ"] = {
"Squamish",
2484579,
"sal",
"Latn",
}
m["sra"] = {
"Saruga",
7424699,
"ngf-han",
"Latn",
}
m["srb"] = {
"Sora",
13284,
"mun",
"Sora, Latn, Orya",
}
m["sre"] = {
"Sara",
33957,
"day",
}
m["srf"] = {
"Nafi",
6958174,
"poz-ocw",
}
m["srg"] = {
"Sulod",
7636489,
"phi",
}
m["srh"] = {
"Sarikoli",
33873,
"ira-shr",
"Latn, ug-Arab, Cyrl",
}
m["sri"] = {
"Siriano",
3485264,
"sai-tuc",
"Latn",
}
m["srk"] = {
"Serudung Murut",
7455497,
"poz-san",
}
m["srl"] = {
"Isirawa",
4203802,
"paa-tkw",
}
m["srm"] = {
"Saramaccan",
33779,
"crp",
"Latn",
ancestors = "en, pt",
}
m["srn"] = {
"Sranan Tongo",
33989,
"crp",
"Latn",
ancestors = "en",
}
m["srq"] = {
"Sirionó",
3027953,
"tup-gua",
"Latn",
}
m["srr"] = {
"Serer",
36284,
"alv-fwo",
"Latn",
}
m["srs"] = {
"Tsuut'ina",
20825,
"ath-nor",
"Latn",
}
m["srt"] = {
"Sauri",
7427547,
"paa-egb",
"Latn",
}
m["sru"] = {
"Suruí",
7646993,
"tup",
"Latn",
}
m["srv"] = {
"Waray Sorsogon",
18755610,
"phi",
"Latn",
}
m["srw"] = {
"Serua",
14916905,
"poz-cet",
}
m["srx"] = {
"Sirmauri",
7530505,
"him",
}
m["sry"] = {
"Sera",
7452602,
"poz-ocw",
"Latn",
}
m["srz"] = {
"Shahmirzadi",
12953126,
"ira-msh",
"fa-Arab",
}
m["ssb"] = {
"Southern Sama",
3470594,
"poz-sbj",
"Latn",
}
m["ssc"] = {
"Suba-Simbiti",
7630687,
"bnt-lok",
"Latn",
}
m["ssd"] = {
"Siroi",
10771067,
"ngf-rai",
"Latn",
}
m["sse"] = {
"Balangingi",
2880535,
"poz-sbj",
"Latn",
}
m["ssf"] = {
"Thao",
676492,
"map",
"Latn",
}
m["ssg"] = {
"Seimat",
3182581,
"poz-aay",
"Latn",
}
m["ssh"] = {
"Shihhi Arabic",
56571,
"sem-arb",
"Arab",
strip_diacritics = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {u(0x0671)},
to = {u(0x0627)}
},
}
m["ssi"] = {
"Sansi",
3309366,
"inc-nwe",
}
m["ssj"] = {
"Sausi",
7427605,
"ngf-eva",
"Latn",
}
m["ssk"] = {
"Sunam",
11002210,
"sit-kin",
}
m["ssl"] = {
"Western Sisaala",
11154776,
"nic-sis",
"Latn",
}
m["ssm"] = {
"Semnam",
7449713,
"mkh-asl",
"Latn",
}
m["sso"] = {
"Sissano",
7530937,
"poz-ocw",
"Latn",
}
m["ssp"] = {
"Spanish Sign Language",
3100814,
"sgn",
}
m["ssq"] = {
"So'a",
7572120,
"poz-cet",
"Latn",
}
m["ssr"] = {
"Swiss-French Sign Language",
12953483,
"sgn",
}
m["sss"] = {
"Sô",
3082037,
"mkh-kat",
}
m["sst"] = {
"Sinasina",
7521813,
"ngf-chw",
"Latn",
}
m["ssu"] = {
"Susuami",
7649752,
"ngf-ang",
"Latn",
}
m["ssv"] = {
"Shark Bay",
7489783,
"poz-vnn",
"Latn",
}
m["ssx"] = {
"Samberigi",
7409020,
"ngf-eng",
"Latn",
}
m["ssy"] = {
"ซาโฮ",
36353,
"cus-eas",
"Latn, Ethi, Arab",
}
m["ssz"] = {
"Sengseng",
7450601,
"poz-ocw",
"Latn",
}
m["stb"] = {
"Northern Subanen",
12953892,
"phi",
"Latn",
}
m["std"] = {
"Sentinelese",
568377,
"qfa-unc", -- presumed Ongan
}
m["ste"] = {
"Liana-Seti",
6539924,
"poz-cma",
}
m["stf"] = {
"Seta",
7456326,
"paa-tor",
"Latn",
}
m["stg"] = {
"Trieng",
22694648,
"mkh-ban",
}
m["sth"] = {
"Shelta",
36705,
"qfa-mix",
"Latn",
ancestors = "ga, en",
}
m["sti"] = {
"Bulo Stieng",
15771431,
"mkh-ban",
"Khmr, Latn",
}
m["stj"] = {
"Matya Samo",
10974879,
"dmn-sam",
"Latn",
}
m["stk"] = {
"Arammba",
3502094,
"paa-yam",
"Latn",
}
m["stm"] = {
"Setaman",
7456333,
"ngf-okk",
"Latn",
}
m["stn"] = {
"Owa",
1324132,
"poz-sls",
"Latn",
}
m["sto"] = {
"Stoney",
3033570,
"sio-dkt",
"Latn",
}
m["stp"] = {
"Southeastern Tepehuan",
12953917,
"azc-pim",
"Latn",
}
m["stq"] = {
"ฟรีเชียแบบซาเทอร์ลันท์",
27154,
"gmw-fri",
"Latn",
}
m["str"] = {
"Saanich",
36444,
"sal",
"Latn",
}
m["sts"] = {
"Shumashti",
33777,
"inc-kun",
"Arab",
}
m["stt"] = {
"Budeh Stieng",
12953891,
"mkh-ban",
}
m["stu"] = {
"Samtao",
25559550,
"mkh-pal",
}
m["stv"] = {
"Silt'e",
33880,
"sem-eth",
"Ethi",
}
m["stw"] = {
"Satawalese",
28477,
"poz-mic",
"Latn",
}
m["sty"] = {
"Siberian Tatar",
4418344,
"trk-kno",
"Cyrl",
}
m["sua"] = {
"Sulka",
7636341,
"qfa-iso", -- Papuan; isolate in Glottolog and Palmer (2018)
"Latn",
}
m["sub"] = {
"Suku",
12953160,
"bnt-yak",
"Latn",
}
m["suc"] = {
"Western Subanon",
16113894,
"phi",
"Latn",
}
m["sue"] = {
"Suena",
7634386,
"paa-bin",
"Latn",
}
m["sug"] = {
"Suganga",
7634706,
"ngf-okk",
"Latn",
}
m["sui"] = {
"Suki",
2089984,
"ngf-gsu",
"Latn",
}
m["suk"] = {
"Sukuma",
2638144,
"bnt-tkm",
"Latn",
}
-- suo (Bouni, Papua New Guinea, called Bouni-Bobe in Glottolog): not yet accepted; in the Sko/Skou family
m["suq"] = {
"Suri",
5364172,
"sdv",
}
m["sur"] = {
"Mwaghavul",
3440486,
"cdc-wst",
"Latn",
}
m["sus"] = {
"Susu",
33990,
"dmn-sya",
"Latn",
}
m["sut"] = {
"Subtiaba",
3915405,
"omq",
"Latn",
}
m["suv"] = {
"Puroik",
56408,
"sit-khb",
"Beng, Deva, Latn",
ancestors = "sit-khp-pro",
}
m["suw"] = {
"Sumbwa",
7637055,
"bnt-glb",
"Latn",
}
m["sux"] = {
"ซูเมอร์",
36790,
"qfa-iso",
"Xsux, Latn",
}
m["suy"] = {
"Suyá",
3505859,
"sai-nje",
"Latn",
}
m["suz"] = {
"Sunwar",
56549,
"sit-kiw",
"Deva, Sunu",
translit = {
Deva = "Deva-translit",
},
}
m["sva"] = {
"Svan",
34067,
"ccs",
"Geor, Cyrl",
translit = {
Geor = "sva-translit",
},
override_translit = true,
}
m["svb"] = {
"Ulau-Suain",
7878769,
"poz-ocw",
"Latn",
}
m["svc"] = {
"Vincentian Creole English",
3501785,
"crp",
"Latn",
ancestors = "en",
}
m["sve"] = {
"Serili",
7454834,
"poz-tim",
}
m["svk"] = {
"Slovakian Sign Language",
7541557,
"sgn",
}
m["svm"] = {
"Slavomolisano",
36254,
"zls",
"Latn",
ancestors = "sh",
}
m["svs"] = {
"Savosavo",
3130296,
"qfa-dis", -- Papuan; isolate in Glottolog; in the tentative Central Solomons family by Ross (2005) and Pedrós
-- (2015)
"Latn",
}
m["svx"] = {
"Skalvian",
3486125,
"bat-wes",
"Latn",
}
m["swb"] = {
"Maore Comorian",
34075,
"bnt-com",
"Latn",
sort_key = "bnt-com-sortkey",
}
m["swf"] = {
"Sere",
7453056,
"nic-ser",
"Latn",
}
m["swg"] = {
"Swabian",
327274,
"gmw-hgm",
"Latn",
ancestors = "gsw",
}
m["swi"] = {
"สุ่ย",
3112388,
"qfa-kms",
"Latn, Shui, Hani",
sort_key = {Hani = "Hani-sortkey"},
}
m["swj"] = {
"Sira",
36599,
"bnt-sir",
"Latn",
}
m["swl"] = {
"Swedish Sign Language",
36558,
"sgn",
}
m["swm"] = {
"Samosa",
7410037,
"ngf-nwh",
"Latn",
}
m["swn"] = {
"Sokna",
2988323,
"ber",
}
m["swo"] = {
"Shanenawa",
61974839,
"sai-pan",
"Latn",
}
m["swp"] = {
"Suau",
3502368,
"poz-ocw",
}
m["swq"] = {
"Sharwa",
56791,
"cdc-cbm",
"Latn",
}
m["swr"] = {
"Saweru",
3474649,
"paa-yaw",
"Latn",
}
m["sws"] = {
"Seluwasan",
7448845,
"poz-cet",
}
m["swt"] = {
"Sawila",
7428639,
"paa-tap",
"Latn",
}
m["swu"] = {
"Suwawa",
7650588,
"phi",
}
m["sww"] = {
"Sowa",
7571843,
"poz-vnn",
"Latn",
}
m["swx"] = {
"Suruahá",
3114402,
"auf",
}
m["swy"] = {
"Sarua",
56261,
"cdc-est",
"Latn",
}
m["sxb"] = {
"Suba",
33916,
"bnt-lok",
"Latn",
}
m["sxc"] = {
"Sicanian",
36335,
"qfa-unc", -- extinct, lack of data; only names deciphered
"Polyt",
}
m["sxe"] = {
"Sighu",
36431,
"bnt-kel",
"Latn",
}
m["sxg"] = {
"Shixing",
56337,
"sit-nax",
"Latn",
}
m["sxk"] = {
"Southern Kalapuya",
3192122,
"nai-klp",
}
m["sxl"] = {
"Selonian",
36491,
"bat-eas",
"Latn",
}
m["sxm"] = {
"สำเร",
6583615,
"mkh-pea",
}
m["sxn"] = {
"Sangir",
25714758,
"phi",
"Latn",
}
m["sxo"] = {
"Sorothaptic",
2762254,
}
m["sxr"] = {
"Saaroa",
716599,
"map",
"Latn",
}
m["sxs"] = {
"Sasaru",
3913384,
"alv-yek",
"Latn",
}
-- "sxu" "Upper Saxon" IS SUBSUMED INTO "gmw-ecg" "East Central German"
m["sxw"] = {
"Saxwe Gbe",
7428892,
"alv-pph",
"Latn",
}
m["sya"] = {
"Siang",
3482903,
}
m["syb"] = {
"Central Subanen",
12953893,
"phi",
"Latn",
}
m["syc"] = {
"ซีรีแอกคลาสสิก",
33538,
"sem-are",
"Syrc",
strip_diacritics = {remove_diacritics = c.macron .. c.diaer .. c.macronbelow .. u(0x0730) .. "-" .. u(0x0748)},
}
m["syi"] = {
"Seki",
36547,
"bnt-kel",
"Latn",
}
m["syk"] = {
"Sukur",
56292,
"cdc-cbm",
"Latn",
}
m["syl"] = {
"สิเลฏ",
2044560,
"inc-bas",
"Sylo, Beng",
ancestors = "inc-obn",
translit = "syl-translit",
}
m["sym"] = {
"Maya Samo",
10950421,
"dmn-sam",
"Latn",
}
m["syn"] = {
"Senaya",
33914,
"sem-nna",
}
m["syo"] = {
"Suoy",
7641864,
"mkh-pea",
}
m["sys"] = {
"Sinyar",
56840,
"csu",
"Latn",
}
m["syw"] = {
"Kagate",
12952538,
"sit-kyk",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["syx"] = {
"Osamayi",
7408415,
"bnt-kel",
"Latn",
}
m["syy"] = {
"Al-Sayyid Bedouin Sign Language",
2915457,
"sgn",
}
m["sza"] = {
"Semelai",
3111827,
"mkh-asl",
"Latn",
}
m["szb"] = {
"Ngalum",
11732516,
"ngf-okk",
"Latn",
}
m["szc"] = {
"Semaq Beri",
7449119,
"mkh-asl",
}
m["szd"] = {
"Seru",
7455488,
"poz-bnn",
"Latn",
}
m["sze"] = {
"Seze",
373683,
"omv-mao",
"Latn",
}
m["szg"] = {
"Sengele",
7450555,
"bnt-mon",
"Latn",
}
m["szl"] = {
"ไซลีเซีย",
30319,
"zlw-lch",
"Latn",
ancestors = "zlw-opl",
}
m["szn"] = {
"Sula",
3503403,
"poz-cma",
"Latn",
}
m["szp"] = {
"Suabo",
7630429,
"ngf-sbh",
"Latn",
}
m["szv"] = {
"Isubu",
35431,
"bnt-saw",
"Latn",
}
m["szw"] = {
"Sawai",
3447258,
"poz-hce",
"Latn",
}
m["szy"] = {
"ซากีซายา",
718269,
"map",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
mifenuct2b7k5jmhz3iypt0d3b2ocj6
มอดูล:languages/data/3/r
828
36369
5720764
5684165
2026-04-21T07:01:11Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720764
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["raa"] = {
"Dungmali",
56871,
"sit-kic",
}
m["rab"] = {
"Chamling",
3436664,
"sit-kic",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["rac"] = {
"Rasawa",
56443,
"paa-lkp",
"Latn",
}
m["rad"] = {
"Rade",
3429088,
"cmc",
"Latn",
}
m["raf"] = {
"Western Meohang",
17442461,
"sit-kie",
}
m["rag"] = {
"Logooli",
6667767,
"bnt-lok",
"Latn",
}
m["rah"] = {
"Rabha",
7278686,
"tbq-bdg",
"Beng, Latn",
}
m["rai"] = {
"Ramoaaina",
3418509,
"poz-ocw",
"Latn",
}
m["rak"] = {
"Tulu-Bohuai",
2908807,
"poz-aay",
"Latn",
}
m["ral"] = {
"Ralte",
7288392,
"tbq-kuk",
"Latn",
}
m["ram"] = {
"Canela",
2936334,
"sai-nje",
"Latn",
}
m["ran"] = {
"Riantana",
7322169,
"paa-kol",
"Latn",
}
m["rao"] = {
"Rao",
11732596,
"paa-ram",
"Latn",
}
m["rap"] = {
"ราปานูอี",
36746,
"poz-pep",
"Latn",
}
m["raq"] = {
"Saam",
7395644,
"sit-kic",
}
m["rar"] = {
"ราโรโตงา",
36745,
"poz-pep",
"Latn",
}
m["ras"] = {
"Tegali",
36522,
"nic-ras",
"Latn",
}
m["rat"] = {
"Razajerdi",
7299461,
"xme-ttc",
ancestors = "xme-ttc-eas",
}
m["rau"] = {
"Raute",
7296262,
"sit-gma",
"Deva, Latn",
translit = {
Deva = "Deva-translit",
},
}
m["rav"] = {
"Sampang",
3449115,
"sit-kic",
}
m["raw"] = {
"เรอหวั่ง",
542564,
"sit-nng",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.macron},
}
m["rax"] = {
"Rang",
3913345,
"alv-mum",
}
m["ray"] = {
"Rapa",
36417,
"poz-pep",
"Latn",
}
m["raz"] = {
"Rahambuu",
3417555,
"poz-btk",
}
m["rbb"] = {
"Rumai Palaung",
12953797,
"mkh-pal",
"Mymr",
}
m["rbk"] = {
"Northern Bontoc",
63311016,
"phi",
"Latn",
}
m["rbl"] = {
"Miraya Bikol",
18664557,
"phi",
"Latn",
}
m["rcf"] = {
"Réunion Creole French",
13198,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["rdb"] = {
"Rudbari",
12953072,
"xme",
ancestors = "xme-mid",
}
m["rea"] = {
"Rerau",
7314883,
"ngf-rai", -- placed in Nuru subfamily by Pawley-Hammarström
"Latn",
}
m["reb"] = {
"Rembong",
7311570,
"poz-cet",
"Latn",
}
m["ree"] = {
"Rejang Kayan",
3423957,
"poz",
"Latn",
}
m["reg"] = {
"Kara (Tanzania)",
6367567,
"bnt-haj",
}
m["rei"] = {
"Reli",
7310982,
}
m["rej"] = {
"Rejang",
3056339,
"poz",
"Rjng, Latn",
}
m["rel"] = {
"Rendille",
3447297,
"cus-som",
"Latn",
}
m["rem"] = {
"Remo",
3501825,
"sai-pan",
"Latn",
}
m["ren"] = {
"Rengao",
6583692,
"mkh",
}
m["rer"] = {
"Rer Bare",
12953857,
"qfa-unc", -- extinct, might not exist
}
m["res"] = {
"Reshe",
36258,
"nic-knj",
}
m["ret"] = {
"Retta",
7317113,
"paa-tap",
"Latn",
}
m["rey"] = {
"Reyesano",
3111857,
"sai-tac",
"Latn",
}
m["rga"] = {
"Roria",
7366825,
"poz-vnn",
"Latn",
}
m["rge"] = {
"Romano-Greek",
3915435,
"qfa-mix",
"Latn", -- and/or Grek?
ancestors = "rom, el",
}
m["rgk"] = {
"Rangkas",
7292645,
"sit-alm",
}
m["rgn"] = {
"โรมัญญา",
1641543,
"roa-emr",
"Latn",
wikimedia_codes = "eml",
}
m["rgr"] = {
"Resígaro",
3450504,
"awd",
"Latn",
}
m["rgs"] = {
"Southern Roglai",
12953069,
"cmc",
"Latn",
}
m["rgu"] = {
"Ringgou",
7334886,
"poz-tim",
}
m["rhg"] = {
"โรฮีนจา",
3241177,
"inc-bas",
"Rohg, Arab, Mymr, Latn, Beng",
ancestors = "inc-obn",
translit = {
Rohg = "Rohg-translit",
},
}
m["rhp"] = {
"Yahang",
8046792,
"paa-tor",
"Latn",
}
m["ria"] = {
"Reang",
12953063,
"tbq-bdg",
}
m["rif"] = {
"ริฟ",
34174,
"ber",
"Latn, Tfng, Arab",
translit = { Tfng = "Tfng-translit" },
standard_chars = {
Latn = "AaBbCcDdḌḍEeƐɛFfGgƔɣĞğHhḤḥIiJjKkLlMmNnPpQqRrŘřSsṢṣTtṬṭUuWwXxYyZzẒẓʷ",
Tfng = "ⴰⴳⴷⴹⴼⵖⵉⴽⵍⵎⵏⵓⵔⵙⵛⵜⵡⵢⵣⵥⴱⵀⵅⵊⴳⵯⵕⵚⵟⵇⵃⵄⴻⴽⵯ",
c.punc
},
}
m["ril"] = {
"Riang",
2741615,
"mkh-pal",
}
m["rim"] = {
"Nyaturu",
7193418,
"bnt-tkm",
"Latn",
}
m["rin"] = {
"Nungu",
3913350,
"nic-nin",
"Latn",
}
m["rir"] = {
"Ribun",
7322443,
"day",
"Latn",
}
m["rit"] = {
"Ritarungo",
7336730,
"aus-yol",
"Latn",
}
m["riu"] = {
"Riung",
7336938,
"poz-cet",
"Latn",
}
m["rjg"] = {
"Rajong",
7286370,
"poz-cet",
"Latn",
}
m["rji"] = {
"Raji",
7286138,
"sit-gma",
}
m["rjs"] = {
"ราชพังสี",
12640969,
"inc-krd",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["rka"] = {
"Kraol",
3199593,
"mkh-ban",
"Khmr", -- also Latn?
}
m["rkb"] = {
"Rikbaktsa",
2585357,
"sai-mje",
"Latn",
}
m["rkh"] = {
"Rakahanga-Manihiki",
3119695,
"poz-pep",
"Latn",
}
m["rki"] = {
"ยะไข่",
3450749,
"tbq-brm",
"Mymr",
ancestors = "obr",
}
m["rkm"] = {
"Marka",
36030,
"dmn-wmn",
"Latn",
}
m["rkt"] = {
"Kamta",
3241618,
"inc-krd",
"as-Beng, Latn",
translit = "as-translit",
}
m["rkw"] = {
"Arakwal",
34295800,
"aus-pam",
"Latn",
}
m["rma"] = {
"Rama",
3444486,
"cba",
}
m["rmb"] = {
"Rembarunga",
7311553,
"aus-gun",
"Latn",
}
m["rmc"] = {
"Carpathian Romani",
5045611,
"inc-rom",
"Latn",
}
m["rmd"] = {
"Traveller Danish",
12640779,
"qfa-mix",
"Latn",
ancestors = "rom, da",
}
m["rme"] = {
"Angloromani",
541279,
"qfa-mix",
"Latn",
ancestors = "rom, en",
}
m["rmf"] = {
"Kalo Finnish Romani",
2093214,
"inc-rom",
"Latn",
}
m["rmg"] = {
"Traveller Norwegian",
3177352,
"qfa-mix",
"Latn",
ancestors = "rom, no",
}
m["rmh"] = {
"Murkim",
4308074,
"paa-pau",
}
m["rmi"] = {
"Lomavren",
2495696,
"qfa-mix",
"Latn, Armn",
ancestors = "pra-sau, hy",
-- Armn translit in [[Module:scripts/data]]
override_translit = true,
}
m["rmk"] = {
"Romkun",
7363236,
"paa-ram",
"Latn",
}
m["rml"] = {
"Baltic Romani",
513736,
"inc-rom",
"Latn",
}
m["rmm"] = {
"Roma",
4414831,
}
m["rmn"] = {
"Balkan Romani",
1256701,
"inc-rom",
"Latn",
}
m["rmo"] = {
"Sinte Romani",
1793299,
"inc-rom",
"Latn",
}
m["rmp"] = {
"Rempi",
7312214,
"ngf-han",
"Latn",
}
m["rmq"] = {
"Caló",
35466,
"qfa-mix",
"Latn",
ancestors = "rom, osp, roa-opt",
}
m["rms"] = {
"Romanian Sign Language",
7362575,
"sgn",
}
m["rmt"] = {
"Domari",
35394,
"inc-cen",
"Latn, Arab, Hebr",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["rmu"] = {
"Tavringer Romani",
27808413,
"qfa-mix",
"Latn",
ancestors = "rom, sv",
}
m["rmv"] = {
"Romanova",
1298715,
"art",
type = "appendix-constructed",
}
m["rmw"] = {
"Welsh Romani",
2097387,
"inc-rom",
"Latn",
}
m["rmx"] = {
"Romam",
22694600,
"mkh",
}
m["rmy"] = {
"Vlax Romani",
2669199,
"inc-rom",
"Latn",
}
m["rmz"] = {
"Marma",
21403256,
"tbq-brm",
"Mymr",
ancestors = "obr",
}
m["rnd"] = {
"Ruwund",
7383564,
"bnt-lun",
}
m["rng"] = {
"Ronga",
2520717,
"bnt-tsr",
"Latn",
}
m["rnl"] = {
"Ranglong",
7292878,
}
m["rnn"] = {
"Roon",
7366335,
"poz-hce",
"Latn",
}
m["rnp"] = {
"Rongpo",
7365672,
"sit-whm",
}
m["rnw"] = {
"Rungwa",
7379873,
"bnt-mwi",
"Latn",
}
m["rob"] = {
"Tae'",
12473476,
"poz-ssw",
"Latn",
}
m["roc"] = {
"Cacgia Roglai",
2932485,
"cmc",
"Latn",
}
m["rod"] = {
"Rogo",
3914894,
"nic-kmk",
}
m["roe"] = {
"Ronji",
3441763,
"poz-ocw",
}
m["rof"] = {
"Rombo",
33330,
"bnt-chg",
"Latn",
}
m["rog"] = {
"Northern Roglai",
3439680,
"cmc",
"Latn",
}
m["rol"] = {
"Romblomanon",
13202,
"phi",
"Latn",
}
m["rom"] = {
"โรมานี",
13201,
"inc-rom",
"Latn, Cyrl",
}
m["roo"] = {
"Rotokas",
13203,
"paa-nbo",
"Latn",
}
m["rop"] = {
"Australian Kriol",
35671,
"crp",
"Latn",
ancestors = "en",
}
m["ror"] = {
"Rongga",
12473464,
}
m["rou"] = {
"Runga",
56793,
}
m["row"] = {
"Dela-Oenale",
5253046,
"poz-tim",
}
m["rpn"] = {
"Repanbitip",
7313900,
"poz-vnc",
"Latn",
}
m["rpt"] = {
"Rapting",
7294362,
"ngf-han",
"Latn",
}
m["rri"] = {
"Ririo",
2404190,
"poz-ocw",
}
m["rro"] = {
"Roro",
34197,
"poz-ocw",
"Latn",
}
m["rrt"] = {
"Arritinngithigh",
4796002,
nil,
"Latn",
}
m["rsb"] = {
"Romano-Serbian",
1268244,
"qfa-mix",
"Latn", -- and Cyrl?
ancestors = "rom, sh",
}
m["rsl"] = {
"มือรัสเซีย",
13210,
"sgn",
}
m["rsk"] = {
"รูซินแบบพันโนเนีย",
35660,
"zlw",
"Cyrl",
ancestors = "zlw-osk",
--translit = "rsk-translit",
sort_key = {
Cyrl = {
from = {"ґ", "є", "ї", "ь"},
to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "я" .. p[1]}
}
},
standard_chars = "АаБбВвГгҐґДдЕеЄєЖжЗзИиІіЇїЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЬьЮюЯя" .. c.punc:gsub("'", ""), -- Exclude apostrophe.
}
m["rsm"] = {
"Miriwoong Sign Language",
24090240,
"sgn",
}
m["rsn"] = {
"Rwandan Sign Language",
25041935,
"sgn",
}
m["rtc"] = {
"Rungtu",
7379867,
"tbq-kuk",
}
m["rth"] = {
"Ratahan",
3420026,
"phi",
"Latn",
}
m["rtm"] = {
"Rotuman",
36754,
"poz-pcc",
"Latn",
}
m["rtw"] = {
"Rathawi",
12953854,
"inc-bhi",
}
m["rub"] = {
"Gungu",
11165235,
"bnt-glb",
"Latn",
}
m["ruc"] = {
"Ruuli",
7383562,
"bnt-nyg",
}
m["rue"] = {
"รูซินแบบคาร์พาเทีย",
26245,
"zle",
"Cyrl",
ancestors = "zle-ort",
translit = "rue-translit",
strip_diacritics = {remove_diacritics = c.grave .. c.acute},
sort_key = "rue-sortkey",
}
m["ruf"] = {
"Luguru",
3437661,
"bnt-ruv",
"Latn",
}
m["rug"] = {
"Roviana",
3445546,
"poz-ocw",
"Latn",
}
m["ruh"] = {
"Ruga",
7378127,
}
m["rui"] = {
"Rufiji",
7377946,
"bnt-mbi",
}
m["ruk"] = {
"Che",
3915445,
"nic-nin",
"Latn",
}
m["ruo"] = {
"Istro-Romanian",
33622,
"roa-eas",
"Latn",
}
m["rup"] = {
"Aromanian",
29316,
"roa-eas",
"Latn, Polyt",
translit = {
-- FIXME: formerly no translit specified for Polyt; unclear if the default [[Module:grc-translit]] is
-- acceptable, so we disable it for now
Polyt = false,
},
sort_key = {
Latn = { from = {"ã"}, to = {"a"..p[1]} },
},
-- Polyt display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
wikimedia_codes = "roa-rup",
}
m["ruq"] = {
"Megleno-Romanian",
13358,
"roa-eas",
"Latn",
}
m["rut"] = {
"Rutul",
36757,
"cau-wsm",
"Cyrl, Latn",
translit = "rut-translit",
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
}
m["ruu"] = {
"Lanas Lobu",
12953676,
}
m["ruy"] = {
"Mala (Nigeria)",
3913381,
"nic-kau",
}
m["ruz"] = {
"Ruma",
3913326,
"nic-kau",
}
m["rwa"] = {
"Rawo",
3504269,
"paa-msk",
"Latn",
}
m["rwk"] = {
"Rwa",
7985624,
"bnt-chg",
}
m["rwm"] = {
"Amba",
788423,
"bnt-kbi",
"Latn",
}
m["rwo"] = {
"Rawa",
11732598,
"ngf-fin",
"Latn",
}
m["rxd"] = {
"Ngardi",
7022063,
}
m["rxw"] = {
"Karuwali",
6881575,
}
m["ryn"] = {
"อามามิโอชิมะเหนือ",
2840988,
"jpx-nry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["rys"] = {
"ยาเอยามะ",
34203,
"jpx-sry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["ryu"] = {
"โอกินาวะ",
34233,
"jpx-nry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["rzh"] = {
"Razihi",
16911222,
"sem-osa",
"Arab",
ancestors = "sem-srb",
}
return require("Module:languages").finalizeData(m, "language")
trlsbvxrbif6e5qi3zud6fwywprrtkm
มอดูล:languages/data/3/p
828
36371
5720763
5684164
2026-04-21T07:01:09Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720763
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["pab"] = {
"Pareci",
3504312,
"awd",
"Latn",
}
m["pac"] = {
"ปาโกะห์",
3441136,
"mkh-kat",
"Latn",
}
m["pad"] = {
"Paumarí",
389827,
"auf",
"Latn",
}
m["pae"] = {
"Pagibete",
7124357,
"bnt-bta",
"Latn",
}
m["paf"] = {
"Paranawát",
12953806,
"tup-gua",
"Latn",
}
m["pag"] = {
"Pangasinan",
33879,
"phi",
"Latn, Tglg",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer},
},
}
m["pah"] = {
"Tenharim",
10266010,
"tup-gua",
"Latn",
}
m["pai"] = {
"Pe",
3914871,
"nic-tar",
"Latn",
}
m["pak"] = {
"Parakanã",
12953804,
"tup-gua",
"Latn",
}
m["pal"] = {
"เปอร์เซียกลาง",
32063,
"ira-swi",
"Latn, Phli, pal-Avst, Mani, Phlp, Phlv", -- Latn for translit; Phlv not in Unicode
translit = {
Phli = "Phli-translit",
["pal-Avst"] = "Avst-translit",
-- Mani translit in [[Module:scripts/data]]
},
ancestors = "peo",
}
m["pam"] = {
"กาปัมปางัน",
36121,
"phi",
"Latn, Kulit",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}
},
standard_chars = {
Latn = "AaBbDdEeGgHhIiKkLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {
Latn = "tl-sortkey"
},
}
m["pao"] = {
"Northern Paiute",
3360656,
"azc-num",
"Latn",
}
m["pap"] = {
"ปาเปียเมนตู",
33856,
"crp",
"Latn",
ancestors = "pt",
}
m["paq"] = {
"Parya",
1135134,
"inc-cen",
}
m["par"] = {
"Panamint",
33926,
"azc-num",
"Latn",
}
m["pas"] = {
"Papasena",
7132508,
"paa-lkp",
"Latn",
}
m["pau"] = {
"ปาเลา",
33776,
"poz",
"Latn, Kana",
sort_key = {
Kana = "Kana-sortkey"
},
}
m["pav"] = {
"Wari'",
3027909,
"sai-cpc",
"Latn",
}
m["paw"] = {
"Pawnee",
56751,
"cdd",
"Latn",
strip_diacritics = {remove_diacritics = c.acute},
}
m["pax"] = {
"Pankararé",
25559779,
nil,
"Latn",
}
m["pay"] = {
"Pech",
4898889,
"cba",
"Latn",
}
m["paz"] = {
"Pankararú",
7131310,
nil,
"Latn",
}
m["pbb"] = {
"Páez",
33677,
nil,
"Latn",
}
m["pbc"] = {
"Patamona",
3915921,
"sai-pem",
"Latn",
}
m["pbe"] = {
"Mezontla Popoloca",
42365630,
"omq-pop",
"Latn",
}
m["pbf"] = {
"Coyotepec Popoloca",
5180100,
"omq-pop",
"Latn",
}
m["pbg"] = {
"Paraujano",
3501747,
"awd-taa",
"Latn",
}
m["pbh"] = {
"Panare",
56610,
"sai-ven",
"Latn",
}
m["pbi"] = {
"Podoko",
3515096,
"cdc-cbm",
"Latn",
}
m["pbl"] = {
"Mak (Nigeria)",
3915349,
"alv-bwj",
"Latn",
}
m["pbm"] = {
"Puebla Mazatec",
31102530,
"omq-maz",
"Latn",
}
m["pbn"] = {
"Kpasam",
3914902,
"alv-mye",
"Latn",
}
m["pbo"] = {
"Papel",
36314,
"alv-pap",
"Latn",
}
m["pbp"] = {
"Badyara",
35095,
"alv-ten",
"Latn",
}
m["pbr"] = {
"Pangwa",
3847550,
"bnt-bki",
"Latn",
}
m["pbs"] = {
"Central Pame",
3361763,
"omq",
"Latn",
}
m["pbv"] = {
"ปนัร",
3501850,
"aav-pkl",
"Latn",
}
m["pby"] = {
"Pyu (New Guinea)",
2567925,
"qfa-dis", -- Papuan; isolate per Glottolog, in a putative Arai-Samaia family in Usher (2020)
"Latn",
}
m["pca"] = {
"Santa Inés Ahuatempan Popoloca",
42365276,
"omq-pop",
"Latn",
}
m["pcb"] = {
"Pear",
6583669,
"mkh-pea",
"Khmr",
}
m["pcc"] = {
"ปู้อี",
35100,
"tai-nor",
"Latn, Hani",
sort_key = {
Hani = "Hani-sortkey"
},
}
m["pcd"] = {
"ปีการ์",
34024,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["pce"] = {
"Ruching Palaung",
12953798,
"mkh-pal",
"Mymr",
}
m["pcf"] = {
"Paliyan",
7127643,
"dra-tam",
}
m["pcg"] = {
"Paniya",
7131211,
"dra-mal",
}
m["pch"] = {
"Pardhan",
7133207,
"dra-gon",
}
m["pci"] = {
"Duruwa",
56753,
"dra-pgd",
"Deva, Orya",
translit = {
Deva = "Deva-translit",
Orya = "Orya-translit",
},
}
m["pcj"] = {
"Parenga",
3111396,
"mun",
}
m["pck"] = {
"Paite",
12952337,
"tbq-kuk",
}
m["pcl"] = {
"Pardhi",
7136554,
"inc-bhi",
}
m["pcm"] = {
"Nigerian Pidgin",
33655,
"crp",
"Latn",
ancestors = "en",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron .. c.macronbelow},
sort_key = {
remove_diacritics = c.tilde,
from = {"ẹ", "gb", "kp", "ọ", "sh", "zh"},
to = {"e" .. p[1], "g" .. p[1], "k" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1]}
},
}
m["pcn"] = {
"Piti",
3913375,
"nic-kne",
"Latn",
}
m["pcp"] = {
"Pacahuara",
2591165,
"sai-pan",
"Latn",
}
m["pcw"] = {
"Pyapun",
3438807,
nil,
"Latn",
}
m["pda"] = {
"Anam",
3501930,
"ngf-pom",
"Latn",
}
m["pdc"] = {
"เยอรมันแบบเพนซิลเวเนีย",
22711,
"gmw-hgm",
"Latn",
ancestors = "gmw-rfr",
}
m["pdi"] = {
"Pa Di",
3359940,
nil,
"Latn",
}
m["pdn"] = {
"Fedan",
7206699,
"poz-ocw",
"Latn",
}
m["pdo"] = {
"Padoe",
3360370,
"poz-btk",
"Latn",
}
m["pdt"] = {
"เพลาท์ดิทช์",
1751432,
"gmw-lgm",
"Latn",
ancestors = "nds-de",
}
m["pdu"] = {
"กะยัน",
7123283,
"kar",
"Latn",
}
m["pea"] = {
"Peranakan Indonesian",
653415,
"crp",
"Latn",
ancestors = "ms",
}
m["peb"] = {
"Eastern Pomo",
3396032,
"nai-pom",
"Latn",
}
m["ped"] = {
"Mala (New Guinea)",
11732569,
"ngf-kau",
"Latn",
}
m["pee"] = {
"Taje",
12953902,
nil,
"Latn",
}
m["pef"] = {
"Northeastern Pomo",
3396018,
"nai-pom",
"Latn",
}
m["peg"] = {
"Pengo",
56758,
"dra-kki",
"Orya",
translit = "Orya-translit",
}
m["peh"] = {
"Bonan",
32983,
"xgn-shr",
"Latn",
}
m["pei"] = {
"Chichimeca-Jonaz",
3915427,
"omq-otp",
"Latn",
}
m["pej"] = {
"Northern Pomo",
3396021,
"nai-pom",
"Latn",
}
m["pek"] = {
"Penchal",
3374631,
"poz-aay",
"Latn",
}
m["pel"] = {
"Pekal",
3241781,
nil,
"Latn",
}
m["pem"] = {
"Phende",
7162372,
"bnt-pen",
"Latn",
}
m["peo"] = {
"เปอร์เซียเก่า",
35225,
"ira-swi",
"Xpeo, Latn",
--translit = "peo-translit",
}
m["pep"] = {
"Kunja",
6444807,
"paa-yam",
"Latn",
}
m["peq"] = {
"Southern Pomo",
3396023,
"nai-pom",
"Latn",
}
-- "pes" is treated as "fa" (or as etymology-only), see [[WT:LT]]
m["pev"] = {
"Pémono",
3439012,
"sai-map",
"Latn",
}
m["pex"] = {
"Petats",
3376353,
"poz-ocw",
"Latn",
}
m["pey"] = {
"Petjo",
940486,
nil,
"Latn",
}
m["pez"] = {
"Eastern Penan",
18638342,
"poz-swa",
"Latn",
}
m["pfa"] = {
"Pááfang",
3063517,
"poz-mic",
"Latn",
}
m["pfe"] = {
"Peere",
36377,
"alv-dur",
"Latn",
}
m["pga"] = {
"Juba Arabic",
1262143,
"crp",
"Latn",
ancestors = "apd",
}
m["pgd"] = {
"คานธาระ",
3124623,
"inc-mid",
"Deva, Khar",
ancestors = "inc-ash",
translit = {
Deva = "Deva-translit",
Khar = "Khar-translit",
},
}
m["pgg"] = {
"ปังควาฬฺ",
13600429,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
Takr = "Takr-translit",
},
}
m["pgi"] = {
"Pagi",
7124354,
"paa-brd",
"Latn",
}
m["pgk"] = {
"Rerep",
586907,
"poz-vnc",
"Latn",
}
m["pgl"] = {
"Primitive Irish",
3320030,
"cel-gae",
"Ogam, Latn",
translit = "pgl-translit",
}
m["pgn"] = {
"Paelignian",
65455883,
"itc-sbl",
"Ital, Latn",
-- Ital translit in [[Module:scripts/data]]
display_text = {
Latn = s["itc-Latn-displaytext"]
},
strip_diacritics = {
Latn = s["itc-Latn-stripdiacritics"]
},
sort_key = {
Latn = s["itc-Latn-sortkey"]
},
}
m["pgs"] = {
"Pangseng",
3914027,
"alv-mum",
"Latn",
}
m["pgu"] = {
"Pagu",
7124462,
"paa-nha",
"Latn",
}
m["pgz"] = {
"Papua New Guinean Sign Language",
25044405,
"sgn",
}
m["pha"] = {
"Pa-Hng",
2625410,
"hmn",
}
m["phd"] = {
"Phudagi",
7188289,
}
m["phg"] = {
"Phuong",
7188376,
"mkh-kat",
}
m["phh"] = {
"Phukha",
7188298,
"tbq-phw",
}
m["phk"] = {
"พ่าเก",
7675798,
"tai-swe",
"Mymr",
translit = "aio-phk-translit",
display_text = s["phk-displaytext"],
strip_diacritics = s["phk-stripdiacritics"],
}
m["phl"] = {
"Palula",
2449549,
"inc-dng",
"Latn, ur-Arab",
strip_diacritics = {
-- "هٔ" (U+0647 + U+0654) and "ۂ" (U+06C2) both become "ہ" (U+06C1); hamzatu l-waṣli "ٱ" (U+0671) becomes a regular alif
from = {"هٔ", "ۂ", "ٱ"},
to = {"ہ", "ہ", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
}
m["phm"] = {
"Phimbi",
11007144,
"bnt-sna",
"Latn",
}
m["phn"] = {
"ฟินิเชีย",
36734,
"sem-can",
"Phnx",
-- Phnx translit in [[Module:scripts/data]]
}
m["pho"] = {
"ผู้น้อย",
7188361,
"tbq-bis",
}
m["phq"] = {
"Phana'",
7180427,
"tbq-sil",
}
m["phr"] = {
"Pahari-Potwari",
33739,
"inc-pan",
"pa-Arab, Guru",
ancestors = "lah",
translit = {
Guru = "Guru-translit",
["pa-Arab"] = "pa-Arab-translit",
},
strip_diacritics = {
["pa-Arab"] = {
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna,
from = {"ݨ", "ࣇ"},
to = {"ن", "ل"}
},
}
}
m["pht"] = {
"ผู้ไท",
3626597,
"tai-swe",
"Thai",
}
m["phu"] = {
"พวน",
3915665,
}
m["phv"] = {
"Pahlavani",
7124567,
}
m["phw"] = {
"Phangduwali",
12953036,
"sit-kie",
ancestors = "ybh",
}
m["pia"] = {
"Pima Bajo",
3388544,
"azc-pim",
"Latn",
}
m["pib"] = {
"Yine",
3135432,
"awd",
"Latn",
}
m["pic"] = {
"Pinji",
36296,
"bnt-tso",
"Latn",
}
m["pid"] = {
"Piaroa",
3382207,
nil,
"Latn",
}
m["pie"] = {
"Piro",
7198055,
"nai-kta",
"Latn",
}
m["pif"] = {
"Pingelapese",
36421,
"poz-mic",
"Latn",
}
m["pig"] = {
"Pisabo",
966883,
"sai-pan",
"Latn",
}
m["pih"] = {
"Pitcairn-Norfolk",
36554,
"crp",
"Latn",
ancestors = "en",
}
m["pii"] = {
"Pini",
10631925,
}
m["pij"] = {
"Pijao",
7193519,
}
m["pil"] = {
"Yom",
36893,
"nic-yon",
}
m["pim"] = {
"Powhatan",
2270532,
"alg-eas",
"Latn",
}
m["pin"] = {
"Piame",
7190042,
"paa-spk",
"Latn",
}
m["pio"] = {
"Piapoco",
3382208,
"awd-nwk",
"Latn",
}
m["pip"] = {
"Pero",
2411063,
"cdc-wst",
}
m["pir"] = {
"Piratapuyo",
3389119,
"sai-tuc",
"Latn",
}
m["pis"] = {
"Pijin",
36699,
"crp",
"Latn",
ancestors = "en",
}
m["pit"] = {
"Pitta-Pitta",
6433116,
"aus-kar",
"Latn",
}
m["piu"] = {
"Pintupi-Luritja",
2591175,
"aus-pam",
"Latn",
}
m["piv"] = {
"Pileni",
2976736,
"poz-pnp",
"Latn",
}
m["piw"] = {
"Pimbwe",
3894132,
"bnt-mwi",
}
m["pix"] = {
"Piu",
7199578,
}
m["piy"] = {
"Piya-Kwonci",
3440492,
}
m["piz"] = {
"Pije",
3388339,
"poz-cln",
"Latn",
}
m["pjt"] = {
"Pitjantjatjara",
2982063,
"aus-pam",
"pjt-Latn",
}
m["pkb"] = {
"Kipfokomo",
7208693,
"bnt-sab",
"Latn",
}
m["pkc"] = {
"แพ็กเจ",
4841264,
"qfa-kor",
"Hani, Kana",
sort_key = {
Hani = "Hani-sortkey",
Kana = "Kana-sortkey"
},
}
m["pkg"] = {
"Pak-Tong",
3360711,
}
m["pkh"] = {
"Pankhu",
7130962,
"tbq-kuk",
}
m["pkn"] = {
"Pakanha",
954916,
"aus-pmn",
}
m["pko"] = {
"Pökoot",
36323,
"sdv-kln",
"Latn",
}
m["pkp"] = {
"ปูกาปูกา",
36447,
"poz-pnp",
"Latn",
}
m["pkr"] = {
"Attapady Kurumba",
16835180,
"dra-imd",
"Mlym",
-- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["pks"] = {
"Pakistan Sign Language",
22964057,
"sgn",
}
m["pkt"] = {
"Maleng",
6583562,
"mkh-vie",
}
m["pku"] = {
"Paku",
2932604,
"poz-bre",
"Latn",
}
m["pla"] = {
"Miani",
12952844,
"ngf-kau",
"Latn",
}
m["plb"] = {
"Polonombauk",
7225957,
"poz-vnn",
"Latn",
}
m["plc"] = {
"ปาลาวาโนตอนกลาง",
12953795,
"phi",
"Latn",
}
m["ple"] = {
"Palu'e",
2196866,
"poz-cet",
"Latn",
}
m["plg"] = {
"Pilagá",
2748259,
"sai-guc",
"Latn",
}
m["plh"] = {
"Paulohi",
7155331,
"poz-cma",
}
m["plj"] = {
"Polci",
3914383,
}
m["plk"] = {
"Kohistani Shina",
12953882,
"inc-shn",
"ur-Arab, Latn",
}
m["pll"] = {
"Shwe Palaung",
27941664,
"mkh-pal",
"Mymr",
}
m["pln"] = {
"Palenquero",
36665,
"crp",
"Latn",
ancestors = "es",
}
m["plo"] = {
"Oluta Popoluca",
5908687,
"nai-miz",
"Latn",
}
m["plq"] = {
"Palaic",
36582,
"ine-ana",
"Xsux",
}
m["plr"] = {
"Palaka Senoufo",
36346,
"alv-snf",
"Latn",
}
m["pls"] = {
"San Marcos Tlalcoyalco Popoloca",
12641692,
"omq-pop",
"Latn",
}
m["plu"] = {
"Palikur",
3073448,
"awd",
"Latn",
}
m["plv"] = {
"ปาลาวาโนตะวันตกเฉียงใต้",
15614922,
"phi",
"Latn",
}
m["plw"] = {
"ปาลาวาโนแบบบรูกส์พอยต์",
12953796,
"phi",
"Latn",
}
m["ply"] = {
"Bolyu",
3361723,
"mkh-pkn",
"Latn",
}
m["plz"] = {
"Paluan",
7128795,
nil,
"Latn",
}
m["pma"] = {
"Paamese",
3130286,
"poz-vnc",
"Latn",
}
m["pmb"] = {
"Pambia",
36267,
"znd",
"Latn",
}
m["pmd"] = {
"Pallanganmiddang",
7127734,
"aus-pam",
"Latn",
}
m["pme"] = {
"Pwaamèi",
3411152,
"poz-cln",
"Latn",
}
m["pmf"] = {
"Pamona",
3513320,
"poz-kal",
"Latn",
}
m["pmi"] = {
"Northern Pumi",
3403245,
"sit-qia",
}
m["pmj"] = {
"Southern Pumi",
3403246,
"sit-qia",
}
m["pmk"] = {
"Pamlico",
111366045,
"alg-eas",
"Latn",
}
m["pml"] = {
"Sabir",
636479,
"crp",
"Latn",
ancestors = "lij, pro, vec",
}
m["pmm"] = {
"Pol",
36408,
"bnt-kak",
"Latn",
}
m["pmn"] = {
"Pam",
7129017,
"alv-mbm",
}
m["pmo"] = {
"Pom",
7227178,
"poz-hce",
"Latn",
}
m["pmq"] = {
"Northern Pame",
3361762,
"omq",
"Latn",
}
m["pmr"] = {
"Paynamar",
3450824,
"ngf-sog",
"Latn",
}
m["pms"] = {
"ปีเยมอนเต",
15085,
"roa-git",
"Latn",
}
m["pmt"] = {
"Tuamotuan",
36763,
"poz-pep",
"Latn",
}
m["pmu"] = {
"Mirpur Panjabi",
6874480,
}
m["pmw"] = {
"Plains Miwok",
3391031,
"nai-utn",
"Latn",
}
m["pmx"] = {
"Poumei Naga",
12952910,
"tbq-anp",
}
m["pmy"] = {
"Papuan Malay",
12473446,
"crp",
"Latn",
ancestors = "ms",
}
m["pmz"] = {
"Southern Pame",
3361765,
"omq",
"Latn",
}
m["pna"] = {
"Punan Bah-Biau",
4842201,
"poz-bnn",
"Latn",
}
m["pnc"] = {
"Pannei",
7131391,
}
m["pnd"] = {
"Mpinda",
63308194,
"bnt-kmb",
}
m["pne"] = {
"Western Penan",
12953808,
"poz-swa",
"Latn",
}
m["png"] = {
"Pongu",
36282,
"nic-shi",
}
m["pnh"] = {
"Penrhyn",
3130301,
"poz-pep",
"Latn",
}
m["pni"] = {
"Aoheng",
4778608,
"poz",
"Latn",
}
m["pnj"] = {
"Pinjarup",
33103591,
}
m["pnk"] = {
"Paunaka",
2064378,
"awd",
"Latn",
}
m["pnl"] = {
"Paleni",
7127118,
"alv-wan",
"Latn",
}
m["pnm"] = {
"Punan Batu",
7259892,
}
m["pnn"] = {
"Pinai-Hagahai",
5638511,
"paa-pia",
"Latn",
}
m["pno"] = {
"Panobo",
3141869,
"sai-pan",
"Latn",
}
m["pnp"] = {
"Pancana",
7130204,
}
m["pnq"] = {
"Pana (West Africa)",
7129739,
"nic-gnn",
"Latn",
}
m["pnr"] = {
"Panim",
11732562,
"ngf-gum",
"Latn",
}
m["pns"] = {
"Ponosakan",
7227956,
"phi",
"Latn",
}
m["pnt"] = {
"Pontic Greek",
36748,
"grk",
"Grek, Latn, Cyrl",
ancestors = "gkm",
translit = {
Grek = "el-translit"
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["pnu"] = {
"Jiongnai Bunu",
56325,
"hmn",
}
m["pnv"] = {
"Pinigura",
10631927,
"aus-psw",
"Latn",
}
m["pnw"] = {
"Panyjima",
3913830,
"aus-nga",
"Latn",
}
m["pnx"] = {
"Phong-Kniang",
3914627,
"mkh",
}
m["pny"] = {
"Pinyin",
36250,
"nic-nge",
"Latn",
}
m["pnz"] = {
"Pana (Central Africa)",
36241,
"alv-mbm",
"Latn",
}
m["poc"] = {
"Poqomam",
36416,
"myn",
"Latn",
}
m["poe"] = {
"San Juan Atzingo Popoloca",
12953819,
"omq-pop",
"Latn",
}
m["pof"] = {
"Poke",
7208577,
"bnt-ske",
}
m["pog"] = {
"Potiguára",
56722,
"tup-gua",
"Latn",
}
m["poh"] = {
"Poqomchi'",
36414,
"myn",
"Latn",
}
m["poi"] = {
"Highland Popoluca",
7511556,
"nai-miz",
"Latn",
}
m["pok"] = {
"Pokangá",
25559704,
"sai-tuc",
"Latn",
}
m["pom"] = {
"Southeastern Pomo",
3396025,
"nai-pom",
"Latn",
}
m["pon"] = {
"Pohnpeian",
28422,
"poz-mic",
"Latn",
}
m["poo"] = {
"Central Pomo",
3396020,
"nai-pom",
"Latn",
}
m["pop"] = {
"Pwapwâ",
3411153,
"poz-cln",
"Latn",
}
m["poq"] = {
"Texistepec Popoluca",
5908707,
"nai-miz",
"Latn",
}
m["pos"] = {
"Sayula Popoluca",
5908722,
"nai-miz",
"Latn",
}
m["pot"] = {
"Potawatomi",
56749,
"alg",
"Latn",
}
m["pov"] = {
"ครีโอลกินี-บิสเซา",
33339,
"crp",
"Latn",
ancestors = "pt",
}
m["pow"] = {
"San Felipe Otlaltepec Popoloca",
25559598,
"omq-pop",
"Latn",
}
m["pox"] = {
"Polabian",
36741,
"zlw-lch",
"Latn",
}
m["poy"] = {
"Pogolo",
2429648,
"bnt-kil",
}
m["ppa"] = {
"Pao",
7132069,
}
m["ppe"] = {
"Papi",
7132809,
}
m["ppi"] = {
"Paipai",
56726,
"nai-yuc",
"Latn",
}
m["ppk"] = {
"Uma",
7881036,
"poz-kal",
"Latn",
}
m["ppl"] = {
"ปีปิล", -- this name (Pipil) is used because นาวัต (Nawat/Nahuat) reads the same as นาวัตล์ (Nahuatl) in Thai
1186896,
"azc-nah",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.macron},
}
m["ppm"] = {
"Papuma",
7133239,
"poz-hce",
"Latn",
}
m["ppn"] = {
"Papapana",
3362757,
"poz-ocw",
"Latn",
}
m["ppo"] = {
"Folopa",
5464843,
"paa-teb",
"Latn",
}
m["ppq"] = {
"Pei",
7160903,
"paa-wal",
"Latn",
}
m["pps"] = {
"San Luís Temalacayuca Popoloca",
25559602,
"omq-pop",
"Latn",
}
m["ppt"] = {
"Pa",
3504757,
"paa-kae",
"Latn",
}
m["ppu"] = {
"Papora",
2094884,
"map",
"Latn",
}
m["pqa"] = {
"Pa'a",
3441315,
"cdc-wst",
}
m["pqm"] = {
"Malecite-Passamaquoddy",
3183144,
"alg-eas",
"Latn",
}
m["pra"] = {
"ปรากฤต",
192170,
"inc-mid",
"Brah, Deva, Gujr, Knda",
ancestors = "inc-ash",
translit = {
-- Brah translit in [[Module:scripts/data]]
Deva = "Deva-translit",
Gujr = "Gujr-translit",
Knda = "Knda-translit",
},
strip_diacritics = {
-- FIXME: separate by script
from = {"ऎ", "ऒ", u(0x0946), u(0x094A), "य़", "ಯ಼", u(0x11071), u(0x11072), u(0x11073), u(0x11074)},
to = {"ए", "ओ", u(0x0947), u(0x094B), "य", "ಯ", "𑀏", "𑀑", u(0x11042), u(0x11044)}
},
}
m["prc"] = {
"Parachi",
2640637,
"ira-orp",
"Arab",
}
-- "prd" is not included, see [[WT:LT]]
m["pre"] = {
"Principense",
36520,
"crp",
"Latn",
ancestors = "pt",
}
m["prf"] = {
"Paranan",
7135433,
"phi",
}
m["prg"] = {
"Old Prussian",
35501,
"bat-wes",
"Latn",
}
m["prh"] = {
"Porohanon",
6583710,
"phi",
"Latn",
}
m["pri"] = {
"Paicî",
732131,
"poz-cln",
"Latn",
}
m["prk"] = {
"Parauk",
3363719,
"mkh-pal",
"Latn",
}
m["prl"] = {
"Peruvian Sign Language",
3915508,
"sgn",
}
m["prm"] = {
"Kibiri",
56745,
"qfa-iso", -- Papuan; isolate in Glottolog and Wurm; suggested grouping with Kiwaian languages by Ross based only on 1sg and 2sg pronouns
"Latn",
}
m["prn"] = {
"Prasuni",
32689,
"nur-nor",
}
m["pro"] = {
"อุตซิตาเก่า",
2779185,
"roa-ocr",
"Latn",
sort_key = {remove_diacritics = c.cedilla},
}
-- "prp" is not included, see [[WT:LT]]
m["prq"] = {
"Ashéninka Perené",
3450601,
"awd",
"Latn",
}
m["prr"] = {
"Puri",
7261687,
}
-- "prs" is treated as "fa" (or as etymology-only), see [[WT:LT]]
m["prt"] = {
"Phai",
7180184,
"mkh",
}
m["pru"] = {
"Puragi",
7260800,
"ngf-sbh",
"Latn",
}
m["prw"] = {
"Parawen",
7136291,
"ngf-num",
"Latn",
}
m["prx"] = {
"Purik",
567905,
"sit-lab",
}
m["prz"] = {
"Providencia Sign Language",
3322084,
"sgn",
}
m["psa"] = {
"Asue Awyu",
11266334,
"ngf-gaw",
"Latn",
}
m["psc"] = {
"Persian Sign Language",
7170221,
"sgn",
}
m["psd"] = {
"Plains Indian Sign Language",
2380124,
"sgn",
}
m["pse"] = {
"มลายูตอนกลาง", -- "Central Malay"; this does not refer to central Malaysia: the language is spoken in Indonesia.
3367751,
"poz-mly",
"Latn, Rjng",
}
m["psg"] = {
"Penang Sign Language",
4924925,
"sgn",
}
m["psh"] = {
"Southwest Pashayi",
16112270,
"inc-pas",
"fa-Arab",
}
m["psi"] = {
"Southeast Pashayi",
23713536,
"inc-pas",
"fa-Arab",
}
m["psl"] = {
"Puerto Rican Sign Language",
7258608,
"sgn-fsl",
}
m["psm"] = {
"Pauserna",
2912846,
"tup-gua",
"Latn",
}
m["psn"] = {
"Panasuan",
7130113,
"poz",
}
m["pso"] = {
"Polish Sign Language",
3915194,
"sgn-gsl",
}
m["psp"] = {
"Philippine Sign Language",
3551357,
"sgn-fsl",
}
m["psq"] = {
"Pasi",
7142091,
"paa-spk",
"Latn",
}
m["psr"] = {
"Portuguese Sign Language",
3915472,
"sgn",
}
m["pss"] = {
"Kaulong",
3194294,
"poz-ocw",
}
m["psw"] = {
"Port Sandwich",
3398324,
"poz-vnc",
"Latn",
}
m["psy"] = {
"Piscataway",
3504233,
"alg-eas",
}
m["pta"] = {
"Pai Tavytera",
7124619,
"tup-gua",
"Latn",
}
m["pth"] = {
"Pataxó Hã-Ha-Hãe",
7144304,
}
m["pti"] = {
"Pintiini",
10632026,
"aus-pam",
}
m["ptn"] = {
"Patani",
7144242,
"poz-hce",
"Latn",
}
m["pto"] = {
"Zo'é",
8073148,
"tup-gua",
"Latn",
}
m["ptp"] = {
"Patep",
3368679,
"poz-ocw",
"Latn",
}
m["ptq"] = {
"Pattapu",
60785085,
"dra-tam",
}
m["ptr"] = {
"Piamatsina",
7190040,
"poz-vnn",
"Latn",
}
m["ptt"] = {
"Enrekang",
12953520,
nil,
"Latn",
}
m["ptu"] = {
"Bambam",
4853321,
"poz-ssw",
"Latn",
}
m["ptv"] = {
"Port Vato",
3398323,
"poz-vnc",
"Latn",
}
m["ptw"] = {
"Pentlatch",
2069475,
"sal",
"Latn",
}
m["pty"] = {
"Pathiya",
7144790,
"dra-mal",
}
m["pua"] = {
"Purepecha",
16114351,
"qfa-iso",
"Latn",
sort_key = {remove_diacritics = c.acute},
}
m["pub"] = {
"Purum",
6400562,
"tbq-kuk",
"Latn",
}
m["puc"] = {
"Punan Merap",
7259895,
"poz",
"Latn",
}
m["pud"] = {
"Punan Aput",
4782333,
"poz-swa",
"Latn",
}
m["pue"] = {
"Puelche",
33660,
}
m["puf"] = {
"Punan Merah",
7259894,
"poz-swa",
"Latn",
}
m["pug"] = {
"Phuie",
36375,
"nic-gnw",
}
m["pui"] = {
"Puinave",
3027918,
nil,
"Latn",
}
m["puj"] = {
"Punan Tubu",
7259896,
"poz-swa",
"Latn",
}
m["pum"] = {
"Puma",
33736,
"sit-kic",
}
m["puo"] = {
"Puoc",
6440803,
"mkh",
"Latn",
}
m["pup"] = {
"Pulabu",
7259163,
"ngf-rai",
"Latn",
}
m["puq"] = {
"Puquina",
1207739,
}
m["pur"] = {
"Puruborá",
7261619,
"tup",
}
m["put"] = {
"Putoh",
12953832,
"poz-swa",
"Latn",
}
m["puu"] = {
"Punu",
36401,
"bnt-sir",
"Latn",
}
m["puw"] = {
"Puluwat",
36397,
"poz-mic",
"Latn",
}
m["pux"] = {
"Puare",
3507983,
"paa-msk",
"Latn",
}
m["puy"] = {
"Purisimeño",
2967638,
"nai-chu",
"Latn",
}
m["pwa"] = {
"Pawaia",
7156099,
"qfa-dis", -- Papuan; isolate in Glottolog; unclassified in Pawley and Hammarström (2018); sister to the Teberan
-- languages by Usher (2020); tentatively TNG by Ross (2005)
"Latn",
}
m["pwb"] = {
"Panawa",
47385077,
"nic-jer",
"Latn",
ancestors = "jer",
}
m["pwg"] = {
"Gapapaiwa",
3095245,
"poz-ocw",
"Latn",
}
m["pwi"] = {
"Patwin",
3370188,
"nai-wtq",
"Latn",
}
m["pwm"] = {
"Molbog",
6895718,
"poz-san",
"Latn",
}
m["pwn"] = {
"ไปวัน",
715755,
"map",
"Latn",
}
m["pwo"] = {
"กะเหรี่ยงโปตะวันตก",
7988202,
"kar",
"Mymr",
translit = "pwo-translit",
}
m["pwr"] = {
"Powari",
12640277,
"inc-hie",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["pww"] = {
"กะเหรี่ยงโปเหนือ",
7058885,
"kar",
"Thai",
}
m["pxm"] = {
"Quetzaltepec Mixe",
6842374,
"nai-miz",
"Latn",
}
m["pye"] = {
"Pye Krumen",
11157382,
"kro-grb",
}
m["pym"] = {
"Fyam",
3914025,
"nic-ple",
"Latn",
}
m["pyn"] = {
"Poyanáwa",
3401023,
"sai-pan",
}
m["pys"] = {
"Paraguayan Sign Language",
7134698,
"sgn",
}
m["pyu"] = {
"Puyuma",
716690,
"map",
"Latn",
}
m["pyx"] = {
"ปยู",
36259,
"sit",
}
m["pyy"] = {
"Pyen",
7262966,
"tbq-bis",
}
m["pzh"] = {
"Pazeh",
36435,
"map",
"Latn",
}
m["pzn"] = {
"Para Naga",
7133667,
"sit-aao",
}
return require("Module:languages").finalizeData(m, "language")
fm7vg2iafotk4mfy34rrfto7i1wxyqt
มอดูล:languages/data/3/o
828
36372
5720762
5684163
2026-04-21T07:01:08Z
OctraBot
3198
Bot: automated text replacement (-standardChars +standard_chars)
5720762
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["oaa"] = {
"Orok",
33928,
"tuw-nan",
"Cyrl, Latn",
translit = "oaa-translit",
}
m["oac"] = {
"Oroch",
33650,
"tuw-udg",
"Latn, Cyrl",
}
m["oak"] = {
"Noakhali",
107548681,
"inc-bas",
"Beng",
}
m["oav"] = {
"อะวาร์เก่า",
65455879,
"cau-ava",
"Geor",
-- Geor translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission)
}
m["obi"] = {
"Obispeño",
1288385,
"nai-chu",
"Latn",
}
m["obk"] = {
"Southern Bontoc",
63308144,
"phi",
"Latn",
}
m["obl"] = {
"Oblo",
36309,
}
m["obm"] = {
"โมอับ",
36385,
"sem-can",
-- Phnx translit in [[Module:scripts/data]]
}
m["obo"] = {
"Obo Manobo",
12953699,
"mno",
"Latn",
}
m["obr"] = {
"พม่าเก่า",
17006600,
"tbq-brm",
"Mymr, Latn", --and also Pallava
}
m["obt"] = {
"เบรอตงเก่า",
3558112,
"cel-brs",
"Latn",
}
m["obu"] = {
"Obulom",
3813403,
"nic-cde",
"Latn",
}
m["oca"] = {
"Ocaina",
3182577,
"sai-wit",
"Latn",
}
m["och"] = {
"จีนเก่า",
35137,
"zhx",
"Hant",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["oco"] = {
"คอร์นวอลล์เก่า",
48304520,
"cel-brs",
"Latn",
}
m["ocu"] = {
"Tlahuica",
10751739,
"omq",
"Latn",
}
m["oda"] = {
"Odut",
3915388,
"nic-uce",
"Latn",
ancestors = "mfn",
}
m["odk"] = {
"Od",
7077191,
"inc-wes",
"Arab",
}
m["odt"] = {
"ดัตช์เก่า",
443089,
"gmw-frk",
"Latn, Runr",
strip_diacritics = {remove_diacritics = c.circ .. c.macron},
}
m["odu"] = {
"Odual",
3813392,
"nic-cde",
"Latn",
}
m["ofo"] = {
"Ofo",
3349758,
"sio-ohv",
}
m["ofs"] = {
"ฟรีเชียเก่า",
35133,
"gmw-fri",
"Latn",
strip_diacritics = {remove_diacritics = c.circ .. c.macron},
sort_key = {
from = {"æ", "ð", "þ"},
to = {"ae", "t" .. p[1], "t" .. p[2]}
},
}
m["ofu"] = {
"Efutop",
35297,
"nic-eko",
"Latn",
}
m["ogb"] = {
"Ogbia",
3813400,
"nic-cde",
"Latn",
}
m["ogc"] = {
"Ogbah",
36291,
"alv-igb",
"Latn",
}
m["oge"] = {
"จอร์เจียเก่า",
34834,
"ccs-gzn",
"Geor, Geok",
-- Geor, Geok translit in [[Module:scripts/data]]
override_translit = true,
strip_diacritics = {remove_diacritics = c.circ},
}
m["ogg"] = {
"Ogbogolo",
3813405,
"nic-cde",
"Latn",
}
m["ogo"] = {
"Khana",
3914409,
"nic-ogo",
"Latn",
}
m["ogu"] = {
"Ogbronuagum",
3914485,
"nic-cde",
"Latn",
}
m["ohu"] = {
"ฮังการีเก่า",
65455880,
"urj-ugr",
"Latn, Hung",
}
m["oia"] = {
"Oirata",
56738,
"paa-tap",
"Latn",
}
m["oin"] = {
"Inebu One",
12953782,
"paa-tor",
"Latn",
}
m["ojb"] = {
"Northwestern Ojibwa",
7060356,
"alg",
"Latn",
ancestors = "oj",
}
m["ojc"] = {
"Central Ojibwa",
5061548,
"alg",
"Latn",
ancestors = "oj",
}
m["ojg"] = {
"Eastern Ojibwa",
5330342,
"alg",
"Latn",
ancestors = "oj",
}
m["ojp"] = {
"ญี่ปุ่นเก่า",
5736700,
"jpx",
"Jpan",
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["ojs"] = {
"Severn Ojibwa",
56494,
"alg",
"Latn",
ancestors = "oj",
}
m["ojv"] = {
"Ontong Java",
7095071,
"poz-pnp",
"Latn",
}
m["ojw"] = {
"Western Ojibwa",
3474222,
"alg",
"Latn",
ancestors = "oj",
}
m["oka"] = {
"Okanagan",
2984602,
"sal",
"Latn",
}
m["okb"] = {
"Okobo",
3813398,
"nic-lcr",
"Latn",
}
m["okd"] = {
"Okodia",
36300,
"ijo",
"Latn",
}
m["oke"] = {
"Okpe (Southwestern Edo)",
268924,
"alv-swd",
"Latn",
}
m["okg"] = {
"Kok-Paponk",
55254102,
"aus-pmn",
"Latn",
}
m["okh"] = {
"Koresh-e Rostam",
6432160,
"xme-ttc",
ancestors = "xme-ttc-cen",
}
m["oki"] = {
"Okiek",
56367,
"sdv-kln",
"Latn",
}
m["okj"] = {
"Oko-Juwoi",
3436832,
"qfa-adc",
}
m["okk"] = {
"Kwamtim One",
19830649,
"paa-tor",
"Latn",
}
m["okl"] = {
"Old Kentish Sign Language",
7084319,
"sgn",
}
m["okm"] = {
"เกาหลีกลาง",
715339,
"qfa-kor",
"Kore, Latn",
ancestors = "oko",
translit = "okm-translit",
sort_key = "okm-sortkey",
-- Kore strip_diacritics in [[Module:scripts/data]]
}
m["okn"] = {
"โอกิโนเอราบุ",
3350036,
"jpx-nry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["oko"] = {
"เกาหลีเก่า",
715364,
"qfa-kor",
"Kore",
-- Kore strip_diacritics in [[Module:scripts/data]]
}
m["okr"] = {
"Kirike",
11006763,
"ijo",
"Latn",
}
m["oks"] = {
"Oko-Eni-Osayen",
36302,
"alv-von",
"Latn",
}
m["oku"] = {
"Oku",
36289,
"nic-rnc",
"Latn",
}
m["okv"] = {
"Orokaiva",
7103752,
"paa-bin",
"Latn",
}
m["okx"] = {
"Okpe (Northwestern Edo)",
7082547,
"alv-nwd",
"Latn",
}
m["okz"] = {
"เขมรเก่า",
9205,
"mkh-kmr",
"Latn, Khmr", --and also Khom, Pallava
translit = {
Khmr = "Khmr-translit",
},
}
m["old"] = {
"Mochi",
12952852,
"bnt-chg",
"Latn",
}
m["ole"] = {
"Olekha",
3695204,
"sit",
"Tibt, Latn",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["olm"] = {
"Oloma",
3441166,
"alv-nwd",
"Latn",
}
m["olo"] = {
"ลิววี",
36584,
"urj-fin",
"Latn",
}
m["olr"] = {
"Olrat",
3351562,
"poz-vnn",
"Latn",
}
m["olt"] = {
"ลิทัวเนียเก่า",
17417801,
"bat-eas",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.tilde},
}
m["olu"] = {
"Kuvale",
6448765,
"bnt-swb",
"Latn",
}
m["oma"] = {
"Omaha-Ponca",
2917968,
"sio-dhe",
"Latn",
}
m["omb"] = {
"Omba",
2841471,
"poz-vnn",
"Latn",
}
m["omc"] = {
"Mochica",
1951641,
"qfa-iso",
"Latn",
}
m["omg"] = {
"Omagua",
33663,
"tup-gua",
"Latn",
}
m["omi"] = {
"Omi",
56795,
"csu-mma",
}
m["omk"] = {
"Omok",
4334657,
"qfa-yuk",
"Cyrl",
translit = "omk-translit",
}
m["oml"] = {
"Ombo",
7089928,
"bnt-tet",
"Latn",
}
m["omn"] = {
"ไมนอส",
1669994,
"qfa-unc", -- undeciphered
"Lina",
}
m["omo"] = {
"Utarmbung",
7902577,
"ngf-sad",
"Latn",
}
m["omp"] = {
"มณีปุระเก่า",
105953310,
"sit",
"Mtei",
translit = "Mtei-translit",
}
m["omr"] = {
"มราฐีเก่า",
65455881,
"inc-sou",
"Deva, Modi",
translit = {
Deva = "Deva-translit",
Modi = "Modi-translit",
},
}
m["omt"] = {
"Omotik",
36313,
"sdv-nis",
}
m["omu"] = {
"Omurano",
1957612,
}
m["omw"] = {
"South Tairora",
20210553,
"ngf-kag",
"Latn",
}
m["omx"] = {
"มอญเก่า",
111364697,
"mkh-mnc",
"Mymr, Latn", --and also Pallava
}
m["ona"] = {
"Selk'nam",
2721227,
"sai-cho",
"Latn",
}
m["onb"] = {
"เบ",
7093790,
"qfa-onb",
"Latn",
}
m["one"] = {
"Oneida",
857858,
"iro-nor",
"Latn",
}
m["ong"] = {
"Olo",
592162,
"paa-tor",
"Latn",
}
m["oni"] = {
"Onin",
7093910,
"poz-cet",
"Latn",
}
m["onj"] = {
"Onjob",
7093968,
"ngf-dag",
"Latn",
}
m["onk"] = {
"Kabore One",
12953783,
"paa-tor",
"Latn",
}
m["onn"] = {
"Onobasulu",
7094437,
"ngf-bos",
"Latn",
}
m["ono"] = {
"Onondaga",
1077450,
"iro-nor",
"Latn",
ancestors = "iro-oon",
}
m["onp"] = {
"Sartang",
7424639,
"sit-khm",
"Latn, Deva",
}
m["onr"] = {
"Northern One",
19830648,
"paa-tor",
"Latn",
}
m["ons"] = {
"Ono",
11732548,
"ngf-huo",
"Latn",
}
m["ont"] = {
"Ontenu",
3352827,
}
m["onu"] = {
"Unua",
3552042,
"poz-vnc",
"Latn",
}
m["onw"] = {
"นิวเบียเก่า",
2268,
"nub",
"Copt",
translit = "Copt-translit",
sort_key = "Copt-sortkey",
}
m["onx"] = {
"Pidgin Onin",
12953788,
"crp",
"Latn",
ancestors = "oni",
}
m["ood"] = {
"O'odham",
2393095,
"azc-pim",
"Latn",
}
m["oog"] = {
"Ong",
12953787,
"mkh-kat",
}
m["oon"] = {
"Önge",
2475551,
"qfa-ong",
"Latn",
}
m["oor"] = {
"Oorlams",
2484337,
}
m["opa"] = {
"Okpamheri",
3913331,
"alv-nwd",
"Latn",
}
m["opk"] = {
"Kopkaka",
6431129,
"ngf-okk",
"Latn",
}
m["opm"] = {
"Oksapmin",
1068097,
"ngf", -- per Glottolog, in an Ok-Oksapmin family, under Awyu-Ok, under Asmat-Awyu-Ok, but we don't have these
"Latn",
}
m["opo"] = {
"Opao",
7095585,
"paa-wel",
"Latn",
}
m["opt"] = {
"Opata",
2304583,
"azc-trc",
"Latn",
}
m["opy"] = {
"Ofayé",
3446691,
"sai-mje",
"Latn",
}
m["ora"] = {
"Oroha",
36298,
"poz-sls",
"Latn",
}
m["ore"] = {
"Orejón",
3355834,
"sai-tuc",
"Latn",
}
m["org"] = {
"Oring",
3915308,
"nic-ucn",
"Latn",
}
m["orh"] = {
"Oroqen",
1367309,
"tuw-ewe",
"Latn",
}
m["oro"] = {
"Orokolo",
7103758,
"paa-wel",
"Latn",
}
m["orr"] = {
"Oruma",
36299,
"ijo",
"Latn",
}
m["ort"] = {
"Adivasi Odia",
12953791,
"inc-eas",
"Orya",
ancestors = "or",
}
m["oru"] = {
"Ormuri",
33740,
"ira-orp",
"fa-Arab",
}
m["orv"] = {
"สลาวิกตะวันออกเก่า",
35228,
"zle",
"Cyrs",
translit = {Cyrs = "Cyrs-translit"},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["orw"] = {
"Oro Win",
3450423,
"sai-cpc",
"Latn",
}
m["orx"] = {
"Oro",
3813396,
"nic-lcr",
"Latn",
}
m["orz"] = {
"Ormu",
7103494,
"poz-ocw",
"Latn",
}
m["osa"] = {
"Osage",
2600085,
"sio-dhe",
"Latn, Osge",
}
m["osc"] = {
"Oscan",
36653,
"itc-sbl",
"Ital, Latn, Polyt",
display_text = {
Latn = s["itc-Latn-displaytext"],
},
strip_diacritics = {
Latn = s["itc-Latn-stripdiacritics"],
},
sort_key = {
Latn = s["itc-Latn-sortkey"],
},
-- Ital translit in [[Module:scripts/data]]
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["osi"] = {
"โอซิง",
2701322,
"poz",
"Latn",
}
m["osn"] = {
"Old Sundanese",
56197074,
"poz-msa",
"Latn, Sund, Kawi",
}
m["oso"] = {
"Ososo",
3913398,
"alv-yek",
"Latn",
}
m["osp"] = {
"สเปนเก่า",
1088025,
"roa-cas",
"Latn",
}
m["ost"] = {
"Osatu",
36243,
"nic-grs",
"Latn",
}
m["osu"] = {
"Southern One",
12953785,
"paa-tor",
"Latn",
}
m["osx"] = {
"แซกซันเก่า",
35219,
"gmw-lgm",
"Latn",
strip_diacritics = {remove_diacritics = c.circ .. c.macron},
}
m["ota"] = {
"ตุรกีแบบออตโตมัน",
36730,
"trk-ogz",
"ota-Arab, Armn",
ancestors = "trk-oat",
strip_diacritics = {
["ota-Arab"] = {
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {"گ", "ڭ", "ۀ"},
to = {"ك", "ك", "ه"}
},
Armn = {
from = {"՚"},
to = {"’"}
},
},
translit = {Armn = "ota-Armn-translit"},
standard_chars = {
["ota-Arab"] = "آاأبپتثجچحخدذرزژسشصضطظعغفقكلمنوؤهیئةءـ",
c.punc
},
}
m["otb"] = {
"ทิเบตเก่า",
7085214,
"sit-tib",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["otd"] = {
"Ot Danum",
3033781,
"poz-brw",
"Latn",
}
m["ote"] = {
"Mezquital Otomi",
23755711,
"oto-otm",
"Latn",
}
m["oti"] = {
"Oti",
3357881,
}
m["otk"] = {
"เตอร์กิกเก่า",
34988,
"trk-sib",
"Orkh, Sogd",
-- Orkh translit in [[Module:scripts/data]]
}
m["otl"] = {
"Tilapa Otomi",
7802050,
"oto-otm",
"Latn",
}
m["otm"] = {
"Eastern Highland Otomi",
13581718,
"oto-otm",
"Latn",
}
m["otn"] = {
"Tenango Otomi",
25559589,
"oto-otm",
"Latn",
}
m["otq"] = {
"Querétaro Otomi",
23755688,
"oto-otm",
"Latn",
}
m["otr"] = {
"Otoro",
36328,
"alv-hei",
}
m["ots"] = {
"Estado de México Otomi",
7413841,
"oto-otm",
"Latn",
}
m["ott"] = {
"Temoaya Otomi",
7698191,
"oto-otm",
"Latn",
}
m["otu"] = {
"Otuke",
7110049,
"sai-mje",
"Latn",
}
m["otw"] = {
"Ottawa",
133678,
"alg",
"Latn",
ancestors = "oj",
}
m["otx"] = {
"Texcatepec Otomi",
25559590,
"oto-otm",
"Latn",
}
m["oty"] = {
"ทมิฬเก่า",
20987452,
"dra-tam",
"Brah",
-- Brah translit in [[Module:scripts/data]]
}
m["otz"] = {
"Ixtenco Otomi",
6101171,
"oto-otm",
"Latn",
}
m["oub"] = {
"Glio-Oubi",
3914977,
"kro-grb",
}
m["oue"] = {
"Oune",
7110521,
"paa-sbo",
"Latn",
}
m["oui"] = {
"อุยกูร์เก่า",
428299,
"trk-ssb",
"Ougr, Latn, Hani, Phag, Brah, Mani, Syrc, Orkh, Sogd, Arab, mnc-Mong, Sogo, Tibt",
ancestors = "otk",
translit = {
Ougr = "Ougr-translit",
-- Orkh translit in [[Module:scripts/data]]
-- Mani translit in [[Module:scripts/data]]
-- mnc-Mong translit in [[Module:scripts/data]] (NOTE: Formerly not present; I assume accidentally left out)
-- Brah translit in [[Module:scripts/data]]
},
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: Formerly there was only a sort_key for Tibetan. I assume the other three were left out accidentally.
sort_key = {
Hani = "Hani-sortkey",
},
}
m["oum"] = {
"Ouma",
7110494,
"poz-ocw",
"Latn",
}
m["ovd"] = {
"แอลฟ์ดาเลิน", --Älvdalen
254950,
"gmq-eas",
"Latn, Runr",
}
m["owi"] = {
"Owiniga",
56454,
"paa-lem",
"Latn",
}
m["owl"] = {
"เวลส์เก่า",
2266723,
"cel-brw",
"Latn",
}
m["oyb"] = {
"Oy",
13593748,
"mkh-ban",
}
m["oyd"] = {
"Oyda",
7116251,
"omv-nom",
}
m["oym"] = {
"Wayampi",
7975842,
"tup-gua",
"Latn",
}
m["oyy"] = {
"Oya'oya",
7116243,
"poz-ocw",
"Latn",
}
m["ozm"] = {
"Koonzime",
35566,
"bnt-ndb",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
6rb04ncw5aniuger1eohv8hivxlbgv0
มอดูล:languages/data/3/m
828
36374
5720761
5684379
2026-04-21T07:01:05Z
OctraBot
3198
Bot: automated text replacement (-standardChars +standard_chars)
5720761
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["maa"] = {
"San Jerónimo Tecóatl Mazatec",
7692927,
"omq-maz",
"Latn",
}
m["mab"] = {
"Yutanduchi Mixtec",
12645448,
"omq-mxt",
"Latn",
}
m["mad"] = {
"Madurese",
36213,
"poz-msa",
"Latn, Java",
}
m["mae"] = {
"Bo-Rukul",
34967,
"nic-ple",
"Latn",
}
m["maf"] = {
"Mafa",
35819,
"cdc-cbm",
"Latn",
}
m["mag"] = {
"มคหะ", -- Not to be confused with Magadhi Prakrit (pra-mag)
33728,
"inc-bih",
"Deva, Kthi",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
},
}
m["mai"] = {
"ไมถิลี",
36109,
"inc-bih",
"Deva, Tirh, Kthi, Newa",
translit = {
Deva = "Deva-translit",
Tirh = "Tirh-translit",
Kthi = "Kthi-translit",
Newa = "Newa-translit",
},
}
m["maj"] = {
"Jalapa de Díaz Mazatec",
3915999,
"omq-maz",
"Latn",
}
m["mak"] = {
"มากัซซาร์",
33643,
"poz-ssw",
"Latn, Bugi, Maka",
}
m["mam"] = {
"Mam",
33467,
"myn",
"Latn",
}
m["man"] = {
"Mandingo",
35772,
"dmn-man",
"Latn",
}
m["maq"] = {
"Chiquihuitlán Mazatec",
5101757,
"omq-maz",
"Latn",
}
m["mas"] = {
"มาไซ",
35787,
"sdv-lma",
"Latn",
}
m["mat"] = {
"Matlatzinca",
12953704,
"omq",
"Latn",
}
m["mau"] = {
"Huautla Mazatec",
36230,
"omq-maz",
"Latn",
}
m["mav"] = {
"Sateré-Mawé",
6794475,
"tup",
"Latn",
}
m["maw"] = {
"Mampruli",
35804,
"nic-wov",
"Latn",
}
m["max"] = {
"North Moluccan Malay",
7056136,
"crp",
"Latn",
ancestors = "ms",
}
m["maz"] = {
"Central Mazahua",
36228,
"oto",
"Latn",
}
m["mba"] = {
"Higaonon",
5753411,
"mno",
"Latn",
}
m["mbb"] = {
"Western Bukidnon Manobo",
7987643,
"mno",
"Latn",
}
m["mbc"] = {
"Macushi",
56633,
"sai-pem",
"Latn",
}
m["mbd"] = {
"Dibabawon Manobo",
18755523,
"mno",
"Latn",
}
m["mbe"] = {
"Molale",
3319444,
"nai-plp",
"Latn",
}
m["mbf"] = {
"Baba Malay",
18642798,
"crp",
"Latn",
ancestors = "ms",
}
m["mbh"] = {
"Mangseng",
6749147,
"poz-ocw",
"Latn",
}
m["mbi"] = {
"Ilianen Manobo",
14916911,
"mno",
"Latn",
}
m["mbj"] = {
"Nadëb",
3335011,
"sai-nad",
"Latn",
}
m["mbk"] = {
"Malol",
6744477,
"poz-ocw",
"Latn",
}
m["mbl"] = {
"Maxakalí",
3029682,
"sai-mje",
"Latn",
}
m["mbm"] = {
"Ombamba",
36407,
"bnt-mbt",
"Latn",
}
m["mbn"] = {
"Macaguán",
3273980,
"sai-guh",
"Latn",
}
m["mbo"] = { -- like 'bqz', 'bsi' and 'bss', this is a dialect of Manenguba
"Mbo (Cameroon)",
36011,
"bnt-mne",
"Latn",
}
m["mbp"] = {
"Wiwa",
3012604,
"cba",
"Latn",
}
m["mbq"] = {
"Maisin",
3448149,
nil,
"Latn",
}
m["mbr"] = {
"Nukak Makú",
3346228,
"sai-nad",
"Latn",
}
m["mbs"] = {
"Sarangani Manobo",
7423093,
"mno",
"Latn",
}
m["mbt"] = {
"Matigsalug Manobo",
6787447,
"mno",
"Latn",
}
m["mbu"] = {
"Mbula-Bwazza",
3913324,
"nic-jrn",
"Latn",
}
m["mbv"] = {
"Mbulungish",
36003,
"alv-nal",
"Latn",
}
m["mbw"] = {
"Maring",
3293280,
"ngf-chw",
"Latn",
}
m["mbx"] = {
"Sepik Mari",
6760942,
"paa-spk",
"Latn",
}
m["mby"] = {
"Memoni",
4180871,
"inc-snd",
"Gujr, ur-Arab",
}
m["mbz"] = {
"Amoltepec Mixtec",
13583504,
"omq-mxt",
"Latn",
}
m["mca"] = {
"Maca",
3281043,
"sai-mtc",
"Latn",
}
m["mcb"] = {
"Machiguenga",
3915441,
"awd",
"Latn",
}
m["mcc"] = {
"Bitur",
4919173,
"paa-ani",
"Latn",
}
m["mcd"] = {
"Sharanahua",
12953881,
"sai-pan",
"Latn",
}
m["mce"] = {
"Itundujia Mixtec",
12953727,
"omq-mxt",
"Latn",
}
m["mcf"] = {
"Matsés",
2981620,
"sai-pan",
"Latn",
}
m["mcg"] = {
"Mapoyo",
56946,
"sai-map",
"Latn",
}
m["mch"] = {
"Ye'kwana",
3082027,
"sai-car",
"Latn",
sort_key = {
remove_diacritics = "%-%s",
from = {"'", "ñ", "ö", "sh", "ü"},
to = {"’", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1]}
},
}
m["mci"] = {
"Mese",
6821190,
"ngf-huo",
"Latn",
}
m["mcj"] = {
"Mvanip",
3913281,
"nic-mmb",
"Latn",
}
m["mck"] = {
"Mbunda",
34170,
"bnt-clu",
"Latn",
}
m["mcl"] = {
"Macaguaje",
6722435,
"sai-tuc",
"Latn",
}
m["mcm"] = {
"Kristang",
2669169,
"crp",
"Latn",
ancestors = "pt",
}
m["mcn"] = {
"Masana",
56668,
"cdc-mas",
}
m["mco"] = {
"Coatlán Mixe",
25559716,
"nai-miz",
"Latn",
}
m["mcp"] = {
"Makaa",
35803,
"bnt-mka",
}
m["mcq"] = {
"Ese",
5397551,
"ngf-koi",
"Latn",
}
m["mcr"] = {
"Menya",
11732444,
"ngf-ang",
"Latn",
}
m["mcs"] = {
"Mambai",
6748872,
"alv-mbm",
}
m["mcu"] = {
"Cameroon Mambila",
19359039,
"nic-mmb",
"Latn",
}
-- mcv (Minanibai) merged into ffi (Foia Foia) per Glottolog
m["mcw"] = {
"Mawa",
3441333,
"cdc-est",
"Latn",
}
m["mcx"] = {
"Mpiemo",
35908,
"bnt-bek",
}
m["mcy"] = {
"South Watut",
12953293,
"poz-ocw",
"Latn",
}
m["mcz"] = {
"Mawan",
11732429,
"ngf-han",
"Latn",
}
m["mda"] = {
"Mada (Nigeria)",
3915843,
"nic-nin",
"Latn",
}
m["mdb"] = {
"Morigi",
6912195,
"paa-kiw",
"Latn",
}
m["mdc"] = {
"Male",
6742927,
"ngf-min",
"Latn",
}
m["mdd"] = {
"Mbum",
36170,
"alv-mbm",
}
m["mde"] = {
"Bura Mabang",
35860,
"ssa",
"Arab, Latn",
}
m["mdf"] = {
"มอกชา",
13343,
"urj-mdv",
"Cyrl",
translit = "mdf-translit",
strip_diacritics = {remove_diacritics = c.acute},
override_translit = true,
sort_key = "mdf-sortkey",
}
m["mdg"] = {
"Massalat",
759984,
}
m["mdh"] = {
"มากินดาเนา",
33717,
"phi",
"Latn, Arab",
}
m["mdi"] = {
"Mamvu",
3033594,
"csu-mle",
}
m["mdj"] = {
"Mangbetu",
56327,
"csu-maa",
"Latn",
}
m["mdk"] = {
"Mangbutu",
6748877,
"csu-mle",
}
m["mdl"] = {
"Maltese Sign Language",
6744816,
"sgn",
}
m["mdm"] = {
"Mayogo",
6797580,
"nic-nke",
"Latn",
}
m["mdn"] = {
"Mbati",
36165,
"bnt-ngn",
}
m["mdp"] = {
"Mbala",
6799583,
"bnt-pen",
}
m["mdq"] = {
"Mbole",
6799727,
"bnt-mbe",
}
m["mdr"] = {
"Mandar",
35995,
"poz-ssw",
"Bugi, Latn",
}
m["mds"] = {
"Maria",
3448673,
"paa-man",
"Latn",
}
m["mdt"] = {
"Mbere",
36062,
"bnt-mbt",
}
m["mdu"] = {
"Mboko",
36058,
"bnt-mbo",
}
m["mdv"] = {
"Santa Lucía Monteverde Mixtec",
12953722,
"omq-mxt",
"Latn",
}
m["mdw"] = {
"Mbosi",
36035,
"bnt-mbo",
}
m["mdx"] = {
"Dizin",
35313,
"omv-diz",
"Ethi, Latn",
}
m["mdy"] = {
"Maale",
795327,
"omv-ome",
}
m["mdz"] = {
"Suruí Do Pará",
10322149,
"tup-gua",
"Latn",
}
m["mea"] = {
"Menka",
36078,
"nic-grs",
"Latn",
}
m["meb"] = {
"Ikobi-Mena",
11732241,
"paa-tuk",
"Latn",
}
m["mec"] = {
"Mara",
6772774,
}
m["med"] = {
"Melpa",
36166,
"ngf-chw",
"Latn",
}
m["mee"] = {
"Mengen",
3305831,
"poz-ocw",
"Latn",
}
m["mef"] = {
"Megam",
6808589,
}
m["meh"] = {
"Southwestern Tlaxiaco Mixtec",
7070686,
"omq-mxt",
"Latn",
}
m["mei"] = {
"Midob",
36007,
"nub",
"Latn",
}
m["mej"] = {
"Meyah",
11732436,
"paa-ebh",
"Latn",
}
m["mek"] = {
"Mekeo",
3304803,
"poz-ocw",
"Latn",
}
m["mel"] = {
"Central Melanau",
18638319,
"poz-swa",
"Latn",
}
m["mem"] = {
"Mangala",
6748664,
}
m["men"] = {
"Mende",
1478672,
"dmn-msw",
"Latn, Mend",
}
m["meo"] = {
"มลายูแบบเกอดะฮ์",
4925684,
"poz-mly",
"Latn, ms-Arab, Thai",
strip_diacritics = {
from = {u(0xF70F)},
to = {"ญ"}
},
--sort_key = {Thai = "Thai-sortkey"},
}
m["mep"] = {
"Miriwung",
3111847,
"aus-jar",
"Latn",
}
m["meq"] = {
"Merey",
3502314,
"cdc-cbm",
"Latn",
}
m["mer"] = {
"Meru",
13313,
"bnt-kka",
"Latn",
}
m["mes"] = {
"Masmaje",
3440448,
}
m["met"] = {
"Mato",
3299190,
"poz-ocw",
"Latn",
}
m["meu"] = {
"Motu",
33516,
"poz-ocw",
"Latn",
}
m["mev"] = {
"Mano",
3913286,
"dmn-mda",
"Latn",
}
m["mew"] = {
"Maaka",
3438764,
"cdc-wst",
"Latn",
}
m["mey"] = {
"Hassaniya Arabic",
56231,
"sem-arb",
"Arab",
}
m["mez"] = {
"Menominee",
13363,
"alg",
"Latn",
sort_key = {remove_diacritics = "·"},
}
m["mfa"] = {
"มลายูแบบปัตตานี",
1199751,
"poz-mly",
"Latn, ms-Arab, Thai",
strip_diacritics = {
from = {u(0xF70F)},
to = {"ญ"}
},
sort_key = {remove_diacritics = "'"}, -- only for thwikt
}
m["mfb"] = {
"Bangka",
3258818,
"poz-mly",
"Latn, Arab",
}
m["mfc"] = {
"Mba",
4286464,
"nic-mbc",
"Latn",
}
m["mfd"] = {
"Mendankwe-Nkwen",
11129537,
"nic-nge",
"Latn",
}
m["mfe"] = {
"ครีโอลมอริเชียส",
33661,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["mff"] = {
"Naki",
36083,
"nic-bbe",
"Latn",
}
m["mfg"] = {
"Mixifore",
3914478,
"dmn-mok",
}
m["mfh"] = {
"Matal",
3501751,
"cdc-cbm",
"Latn",
}
m["mfi"] = {
"Wandala",
3441249,
"cdc-cbm",
"Latn",
}
m["mfj"] = {
"Mefele",
3501871,
"cdc-cbm",
}
m["mfk"] = {
"North Mofu",
56303,
"cdc-cbm",
"Latn",
}
m["mfl"] = {
"Putai",
56291,
}
m["mfm"] = {
"Marghi South",
56248,
}
m["mfn"] = {
"Cross River Mbembe",
3915395,
"nic-uce",
"Latn",
}
m["mfo"] = {
"Mbe",
36075,
"nic-eko",
"Latn",
}
m["mfp"] = {
"Makassar Malay",
12952776,
"qfa-mix",
"Latn",
ancestors = "ms, mak",
}
m["mfq"] = {
"Moba",
19921578,
"nic-grm",
"Latn",
}
m["mfr"] = {
"Marrithiyel",
6773014,
"aus-dal",
"Latn",
}
m["mfs"] = {
"Mexican Sign Language",
3915511,
"sgn",
"Latn", -- when documented
}
m["mft"] = {
"Mokerang",
3319387,
"poz-aay",
"Latn",
}
m["mfu"] = {
"Mbwela",
11004988,
"bnt-clu",
ancestors = "lch",
}
m["mfv"] = {
"Mandjak",
35822,
"alv-pap",
}
m["mfw"] = {
"Mulaha",
6933720,
"paa-kwa",
"Latn",
}
m["mfx"] = {
"Melo",
6813268,
"omv-nom",
}
m["mfy"] = {
"Mayo",
56729,
"azc-trc",
"Latn",
sort_key = {remove_diacritics = c.acute},
}
m["mfz"] = {
"Mabaan",
20526385,
"sdv",
"Latn",
}
m["mga"] = {
"ไอริชกลาง",
36116,
"cel-gae",
"Latn",
ancestors = "sga",
strip_diacritics = {remove_diacritics = c.dotabove .. c.diaer .. "·"},
sort_key = "mga-sortkey",
}
m["mgb"] = {
"Mararit",
56359,
"sdv-tmn",
}
m["mgc"] = {
"Morokodo",
6913216,
"csu-bbk",
"Latn",
}
m["mgd"] = {
"Moru",
6915014,
"csu-mma",
"Latn, Arab",
}
m["mge"] = {
"Mango",
713659,
"csu-sar",
"Latn",
}
m["mgf"] = {
"Maklew",
6739816,
"paa-bul",
"Latn",
}
m["mgg"] = {
"Mpongmpong",
35924,
"bnt-bek",
}
m["mgh"] = {
"Makhuwa-Meetto",
33604,
"bnt-mak",
"Latn",
ancestors = "vmw",
}
m["mgi"] = {
"Jili",
3914497,
"nic-pls",
}
m["mgj"] = {
"Abureni",
3441256,
"nic-cde",
"Latn",
}
m["mgk"] = {
"Mawes",
6794395,
"qfa-dis", -- Papuan; isolate in Glottolog, Foley (2018) and Hammarström (2010); in the Tor-Kwerba languages per
-- Usher (2020)
"Latn",
}
m["mgl"] = {
"Maleu-Kilenge",
3281884,
}
m["mgm"] = {
"Mambae",
35774,
"poz-tim",
"Latn",
}
m["mgn"] = {
"Mbangi",
11017443,
"nic-ngd",
"Latn",
}
m["mgo"] = {
"Meta'",
36054,
"nic-mom",
"Latn",
}
m["mgp"] = {
"Eastern Magar",
12952758,
"sit-gma",
"Deva, Latn",
}
m["mgq"] = {
"Malila",
6743679,
"bnt-mby",
"Latn",
}
m["mgr"] = {
"Mambwe-Lungu",
626210,
"bnt-mwi",
"Latn",
}
m["mgs"] = {
"Manda (Tanzania)",
16939267,
"bnt-bki",
}
m["mgt"] = {
"Mongol",
11260674,
"paa-wke",
"Latn",
}
m["mgu"] = {
"Mailu",
3278246,
"paa-mal",
"Latn",
}
m["mgv"] = {
"Matengo",
6786446,
"bnt-mbi",
"Latn",
}
m["mgw"] = {
"Matumbi",
6791974,
"bnt-mbi",
"Latn",
}
m["mgy"] = {
"Mbunga",
6799817,
"bnt-kil",
}
m["mgz"] = {
"Mbugwe",
3426367,
"bnt-mra",
}
m["mha"] = {
"Manda (India)",
56760,
"dra-kki",
"Orya",
translit = "Orya-translit",
}
m["mhb"] = {
"Mahongwe",
35816,
"bnt-kel",
}
m["mhc"] = {
"Mocho",
1941682,
"myn",
}
m["mhd"] = {
"Mbugu",
36152,
"qfa-mix",
"Latn",
ancestors = "asa",
}
m["mhe"] = {
"Besisi",
2742262,
"mkh-asl",
"Latn",
}
m["mhf"] = {
"Mamaa",
6745346,
"ngf-fin",
"Latn",
}
m["mhg"] = {
"Marrgu",
6772812,
}
m["mhi"] = {
"Ma'di",
56670,
"csu-mma",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.grave .. c.tilde .. c.dotbelow},
}
m["mhj"] = {
"Mogholi",
13336,
"xgn",
"fa-Arab, Latn",
translit = "fa-cls-translit",
strip_diacritics = {
["fa-Arab"] = "ar-stripdiacritics",
},
}
m["mhk"] = {
"Mungaka",
36068,
"nic-nun",
}
m["mhl"] = {
"Mauwake",
6794095,
"ngf-kum",
"Latn",
}
m["mhm"] = {
"Makhuwa-Moniga",
6900145,
"bnt-mak",
}
m["mhn"] = {
"โมเชโน",
268130,
"gmw-hgm",
"Latn",
ancestors = "bar",
sort_key = {remove_diacritics = c.grave},
}
m["mho"] = {
"Mashi",
10962737,
"bnt-kav",
"Latn",
}
m["mhp"] = {
"Balinese Malay",
12473441,
"crp",
"Latn, Bali, ms-Arab",
}
m["mhq"] = {
"Mandan",
1957120,
"sio",
"Latn",
}
m["mhr"] = {
"Eastern Mari",
3906614,
"chm",
"Cyrl",
translit = "chm-translit",
override_translit = true,
strip_diacritics = {remove_diacritics = c.grave .. c.acute},
sort_key = {
from = {"ё", "ҥ", "ӧ", "ӱ"},
to = {"е" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]}
}
}
m["mhs"] = {
"Buru (Indonesia)",
2928650,
"poz-cma",
"Latn",
}
m["mht"] = {
"Mandahuaca",
6747924,
"awd-nwk",
}
m["mhu"] = {
"Taraon",
56400,
"sit-gsi",
"Latn",
}
m["mhw"] = {
"Mbukushu",
2691548,
"bnt",
"Latn",
}
m["mhx"] = {
"Lhao Vo",
11149315,
"tbq-brm",
"Latn",
}
m["mhy"] = {
"Ma'anyan",
2328761,
"poz-bre",
"Latn",
}
m["mhz"] = {
"Mor (Austronesian)",
2122792,
"poz-hce",
"Latn",
}
m["mia"] = {
"Miami",
56523,
"alg",
"Latn",
}
m["mib"] = {
"Atatláhuca Mixtec",
32093046,
"omq-mxt",
"Latn",
}
m["mic"] = {
"Mi'kmaq",
13321,
"alg-eas",
"Latn",
}
m["mid"] = {
"Mandaic",
6991742,
"sem-ase",
"Mand",
ancestors = "myz",
translit = {
Mand = "Mand-translit",
},
strip_diacritics = {
Mand = "Mand-stripdiacritics",
}
}
m["mie"] = {
"Ocotepec Mixtec",
25559575,
"omq-mxt",
"Latn",
}
m["mif"] = {
"Mofu-Gudur",
1365132,
"cdc-cbm",
"Latn",
}
m["mig"] = {
"San Miguel el Grande Mixtec",
12953719,
"omq-mxt",
"Latn",
}
m["mih"] = {
"Chayuco Mixtec",
13583510,
"omq-mxt",
"Latn",
}
m["mii"] = {
"Chigmecatitlán Mixtec",
12953724,
"omq-mxt",
"Latn",
}
m["mij"] = {
"Mungbam",
34725,
"nic-beb",
"Latn",
}
m["mik"] = {
"Mikasuki",
13316,
"nai-mus",
"Latn",
}
m["mil"] = {
"Peñoles Mixtec",
42411307,
"omq-mxt",
"Latn",
}
m["mim"] = {
"Alacatlatzala Mixtec",
14697894,
"omq-mxt",
"Latn",
}
m["min"] = {
"มีนังกาเบา",
13324,
"poz-mly",
"Latn, Arab",
}
m["mio"] = {
"Pinotepa Nacional Mixtec",
7196415,
"omq-mxt",
"Latn",
}
m["mip"] = {
"Apasco-Apoala Mixtec",
13583505,
"omq-mxt",
"Latn",
}
m["miq"] = {
"Miskito",
1516803,
"nai-min",
"Latn",
strip_diacritics = {remove_diacritics = c.circ},
}
m["mir"] = {
"Isthmus Mixe",
6088873,
"nai-miz",
"Latn",
}
m["mit"] = {
"Southern Puebla Mixtec",
7570345,
"omq-mxt",
"Latn",
}
m["miu"] = {
"Cacaloxtepec Mixtec",
12953723,
"omq-mxt",
"Latn",
}
m["miw"] = {
"Akoye",
3327462,
"ngf-ang",
"Latn",
}
m["mix"] = {
"Mixtepec Mixtec",
6884125,
"omq-mxt",
"Latn",
}
m["miy"] = {
"Ayutla Mixtec",
13583508,
"omq-mxt",
"Latn",
}
m["miz"] = {
"Coatzospan Mixtec",
3317290,
"omq-mxt",
"Latn",
}
m["mjb"] = {
"Makalero",
35729,
"paa-tap",
"Latn",
}
m["mjc"] = {
"San Juan Colorado Mixtec",
12953718,
"omq-mxt",
"Latn",
}
m["mjd"] = {
"Northwest Maidu",
3198700,
"nai-mdu",
"Latn",
}
m["mje"] = {
"Muskum",
3913334,
}
-- mjg "Monguor" is not recognized as a language, but it is a family code
m["mji"] = {
"Kim Mun",
1115317,
"hmx-mie",
"Latn",
}
m["mjj"] = {
"Mawak",
11732427,
"ngf-tib",
"Latn",
}
m["mjk"] = {
"Matukar",
6791963,
"poz-ocw",
"Latn",
}
m["mjl"] = {
"Mandeali",
6747931,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
Takr = "Takr-translit",
},
}
m["mjm"] = {
"Medebur",
6805227,
"poz-ocw",
"Latn",
}
m["mjn"] = {
"Mebu",
6804364,
"ngf-fin",
"Latn",
}
m["mjo"] = {
"Malankuravan",
14916887,
"dra-mal",
}
m["mjp"] = {
"Malapandaram",
10575729,
"dra-tam",
}
m["mjq"] = {
"Malaryan",
12952773,
"dra-mal",
}
m["mjr"] = {
"Malavedan",
12952775,
"dra-mal",
"Mlym",
-- Mlym translit in [[Module:scripts/data]]
}
m["mjs"] = {
"Miship",
3441264,
"cdc-wst",
"Latn",
}
m["mjt"] = {
"Sawriya Paharia",
33907,
"dra-mlo",
"Beng, Deva",
translit = {
Beng = "Beng-translit",
Deva = "Deva-translit",
},
}
m["mju"] = {
"Manna-Dora",
10576453,
"dra-tel",
}
m["mjv"] = {
"Mannan",
3286037,
"dra-tam",
"Mlym, Taml",
translit = {
Taml = "Taml-translit",
},
-- Mlym translit in [[Module:scripts/data]]
}
m["mjw"] = {
"Karbi",
56591,
"tbq-kuk",
"Latn",
}
m["mjx"] = {
"Mahali",
12953686,
"mun",
}
m["mjy"] = {
"Mahican",
3182562,
"alg-eas",
"Latn",
}
m["mjz"] = {
"Majhi",
6737786,
"inc-bih",
}
m["mka"] = {
"Mbre",
3450154,
"nic", -- unclassified within Niger-Congo
}
m["mkb"] = {
"Mal Paharia",
6583595,
"inc-eas",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["mkc"] = {
"Siliput",
7515090,
"paa-tor",
"Latn",
}
m["mke"] = {
"Mawchi",
21403317,
}
m["mkf"] = {
"Miya",
43328,
"cdc-wst",
"Latn",
}
m["mkg"] = {
"Mak (China)",
3280623,
"qfa-kms",
}
m["mki"] = {
"Dhatki",
32480,
"raj",
"Deva, Mahj, Arab",
}
m["mkj"] = {
"โมกิล",
2335528,
"poz-mic",
"Latn",
}
m["mkk"] = {
"Byep",
35052,
"bnt-mka",
}
m["mkl"] = {
"Mokole",
36047,
"alv-yor",
"Latn",
}
m["mkm"] = {
"Moklen",
3319380,
}
m["mkn"] = {
"Kupang Malay",
18458203,
"crp",
"Latn",
}
m["mko"] = {
"Mingang Doso",
3915382,
"alv-bwj",
}
m["mkp"] = {
"Moikodi",
6894594,
"ngf-yar",
"Latn",
}
m["mkq"] = {
"Bay Miwok",
3460957,
"nai-utn",
"Latn",
}
m["mkr"] = {
"Malas",
11732402,
"ngf-nad",
"Latn",
}
m["mks"] = {
"Silacayoapan Mixtec",
7514027,
"omq-mxt",
"Latn",
}
m["mkt"] = {
"Vamale",
14916907,
"poz-cln",
"Latn",
}
m["mku"] = {
"Konyanka Maninka",
11163298,
"dmn-mnk",
}
m["mkv"] = {
"Mav̋ea",
3073532,
"poz-vnn",
"Latn",
}
m["mkx"] = {
"Cinamiguin Manobo",
12953697,
"mno",
"Latn",
}
m["mky"] = {
"Taba",
3512690,
"poz-hce",
"Latn",
}
m["mkz"] = {
"Makasae",
35782,
"paa-tap",
"Latn",
}
m["mla"] = {
"Tamambo",
1153276,
"poz-vnn",
"Latn",
}
m["mlb"] = {
"Mbule",
35843,
"nic-ymb",
"Latn",
}
m["mlc"] = {
"Caolan",
3446682,
"tai-cho",
"Latn, Hani",
sort_key = {Hani = "Hani-sortkey"},
}
m["mle"] = {
"Manambu",
11732406,
"paa-ndu",
"Latn",
}
m["mlf"] = {
"มัล",
3281057,
"mkh-khm",
}
m["mlh"] = {
"Mape",
6753787,
"ngf-huo",
"Latn",
}
m["mli"] = {
"Malimpung",
12473435,
}
m["mlj"] = {
"Miltu",
3441310,
}
m["mlk"] = {
"Ilwana",
6001357,
"bnt-sab",
}
m["mll"] = {
"Malua Bay",
6744946,
"poz-vnc",
"Latn",
}
m["mlm"] = {
"Mulam",
3092284,
"qfa-kms",
"Latn",
}
m["mln"] = {
"Malango",
3281522,
"poz-sls",
"Latn",
}
m["mlo"] = {
"Mlomp",
36009,
"alv-bak",
}
m["mlp"] = {
"Bargam",
4860543,
"ngf-mad",
"Latn",
}
m["mlq"] = {
"Western Maninkakan",
11028033,
"dmn-wmn",
}
m["mlr"] = {
"Vame",
3515088,
"cdc-cbm",
"Latn",
}
m["mls"] = {
"Masalit",
56557,
"ssa",
}
m["mlu"] = {
"To'abaita",
36645,
"poz-sls",
"Latn",
}
m["mlv"] = {
"Mwotlap",
2475538,
"poz-vnn",
"Latn",
}
m["mlw"] = {
"Moloko",
1965222,
"cdc-cbm",
"Latn",
}
m["mlx"] = {
"Malfaxal",
2157421,
"poz-vnc",
"Latn",
}
m["mlz"] = {
"Malaynon",
18755512,
"phi",
}
m["mma"] = {
"Mama",
3913963,
"nic-jrn",
}
m["mmb"] = {
"Momina",
6897297,
}
m["mmc"] = {
"Michoacán Mazahua",
12953705,
"oto",
"Latn",
}
m["mmd"] = {
"Maonan",
3092293,
"qfa-kms",
"Latn",
}
m["mme"] = {
"Tirax",
3276286,
"poz-vnc",
"Latn",
}
m["mmf"] = {
"Mundat",
56263,
"cdc-wst",
"Latn",
}
m["mmg"] = {
"North Ambrym",
2842468,
"poz-vnc",
"Latn",
}
m["mmh"] = {
"Mehináku",
3501838,
"awd",
"Latn",
}
m["mmi"] = {
"Musar",
6940113,
"ngf-tib",
"Latn",
}
m["mmj"] = {
"Majhwar",
6737795,
}
m["mmk"] = {
"Mukha-Dora",
6933447,
}
m["mml"] = {
"Man Met",
3194984,
"mkh-pal",
}
m["mmm"] = {
"Maii",
6735599,
"poz-vnc",
"Latn",
}
m["mmn"] = {
"Mamanwa",
3206623,
"phi",
"Latn",
}
m["mmo"] = {
"Mangga Buang",
12952294,
"poz-ocw",
"Latn",
}
m["mmp"] = {
"Musan",
2605703,
"paa-amu",
"Latn",
}
m["mmq"] = {
"Aisi",
6940074,
"ngf-ais",
"Latn",
}
m["mmr"] = {
"Western Xiangxi Miao",
3307901,
"hmn",
"Latn",
}
m["mmt"] = {
"Malalamai",
3281496,
"poz-ocw",
"Latn",
}
m["mmu"] = {
"Mmaala",
13123461,
"nic-ymb",
"Latn",
}
m["mmv"] = {
"Miriti",
6873567,
"sai-tuc",
"Latn",
}
m["mmw"] = {
"Emae",
3051961,
"poz-pnp",
"Latn",
}
m["mmx"] = {
"Madak",
3275205,
"poz-ocw",
"Latn",
}
m["mmy"] = {
"Migaama",
56259,
"cdc-est",
"Latn",
}
m["mmz"] = {
"Mabaale",
11003249,
"bnt-ngn",
}
m["mna"] = {
"Mbula",
3303572,
"poz-ocw",
"Latn",
}
m["mnb"] = {
"Muna",
6935584,
"poz-mun",
"Latn",
}
m["mnc"] = {
"แมนจู",
33638,
"tuw-jrc",
"mnc-Mong, Latn",
ancestors = "juc",
-- mnc-Mong translit in [[Module:scripts/data]]
}
m["mnd"] = {
"Mondé",
6898840,
"tup",
"Latn",
}
m["mne"] = {
"Naba",
760732,
"csu-bgr",
}
m["mnf"] = {
"Mundani",
35839,
"nic-mom",
"Latn",
}
m["mng"] = {
"Eastern Mnong",
12953747,
"mkh-ban",
"Latn, Khmr",
}
m["mnh"] = {
"Mono (Congo)",
33501,
"bad-cnt",
"Latn",
}
m["mni"] = {
"มณีปุระ",
33868,
"sit",
"Mtei, Beng",
ancestors = "omp",
translit = {Mtei = "Mtei-translit"},
}
m["mnj"] = {
"Munji",
33639,
"ira-mny",
"Arab",
}
m["mnk"] = {
"Mandinka",
33678,
"dmn-wmn",
"Latn, Arab, Nkoo",
}
m["mnl"] = {
"Tiale",
6744350,
"poz-vnn",
"Latn",
}
m["mnm"] = {
"Mapena",
11732415,
"ngf-dag",
"Latn",
}
m["mnn"] = {
"มนองใต้",
23857582,
"mkh-ban",
}
m["mnp"] = {
"หมิ่นเหนือ",
36457,
"zhx-inm",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["mnq"] = {
"Minriq",
2742268,
"mkh-asl",
"Latn",
}
m["mnr"] = {
"Mono (California)",
33591,
"azc-num",
"Latn",
}
m["mnt"] = {
"Maykulan",
3915696,
"aus-pam",
"Latn",
}
m["mnu"] = {
"Mer",
6817854,
"paa-mai",
"Latn",
}
m["mnv"] = {
"Rennellese",
3397346,
"poz-pnp",
"Latn",
}
m["mnw"] = {
"มอญ",
13349,
"mkh-mnc",
"Mymr",
ancestors = "mkh-mmn",
translit = "mnw-translit",
sort_key = {
from = {"ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ"},
to = {"္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ"}
},
}
m["mnx"] = {
"Manikion",
3507964,
"paa-ebh",
"Latn",
}
m["mny"] = {
"Manyawa",
11002622,
"bnt-mak",
ancestors = "vmw",
}
m["mnz"] = {
"Moni",
6899857,
"ngf-pan",
"Latn",
}
m["moa"] = {
"Mwan",
3320111,
"dmn-nbe",
"Latn",
}
m["moc"] = {
"Mocoví",
3027906,
"sai-guc",
"Latn",
}
m["mod"] = {
"Mobilian",
13333,
"crp",
"Latn",
ancestors = "cho, cic",
}
m["moe"] = {
"มงตาแญ",
13351,
"alg",
"Latn",
ancestors = "cr",
strip_diacritics = {remove_diacritics = c.macron},
}
m["mog"] = {
"Mongondow",
3058458,
"phi",
"Latn",
}
m["moh"] = {
"Mohawk",
13339,
"iro-nor",
"Latn",
ancestors = "iro-omo",
}
m["moi"] = {
"Mboi",
3914417,
"alv-yun",
}
m["moj"] = {
"Monzombo",
11154772,
"nic-nkk",
"Latn",
}
m["mok"] = {
"Morori",
6913275,
}
m["mom"] = {
"Monimbo",
56542,
}
m["moo"] = {
"Monom",
6901726,
"mkh-ban",
}
m["mop"] = {
"Mopan Maya",
36183,
"myn",
"Latn",
}
m["moq"] = {
"Mor (Papuan)",
11732468,
"qfa-dis", -- Papuan; isolate in Glottolog and Palmer (2018); top-level TNG in Ross (2005), in Berau Gulf (under
-- TNG) in Usher (2020)
}
m["mor"] = {
"Moro",
36172,
"alv-hei",
"Latn",
}
m["mos"] = {
"Moore",
36096,
"nic-mre",
"Latn",
}
m["mot"] = {
"Barí",
2886281,
"cba",
"Latn",
}
m["mou"] = {
"Mogum",
3440473,
"cdc-est",
"Latn",
}
m["mov"] = {
"Mojave",
56510,
"nai-yuc",
"Latn",
}
m["mow"] = {
"Moi (Congo)",
11124792,
"bnt-bmo",
"Latn",
}
m["mox"] = {
"Molima",
3319495,
"poz-ocw",
"Latn",
}
m["moy"] = {
"Shekkacho",
56827,
"omv-gon",
}
m["moz"] = {
"Mukulu",
3440403,
"cdc-est",
}
m["mpa"] = {
"Mpoto",
6928303,
"bnt-mbi",
"Latn",
}
m["mpb"] = {
"Mullukmulluk",
6741120,
}
m["mpc"] = {
"Mangarayi",
6748829,
}
m["mpd"] = {
"Machinere",
12953681,
"awd",
"Latn",
}
m["mpe"] = {
"Majang",
56724,
"sdv",
}
m["mpg"] = {
"Marba",
56614,
"cdc-mas",
}
m["mph"] = {
"Maung",
6792550,
"aus-wdj",
"Latn",
}
m["mpi"] = {
"Mpade",
3280670,
"cdc-cbm",
"Latn",
}
m["mpj"] = {
"Martu Wangka",
3295916,
"aus-pam",
"Latn",
}
m["mpk"] = {
"Mbara (Chad)",
3912770,
"cdc-cbm",
}
m["mpl"] = {
"Middle Watut",
15887910,
"poz-ocw",
"Latn",
}
m["mpm"] = {
"Yosondúa Mixtec",
12953741,
"omq-mxt",
"Latn",
}
m["mpn"] = {
"Mindiri",
6863842,
"poz-ocw",
"Latn",
}
m["mpo"] = {
"Miu",
6883668,
"poz-ocw",
"Latn",
}
m["mpp"] = {
"Migabac",
11732448,
"ngf-huo",
"Latn",
}
m["mpq"] = {
"Matís",
3299145,
"sai-pan",
"Latn",
}
m["mpr"] = {
"Vangunu",
3554582,
"poz-ocw",
"Latn",
}
m["mps"] = {
"Dadibi",
5208077,
"paa-teb",
"Latn",
}
m["mpt"] = {
"Mian",
12952846,
"ngf-okk",
"Latn",
}
m["mpu"] = {
"Makuráp",
3281037,
"tup",
"Latn",
}
m["mpv"] = {
"Mungkip",
11732485,
"ngf-fin",
"Latn",
}
m["mpw"] = {
"Mapidian",
6753812,
"awd",
"Latn",
}
m["mpx"] = {
"Misima-Paneati",
6875666,
"poz-ocw",
"Latn",
}
m["mpy"] = {
"Mapia",
3287224,
"poz-mic",
"Latn",
}
m["mpz"] = {
"Mpi",
6928276,
"tbq-bka",
}
m["mqa"] = {
"Maba",
3273750,
}
m["mqb"] = {
"Mbuko",
3502213,
"cdc-cbm",
"Latn",
}
m["mqc"] = {
"Mangole",
6749097,
"poz-cma",
"Latn",
}
m["mqe"] = {
"Matepi",
11732426,
"ngf-han",
"Latn",
}
m["mqf"] = {
"Momuna",
6897518,
}
m["mqg"] = {
"Kota Bangun Kutai Malay",
12952778,
}
m["mqh"] = {
"Tlazoyaltepec Mixtec",
12953740,
"omq-mxt",
"Latn",
}
m["mqi"] = {
"Mariri",
6765544,
}
m["mqj"] = {
"Mamasa",
6745452,
"poz-ssw",
"Latn",
}
m["mqk"] = {
"Rajah Kabunsuwan Manobo",
12953700,
"mno",
}
m["mql"] = {
"Mbelime",
4286473,
"nic-eov",
"Latn",
}
m["mqm"] = {
"South Marquesan",
19694214,
"poz-pep",
"Latn",
}
m["mqn"] = {
"Moronene",
642581,
"poz-btk",
"Latn",
}
m["mqo"] = {
"Modole",
11732457,
"paa-nha",
"Latn",
}
m["mqp"] = {
"Manipa",
6749799,
"poz-cma",
"Latn",
}
m["mqq"] = {
"Minokok",
18642293,
"poz-san",
"Latn",
}
m["mqr"] = {
"Mander",
6747979,
"paa-tkw",
}
m["mqs"] = {
"West Makian",
3033575,
"paa-nha",
"Latn",
}
m["mqt"] = {
"Mok",
13018559,
"mkh-pal",
}
m["mqu"] = {
"Mandari",
3285426,
"sdv-bri",
}
m["mqv"] = {
"Mosimo",
11732478,
"ngf-nwh",
"Latn",
}
m["mqw"] = {
"Murupi",
11732486,
"ngf-nwh",
"Latn",
}
m["mqx"] = {
"Mamuju",
6746004,
"poz-ssw",
"Latn",
}
m["mqy"] = {
"Manggarai",
3285748,
"poz-cet",
"Latn",
}
m["mqz"] = {
"Malasanga",
14916889,
"poz-ocw",
"Latn",
}
m["mra"] = {
"Mlabri",
3073465,
"mkh",
}
m["mrb"] = {
"Sungwadia",
3293299,
"poz-vnn",
"Latn",
}
m["mrc"] = {
"Maricopa",
56386,
"nai-yuc",
"Latn",
}
m["mrd"] = {
"Western Magar",
22303263,
"sit-gma",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["mre"] = {
"Martha's Vineyard Sign Language",
33494,
"sgn",
"Latn, Sgnw",
}
m["mrf"] = {
"Elseng",
3915667,
"qfa-unc", -- "Border or language isolate"; unclassifiable due to paucity of data
"Latn",
}
m["mrg"] = {
"Mising",
3316328,
"sit-tan",
"Latn, Beng, Deva",
ancestors = "adi",
translit = {
Beng = "Beng-translit",
Deva = "Deva-translit",
},
}
m["mrh"] = {
"Mara Chin",
4175893,
"tbq-kuk",
"Latn",
}
m["mrj"] = {
"Western Mari",
1776032,
"chm",
"Cyrl",
translit = "chm-translit",
sort_key = "mrj-sortkey",
}
m["mrk"] = {
"Hmwaveke",
5873712,
"poz-cln",
"Latn",
}
m["mrl"] = {
"Mortlockese",
3324598,
"poz-mic",
"Latn",
}
m["mrm"] = {
"Mwerlap",
3331115,
"poz-vnn",
"Latn",
}
m["mrn"] = {
"Cheke Holo",
2962165,
"poz-ocw",
"Latn",
}
m["mro"] = {
"Mru",
1951521,
"sit-mru",
"Latn, Mroo",
}
m["mrp"] = {
"Morouas",
6913299,
"poz-vnn",
"Latn",
}
m["mrq"] = {
"North Marquesan",
2603808,
"poz-pep",
"Latn",
}
m["mrr"] = {
"Hill Maria",
27602,
"dra-mdy",
"Deva",
}
m["mrs"] = {
"Maragus",
6754640,
"poz-vnc",
"Latn",
}
m["mrt"] = {
"Margi",
56241,
"cdc-cbm",
"Latn",
}
m["mru"] = {
"Mono (Cameroon)",
11031964,
"alv-mbm",
"Latn",
}
m["mrv"] = {
"Mangarevan",
36237,
"poz-pep",
"Latn",
}
m["mrw"] = {
"มาราเนา",
33800,
"phi",
"Latn, Arab",
}
m["mrx"] = {
"Dineor",
5278044,
"paa-tkw",
"Latn",
}
m["mry"] = {
"Karaga Mandaya",
6747925,
"phi",
}
m["mrz"] = {
"Marind",
6763970,
"paa-ani",
"Latn",
}
m["msb"] = {
"มัสบาเต",
33948,
"phi",
"Latn",
}
m["msc"] = {
"Sankaran Maninka",
11155812,
"dmn-mnk",
}
m["msd"] = {
"Yucatec Maya Sign Language",
34281,
"sgn",
"Latn", -- when documented
}
m["mse"] = {
"Musey",
56328,
"cdc-mas",
}
m["msf"] = {
"Mekwei",
4544752,
"paa-nim",
"Latn",
}
m["msg"] = {
"Moraid",
6909020,
"paa-wbh",
"Latn",
}
m["msi"] = {
"Sabah Malay",
10867404,
"crp",
"Latn, Arab",
}
m["msj"] = {
"Ma",
6720909,
"nic-mbc",
"Latn",
}
m["msk"] = {
"Mansaka",
12952800,
"phi",
"Latn",
}
m["msl"] = {
"Molof",
4300950,
}
m["msm"] = {
"Agusan Manobo",
12953696,
"mno",
"Latn",
}
m["msn"] = {
"Vurës",
3563857,
"poz-vnn",
"Latn",
}
m["mso"] = {
"Mombum",
6897079,
"ngf-mom",
"Latn",
}
m["msp"] = {
"Maritsauá",
6765915,
"tup",
"Latn",
}
m["msq"] = {
"Caac",
2932212,
"poz-cln",
"Latn",
}
m["msr"] = {
"Mongolian Sign Language",
3915499,
"sgn",
}
m["mss"] = {
"West Masela",
12952816,
"poz-tim",
}
m["msu"] = {
"Musom",
6943041,
"poz-ocw",
"Latn",
}
m["msv"] = {
"Maslam",
3502273,
}
m["msw"] = {
"Mansoanka",
35814,
}
m["msx"] = {
"Moresada",
11732475,
"ngf-pom",
"Latn",
}
m["msy"] = {
"Aruamu",
3501809,
"paa-ram",
"Latn",
}
m["msz"] = {
"Momare",
6897030,
"ngf-huo",
"Latn",
}
m["mta"] = {
"Cotabato Manobo",
12953698,
"mno",
"Latn",
}
m["mtb"] = {
"Anyin Morofo",
3502338,
"alv-ctn",
"Latn",
ancestors = "any",
}
m["mtc"] = {
"Munit",
11732482,
"ngf-kok",
"Latn",
}
m["mtd"] = {
"Mualang",
3073458,
"poz-mly",
"Latn",
}
m["mte"] = {
"Alu",
33503,
"poz-ocw",
"Latn",
}
m["mtf"] = {
"Murik (New Guinea)",
7050035,
"paa-lsp",
"Latn",
}
m["mtg"] = {
"Una",
5580728,
"ngf-mek",
}
m["mth"] = {
"Munggui",
6936018,
"poz-hce",
"Latn",
}
m["mti"] = {
"Maiwa (New Guinea)",
6737223,
"ngf-dag",
"Latn",
}
m["mtj"] = {
"Moskona",
11288953,
"paa-ebh",
"Latn",
}
m["mtk"] = {
"Mbe'",
10964025,
"nic-nka",
"Latn",
}
m["mtl"] = {
"Montol",
3440457,
"cdc-wst",
"Latn",
}
m["mtm"] = {
"Mator",
20669419,
"syd",
"Cyrl",
}
m["mtn"] = {
"Matagalpa",
3490756,
"nai-min",
}
m["mto"] = {
"Totontepec Mixe",
7828400,
"nai-miz",
"Latn",
}
m["mtp"] = {
"Wichí Lhamtés Nocten",
5908756,
"sai-wic",
"Latn",
}
m["mtq"] = {
"เหมื่อง",
3236789,
"mkh-vie",
"Latn",
sort_key = "vi-sortkey",
}
m["mtr"] = {
"เมวาร์",
2992857,
"raj",
"Deva",
translit = "Deva-translit", -- for now
}
m["mts"] = {
"Yora",
3572572,
"sai-pan",
"Latn",
}
m["mtt"] = {
"Mota",
3325052,
"poz-vnn",
"Latn",
}
m["mtu"] = {
"Tututepec Mixtec",
7857069,
"omq-mxt",
"Latn",
}
m["mtv"] = {
"Asaro'o",
3503684,
"ngf-fin",
"Latn",
}
m["mtw"] = {
"Magahat",
6729600,
"phi",
}
m["mtx"] = {
"Tidaá Mixtec",
7800805,
"omq-mxt",
"Latn",
}
m["mty"] = {
"Nabi",
6956858,
"paa-tor",
"Latn",
}
m["mua"] = {
"Mundang",
36032,
"alv-mbm",
}
m["mub"] = {
"Mubi",
3440518,
"cdc-est",
"Latn",
}
m["muc"] = {
"Mbu'",
35868,
"nic-beb",
"Latn",
}
m["mud"] = {
"Mednyj Aleut",
1977419,
"qfa-mix",
ancestors = "ale, ru",
}
m["mue"] = {
"Media Lengua",
36066,
"qfa-mix",
"Latn",
ancestors = "es, qu",
}
m["mug"] = {
"Musgu",
3123545,
"cdc-cbm",
"Latn",
}
m["muh"] = {
"Mündü",
35981,
"nic-nke",
"Latn",
}
m["mui"] = {
"มูซี",
615660,
"poz-mly",
"Latn",
}
m["muj"] = {
"Mabire",
3440437,
}
m["mul"] = {
"ร่วม", -- "ภาษาร่วม" (shared language), used in place of "ข้ามภาษา" (translingual)
7834564,
"qfa-not",
"All",
-- NOTE: The following sort keys are used in process_page() in [[Module:headword/page]], which generates
-- the default sort key for the page (corresponding to {{DEFAULTSORT:...}}) by generating a sort key for
-- the pagename using `makeSortKey()` called on language object "mul". Currently this just handles
-- Japanese sort keys.
--
-- FIXME: This should be smarter and use the language of the page if there's only one.
sort_key = {
Hani = "Hani-sortkey",
Jpan = "Jpan-sortkey",
Hrkt = "Hira-sortkey", -- Sort all kana as Hira.
Hira = "Hira-sortkey",
Kana = "Hira-sortkey",
},
standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz" .. c.punc,
}
m["mum"] = {
"Maiwala",
12952764,
"poz-ocw",
"Latn",
}
m["muo"] = {
"Nyong",
36373,
"alv-lek",
}
m["mup"] = {
"Malvi",
33413,
"raj",
"Deva",
translit = "Deva-translit",
}
m["muq"] = {
"Eastern Xiangxi Miao",
27431376,
"hmn",
}
m["mur"] = {
"Murle",
56727,
"sdv",
}
m["mus"] = {
"Creek",
523014,
"nai-mus",
"Latn",
}
m["mut"] = {
"Western Muria",
12952886,
"dra-mur",
}
m["muu"] = {
"Yaaku",
34222,
"cus-eas",
}
m["muv"] = {
"Muthuvan",
3327420,
"dra-tam",
}
m["mux"] = {
"Bo-Ung",
15831607,
"ngf-chw",
"Latn",
}
m["muy"] = {
"Muyang",
3502301,
"cdc-cbm",
"Latn",
}
m["muz"] = {
"Mursi",
36013,
"sdv",
}
m["mva"] = {
"Manam",
6746851,
"poz-ocw",
"Latn",
}
m["mvb"] = {
"Mattole",
20824,
"ath-pco",
"Latn",
}
m["mvd"] = {
"Mamboru",
578815,
"poz",
"Latn",
}
m["mvg"] = {
"Yucuañe Mixtec",
25562736,
"omq-mxt",
"Latn",
}
m["mvh"] = {
"Mire",
3441359,
}
m["mvi"] = {
"มิยาโกะ",
36218,
"jpx-sry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["mvk"] = {
"Mekmek",
6810592,
"paa-yua",
"Latn",
}
m["mvl"] = {
"Mbara (Australia)",
6799620,
"aus-pam",
}
m["mvm"] = {
"Muya",
2422759,
"sit-qia",
}
m["mvn"] = {
"Minaveha",
6863278,
"poz-ocw",
"Latn",
}
m["mvo"] = {
"Marovo",
3294683,
"poz-ocw",
"Latn",
}
m["mvp"] = {
"Duri",
3915414,
"poz-ssw",
"Latn",
}
m["mvq"] = {
"Moere",
11732458,
"ngf-kum",
"Latn",
}
m["mvr"] = {
"Marau",
6755069,
"poz-hce",
"Latn",
}
m["mvs"] = {
"Massep",
3502895,
"paa-tkw",
}
m["mvt"] = {
"Mpotovoro",
6928305,
"poz-vnc",
"Latn",
}
m["mvu"] = {
"Marfa",
713633,
}
m["mvv"] = {
"Tagal Murut",
7675300,
"poz-san",
"Latn",
}
m["mvw"] = {
"Machinga",
12952754,
"bnt-rvm",
}
m["mvx"] = {
"Meoswar",
6817777,
"poz-hce",
"Latn",
}
m["mvy"] = {
"Indus Kohistani",
33399,
"inc-koh",
"Arab",
}
m["mvz"] = {
"Mesqan",
6821677,
"sem-eth",
}
m["mwa"] = {
"Mwatebu",
14916896,
"poz-ocw",
"Latn",
}
m["mwb"] = {
"Juwal",
6319103,
"paa-tor",
"Latn",
}
m["mwc"] = {
"Are",
29277,
"poz-ocw",
"Latn",
}
m["mwe"] = {
"Mwera",
6944725,
"bnt-rvm",
"Latn",
}
m["mwf"] = {
"Murrinh-Patha",
2980398,
"aus-dal",
"Latn",
}
m["mwg"] = {
"Aiklep",
3399652,
"poz-ocw",
"Latn",
}
m["mwh"] = {
"Mouk-Aria",
3325498,
"poz-ocw",
"Latn",
}
m["mwi"] = {
"Labo",
2157452,
"poz-vnc",
"Latn",
}
m["mwk"] = {
"Kita Maninkakan",
3015523,
"dmn-wmn",
}
m["mwl"] = {
"มีรังดา",
13330,
"roa-asl",
"Latn",
}
m["mwm"] = {
"Sar",
56850,
"csu-sar",
"Latn",
}
m["mwn"] = {
"Nyamwanga",
6944666,
"bnt-mwi",
"Latn",
}
m["mwo"] = {
"Sungwadaga",
3276435,
"poz-vnn",
"Latn",
}
m["mwp"] = {
"Kala Lagaw Ya",
2591262,
"aus-pam",
"Latn",
}
m["mwq"] = {
"Mün Chin",
331340,
"tbq-kuk",
}
m["mwr"] = {
"มาร์วาร์",
56312,
"raj",
"Deva, Mahj",
translit = {
Deva = "Deva-translit", -- for now
Mahj = "Mahj-translit",
},
}
m["mws"] = {
"Mwimbi-Muthambi",
15632357,
"bnt-kka",
"Latn",
}
m["mwt"] = {
"Moken",
18648701,
"poz",
}
m["mwu"] = {
"Mittu",
6883573,
"csu-bbk",
"Latn",
}
m["mwv"] = {
"Mentawai",
13365,
"poz-nws",
"Latn",
}
m["mww"] = {
"ม้งขาว",
3138829,
"hmn",
"Latn, Hmng, Hmnp",
}
m["mwz"] = {
"Moingi",
11011905,
}
m["mxa"] = {
"Northwest Oaxaca Mixtec",
12953739,
"omq-mxt",
"Latn",
}
m["mxb"] = {
"Tezoatlán Mixtec",
3317286,
"omq-mxt",
"Latn",
}
m["mxd"] = {
"Modang",
6888037,
"poz",
"Latn",
}
m["mxe"] = {
"Mele-Fila",
3305008,
"poz-pnp",
"Latn",
}
m["mxf"] = {
"Malgbe",
3502224,
}
m["mxg"] = {
"Mbangala",
6799612,
"bnt-yak",
}
m["mxh"] = {
"Mvuba",
6944591,
"csu-mle",
"Latn",
}
m["mxi"] = {
"Mozarabic",
317044,
"roa-ibe",
"Arab, Hebr, Latn",
translit = "mxi-translit",
strip_diacritics = {
Arab = "ar-stripdiacritics",
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["mxj"] = {
"Miju",
56332,
"sit-mdz",
"Latn, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["mxk"] = {
"Monumbo",
6906792,
"paa-tor",
}
m["mxl"] = {
"Maxi Gbe",
35770,
"alv-gbe",
}
m["mxm"] = {
"Meramera",
6817936,
"poz-ocw",
"Latn",
}
m["mxn"] = {
"Moi (Indonesia)",
11732459,
"paa-wbh",
"Latn",
}
m["mxo"] = {
"Mbowe",
10962309,
"bnt-kav",
}
m["mxp"] = {
"Tlahuitoltepec Mixe",
7810697,
}
m["mxq"] = {
"Juquila Mixe",
25559721,
}
m["mxr"] = {
"Murik (Malaysia)",
3328150,
nil,
"Latn",
}
m["mxs"] = {
"Huitepec Mixtec",
12953729,
"omq-mxt",
"Latn",
}
m["mxt"] = {
"Jamiltepec Mixtec",
12953730,
"omq-mxt",
"Latn",
}
m["mxu"] = {
"Mada (Cameroon)",
3441206,
"cdc-cbm",
"Latn",
}
m["mxv"] = {
"Metlatónoc Mixtec",
36363,
"omq-mxt",
"Latn",
}
m["mxw"] = {
"Namo",
12952923,
"paa-yam",
"Latn",
}
m["mxx"] = {
"Mahou",
11004334,
"dmn-mnk",
"Latn, Nkoo",
}
m["mxy"] = {
"Southeastern Nochixtlán Mixtec",
7070684,
"omq-mxt",
"Latn",
}
m["mxz"] = {
"Central Masela",
42575433,
"poz-tim",
"Latn",
}
m["myb"] = {
"Mbay",
3033565,
"csu-sar",
"Latn",
}
m["myc"] = {
"Mayeka",
11129517,
"bnt-boa",
}
m["mye"] = {
"Myene",
35832,
"bnt-tso",
"Latn",
}
m["myf"] = {
"Bambassi",
56540,
"omv-mao",
"Latn",
}
m["myg"] = {
"Manta",
35799,
"nic-mom",
"Latn",
}
m["myh"] = {
"Makah",
3280640,
"wak",
"Latn",
}
m["myj"] = {
"Mangayat",
35988,
"nic-ser",
}
m["myk"] = {
"Mamara Senoufo",
36187,
"alv-sma",
"Latn",
}
m["myl"] = {
"Moma",
6897018,
"poz",
"Latn",
}
m["mym"] = {
"Me'en",
3408516,
"sdv",
}
m["myo"] = {
"Anfillo",
34928,
"omv-gon",
}
m["myp"] = {
"Pirahã",
33825,
"sai-mur",
"Latn",
}
m["myr"] = {
"Muniche",
3915654,
}
m["mys"] = {
"Mesmes",
3508617,
"sem-eth",
}
m["myu"] = {
"Mundurukú",
746723,
"tup",
"Latn",
}
m["myv"] = {
"เอร์เซีย",
29952,
"urj-mdv",
"Cyrl",
translit = "myv-translit",
override_translit = true,
}
m["myw"] = {
"Muyuw",
3502878,
"poz-ocw",
"Latn",
}
m["myx"] = {
"Masaba",
12952814,
"bnt-msl",
"Latn",
}
m["myy"] = {
"Macuna",
3275059,
"sai-tuc",
"Latn",
}
m["myz"] = {
"Classical Mandaic",
25559314,
"sem-ase",
"Mand",
translit = {
Mand = "Mand-translit",
},
strip_diacritics = {
Mand = "Mand-stripdiacritics",
}
}
m["mza"] = {
"Santa María Zacatepec Mixtec",
8063756,
"omq-mxt",
"Latn",
}
m["mzb"] = {
"Northern Saharan Berber",
11156769,
"ber",
"Arab, Latn, Tfng",
}
m["mzc"] = {
"Madagascar Sign Language",
12715020,
"sgn",
}
m["mzd"] = {
"Malimba",
35806,
"bnt-saw",
}
m["mze"] = {
"Morawa",
6909384,
"paa-mal",
"Latn",
}
m["mzg"] = {
"Monastic Sign Language",
3217333,
"sgn",
}
m["mzh"] = {
"Wichí Lhamtés Güisnay",
7998197,
"sai-wic",
"Latn",
}
m["mzi"] = {
"Ixcatlán Mazatec",
6101049,
"omq-maz",
"Latn",
}
m["mzj"] = {
"Manya",
11006832,
"dmn-mnk",
}
m["mzk"] = {
"Nigeria Mambila",
11004163,
"nic-mmb",
"Latn",
}
m["mzl"] = {
"Mazatlán Mixe",
25559728,
}
m["mzm"] = {
"Mumuye",
36021,
"alv-mum",
"Latn",
}
m["mzn"] = {
"มอแซนแดรอน",
13356,
"ira-msh",
"mzn-Arab",
}
m["mzo"] = {
"Matipuhy",
6787588,
"sai-kui",
"Latn",
}
m["mzp"] = {
"Movima",
1659701,
"qfa-iso",
"Latn",
}
m["mzq"] = {
"Mori Atas",
3324070,
"poz-btk",
"Latn",
}
m["mzr"] = {
"Marúbo",
3296011,
"sai-pan",
"Latn",
}
m["mzs"] = {
"ครีโอลมาเก๊า",
35785,
"crp",
"Latn",
ancestors = "pt",
sort_key = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.cedilla}},
}
m["mzt"] = {
"Mintil",
6869641,
"mkh-asl",
}
m["mzu"] = {
"Inapang",
6013569,
"paa-ram",
"Latn",
}
m["mzv"] = {
"Manza",
36038,
"gba-eas",
}
m["mzw"] = {
"Deg",
35183,
"nic-gnw",
"Latn",
}
m["mzx"] = {
"Mawayana",
6794377,
"awd",
}
m["mzy"] = {
"Mozambican Sign Language",
6927809,
"sgn",
}
m["mzz"] = {
"Maiadomu",
6735234,
"poz-ocw",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
7qlw76ybprn9d4lm5euyznbcbx2a5ye
มอดูล:languages/data/3/k
828
36376
5720760
5684159
2026-04-21T07:01:03Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720760
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["kaa"] = {
"การากัลปัก",
33541,
"trk-kno",
"Latn, Cyrl, fa-Arab",
dotted_dotless_i = true,
strip_diacritics = {
from = {"['’]"},
to = {"ʼ"}
},
sort_key = {
Latn = {
from = {
-- Sort the old orthography (using the apostrophe) after the new orthography (using the acute accent).
"í", "iʼ", "i", -- Ensure "i" comes after "í", "iʼ", "ı".
"sh", "ch",
"á", "aʼ", "ǵ", "gʼ", "x", p[4], p[5], "ı", "q", "ń", "nʼ", "ó", "oʼ", "ú", "uʼ", "c"
},
to = {
p[4], p[5], "i" .. p[3],
"z" .. p[1], "z" .. p[3],
"a" .. p[1], "a" .. p[2], "g" .. p[1], "g" .. p[2], "h" .. p[1], "i", "i" .. p[1], "i" .. p[2], "k" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "o" .. p[2], "u" .. p[1], "u" .. p[2], "z" .. p[2]
}
},
Cyrl = {
from = {"ә", "ғ", "ё", "қ", "ң", "ө", "ү", "ў", "ҳ"},
to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "у" .. p[2], "х" .. p[1]}
},
},
}
m["kab"] = {
"กะไบล์",
35853,
"ber",
"Latn, Arab, Tfng",
}
m["kac"] = {
"จิ่งเผาะ",
33332,
"sit-jnp",
"Latn, Mymr",
}
m["kad"] = {
"Kadara",
3914011,
"nic-plc",
"Latn",
}
m["kae"] = {
"Ketangalan",
2779411,
"map",
}
m["kaf"] = {
"Katso",
246122,
"tbq-kzh",
}
m["kag"] = {
"Kajaman",
6348863,
"poz",
"Latn",
}
m["kah"] = {
"Fer",
5443742,
"csu-bgr",
"Latn",
}
m["kai"] = {
"Karekare",
3438770,
"cdc-wst",
"Latn",
}
m["kaj"] = {
"Jju",
35401,
"nic-plc",
"Latn",
}
m["kak"] = {
"Kayapa Kallahan",
3192220,
"phi",
"Latn",
}
m["kam"] = {
"Kamba",
2574767,
"bnt-kka",
"Latn",
}
m["kao"] = {
"Kassonke",
36905,
"dmn-wmn",
"Latn",
}
m["kap"] = {
"Bezhta",
33054,
"cau-ets",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["kaq"] = {
"Capanahua",
2937196,
"sai-pan",
"Latn",
}
m["kaw"] = {
"ชวาเก่า",
49341,
"poz",
"Latn, Java, Kawi",
--translit = "jv-translit", --same as jv
}
m["kax"] = {
"Kao",
3192799,
"paa-nha",
"Latn",
}
m["kay"] = {
"Kamayurá",
3192336,
"tup-gua",
"Latn",
}
m["kba"] = {
"Kalarko",
5517764,
"aus-pam",
"Latn",
}
m["kbb"] = {
"Kaxuyana",
12953626,
"sai-prk",
"Latn",
}
m["kbc"] = {
"Kadiwéu",
18168288,
"sai-guc",
"Latn",
}
m["kbd"] = {
"คาบาร์เดีย",
33522,
"cau-cir",
"Cyrl, Latn, Arab",
translit = {
Cyrl = "cau-cir-translit",
Arab = "ar-translit",
},
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {
"кхъу", "къӏу", -- 4 chars
"гъу", "джу", "дзу", "жъу", "къу", "кхъ", "къӏ", "кӏу", "кӏь", "лъу", "лӏу", "пӏу", "сӏу", "тӏу", "фӏу", "хъу", "цӏу", "чъу", "чӏу", "шъу", "шӏу", "щӏу", -- 3 chars
"гу", "гъ", "гь", "дж", "дз", "ё", "жъ", "жь", "ку", "къ", "кь", "кӏ", "лъ", "ль", "лӏ", "пӏ", "сӏ", "тӏ", "фӏ", "ху", "хъ", "хь", "цу", "цӏ", "чу", "чъ", "чӏ", "шъ", "шӏ", "щӏ", "ӏу", "ӏь", -- 2 chars
"э" -- 1 char
},
to = {
"к" .. p[5], "к" .. p[7],
"г" .. p[3], "д" .. p[2], "д" .. p[4], "ж" .. p[2], "к" .. p[3], "к" .. p[4], "к" .. p[6], "к" .. p[10], "к" .. p[11], "л" .. p[2], "л" .. p[5], "п" .. p[2], "с" .. p[2], "т" .. p[2], "ф" .. p[2], "х" .. p[3], "ц" .. p[3], "ч" .. p[3], "ч" .. p[5], "ш" .. p[2], "ш" .. p[4], "щ" .. p[2],
"г" .. p[1], "г" .. p[2], "г" .. p[4], "д" .. p[1], "д" .. p[3], "е" .. p[1], "ж" .. p[1], "ж" .. p[3], "к" .. p[1], "к" .. p[2], "к" .. p[8], "к" .. p[9], "л" .. p[1], "л" .. p[3], "л" .. p[4], "п" .. p[1], "с" .. p[1], "т" .. p[1], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[4], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2], "ч" .. p[4], "ш" .. p[1], "ш" .. p[3], "щ" .. p[1], "ӏ" .. p[1], "ӏ" .. p[2],
"а" .. p[1]
}
},
},
}
m["kbe"] = {
"Kanju",
10543322,
"aus-pam",
"Latn",
}
m["kbh"] = {
"Camsá",
2842667,
"qfa-iso",
"Latn",
}
m["kbi"] = {
"Kaptiau",
6367294,
"poz-oce",
"Latn",
}
m["kbj"] = {
"Kari",
6370438,
"bnt-boa",
"Latn",
}
m["kbk"] = {
"Grass Koiari",
12952642,
"ngf-koi",
"Latn",
}
m["kbm"] = {
"Iwal",
3156391,
"poz-ocw",
"Latn",
}
m["kbn"] = {
"Kare (Central Africa)",
35554,
"alv-mbm",
"Latn",
}
m["kbo"] = {
"Keliko",
11275553,
"csu-mma",
}
m["kbp"] = {
"Kabiyé",
35475,
"nic-gne",
"Latn",
}
m["kbq"] = {
"Kamano",
11732272,
"ngf-kag",
"Latn",
}
m["kbr"] = {
"Kafa",
35481,
"omv-gon",
"Ethi, Latn",
}
m["kbs"] = {
"Kande",
35556,
"bnt-tso",
"Latn",
}
m["kbt"] = {
"Gabadi",
3291159,
"poz-ocw",
"Latn",
}
m["kbu"] = {
"Kabutra",
10966761,
"raj",
}
m["kbv"] = {
"Kamberataro",
5261289,
"paa-sng",
"Latn",
}
m["kbw"] = {
"Kaiep",
6347632,
"poz-ocw",
"Latn",
}
m["kbx"] = {
"Ap Ma",
56298,
"paa-eke",
"Latn",
}
m["kbz"] = {
"Duhwa",
56295,
"cdc-wst",
"Latn",
}
m["kcb"] = {
"Kawacha",
11732302,
"ngf-ang",
"Latn",
}
m["kcc"] = {
"Lubila",
3914381,
"nic-uce",
"Latn",
}
m["kcd"] = {
"Ngkâlmpw Kanum",
12952566,
"paa-yam",
"Latn",
}
m["kce"] = {
"Kaivi",
6348685,
"nic-kau",
}
m["kcf"] = {
"Ukaan",
36651,
"nic-bco",
}
m["kcg"] = {
"Tyap",
3912765,
"nic-plc",
"Latn",
}
m["kch"] = {
"Vono",
3913920,
"nic-kau",
}
m["kci"] = {
"Kamantan",
3914019,
"nic-plc",
}
m["kcj"] = {
"Kobiana",
35609,
"alv-nyn",
}
m["kck"] = {
"Kalanga",
33672,
"bnt-sho",
"Latn",
}
m["kcl"] = {
"Kala",
6349982,
"poz-ocw",
"Latn",
}
m["kcm"] = {
"Tar Gula",
277963,
"csu-bba",
}
m["kcn"] = {
"Nubi",
36388,
"crp",
"Latn, Arab",
ancestors = "apd",
strip_diacritics = {remove_diacritics = c.acute},
}
m["kco"] = {
"Kinalakna",
11732320,
"ngf-huo",
"Latn",
}
m["kcp"] = {
"Kanga",
6362384,
"qfa-kad",
"Latn",
}
m["kcq"] = {
"Kamo",
3914879,
"alv-wjk",
}
m["kcr"] = {
"Katla",
35688,
"nic-ktl",
}
m["kcs"] = {
"Koenoem",
3438755,
"cdc-wst",
}
m["kct"] = {
"Kaian",
6347538,
"paa-ram",
"Latn",
}
m["kcu"] = {
"Kikami",
3915212,
"bnt-ruv",
"Latn",
}
m["kcv"] = {
"Kete",
3195598,
"bnt-lub",
}
m["kcw"] = {
"Kabwari",
6344539,
"bnt-glb",
}
m["kcx"] = {
"Kachama-Ganjule",
12634070,
"omv-eom",
}
m["kcy"] = {
"Korandje",
33427,
"son",
}
m["kcz"] = {
"Konongo",
11732345,
"bnt-tkm",
"Latn",
}
m["kda"] = {
"Worimi",
3914062,
"aus-pam",
"Latn",
}
m["kdc"] = {
"Kutu",
6448634,
"bnt-ruv",
}
m["kdd"] = {
"Yankunytjatjara",
34207,
"aus-pam",
"Latn",
}
m["kde"] = {
"Makonde",
35172,
"bnt-rvm",
"Latn",
}
m["kdf"] = {
"Mamusi",
6746036,
"poz-ocw",
"Latn",
}
m["kdg"] = {
"Seba",
7442316,
"bnt-sbi",
"Latn",
}
m["kdh"] = {
"Tem",
36531,
"nic-gne",
"Latn",
}
m["kdi"] = {
"Kumam",
6443410,
"sdv-los",
}
m["kdj"] = {
"Karamojong",
56326,
"sdv-ttu",
"Latn",
}
m["kdk"] = {
"Numèè",
3346774,
"poz-cln",
"Latn",
}
m["kdl"] = {
"Tsikimba",
3914404,
"nic-kam",
}
m["kdm"] = {
"Kagoma",
3914420,
"nic-plc",
}
m["kdn"] = {
"Kunda",
4121130,
"bnt-sna",
"Latn",
}
m["kdp"] = {
"Kaningdon-Nindem",
3914956,
"nic-nin",
}
m["kdq"] = {
"Koch",
56431,
"tbq-bdg",
}
m["kdr"] = {
"Karaim",
33725,
"trk-kcu",
"Cyrl, Latn, Hebr",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["kdt"] = {
"กูย",
56310,
"mkh-kat",
"Thai, Khmr, Laoo",
}
m["kdu"] = {
"Kadaru",
35441,
"nub-hil",
"Latn",
}
m["kdv"] = {
"Kado",
7402721,
"sit-luu",
}
m["kdw"] = {
"Koneraw",
11732341,
"ngf-mom",
"Latn",
}
m["kdx"] = {
"Kam",
36753,
"alv-wjk",
}
m["kdy"] = {
"Keder",
6383641,
"paa-tkw",
}
m["kdz"] = {
"Kwaja",
11128866,
"nic-nka",
"Latn",
}
m["kea"] = {
"ครีโอลกาบูเวร์ดี",
35963,
"crp",
"Latn",
ancestors = "pt",
}
m["keb"] = {
"Kélé",
35559,
"bnt-kel",
}
m["kec"] = {
"Keiga",
3409311,
"qfa-kad",
"Latn",
}
m["ked"] = {
"Kerewe",
6393846,
"bnt-haj",
}
m["kee"] = {
"Eastern Keres",
15649021,
"nai-ker",
"Latn",
}
m["kef"] = {
"Kpessi",
35748,
"alv-gbe",
}
m["keg"] = {
"Tese",
16887296,
"sdv",
}
m["keh"] = {
"Keak",
6382110,
"paa-ndu",
"Latn",
}
m["kei"] = {
"Kei",
2410352,
"poz-cet",
}
m["kej"] = {
"Kadar",
6345179,
"dra-mal",
}
m["kek"] = {
"Q'eqchi",
35536,
"myn",
"Latn",
}
m["kel"] = {
"Kela-Yela",
6385426,
"bnt-mon",
"Latn",
}
m["kem"] = {
"Kemak",
35549,
"poz-tim",
"Latn",
}
m["ken"] = {
"Kenyang",
35650,
"nic-mam",
"Latn",
}
m["keo"] = {
"Kakwa",
3033547,
"sdv-bri",
}
m["kep"] = {
"Kaikadi",
6347757,
"dra-tam",
}
m["keq"] = {
"Kamar",
14916877,
"inc-hal",
}
m["ker"] = {
"Kera",
56251,
"cdc-est",
"Latn",
}
m["kes"] = {
"Kugbo",
3813394,
"nic-cde",
"Latn",
}
m["ket"] = {
"Ket",
33485,
"qfa-yke",
"Cyrl",
strip_diacritics = {
from = {"['’]"},
to = {"ʼ"}
},
sort_key = {
from = {"ӷ", "ё", "ӄ", "ӈ", "ө", "ә", "ʼ"},
to = {"г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "ъ" .. p[1], "ь" .. p[1]}
},
}
m["keu"] = {
"Akebu",
35026,
"alv-ktg",
"Latn",
}
m["kev"] = {
"Kanikkaran",
6363201,
"dra-mal",
"Taml, Mlym",
-- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["kew"] = {
"Kewa",
12952619,
"ngf-eng",
"Latn",
}
m["kex"] = {
"Kukna",
5031131,
"inc-eas",
ancestors = "bh",
}
m["key"] = {
"Kupia",
6445354,
"inc-eas",
}
m["kez"] = {
"Kukele",
3915391,
"nic-ucn",
"Latn",
}
m["kfa"] = {
"Kodava",
33531,
"dra-kod",
"Knda, Mlym",
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
}
m["kfb"] = {
"Kolami",
33479,
"dra-knk",
"Deva, Telu",
translit = {
Deva = "Deva-translit",
Telu = "Telu-translit",
},
}
m["kfc"] = {
"Konda-Dora",
35679,
"dra-kki",
"Orya, Telu",
translit = {
Orya = "Orya-translit",
Telu = "Telu-translit",
},
}
m["kfd"] = {
"Korra Koraga",
12952655,
"dra-kor",
"Knda",
-- Knda translit in [[Module:scripts/data]]
}
m["kfe"] = {
"Kota (India)",
33483,
"dra-tkt",
"Taml",
translit = "Taml-translit",
}
m["kff"] = {
"Koya",
33471,
"dra-gon",
"Telu, Orya, Deva, Latn",
}
m["kfg"] = {
"Kudiya",
12952667,
"dra-tlk",
}
m["kfh"] = {
"Kurichiya",
12952676,
"dra-mal",
"Mlym",
-- Mlym translit in [[Module:scripts/data]]
}
m["kfi"] = {
"Kannada Kurumba",
56589,
"dra-sdo",
}
m["kfj"] = {
"Kemiehua",
27144776,
"mkh-pal",
}
m["kfk"] = {
"Kinnauri",
2383208,
"sit-kin",
"Takr, Deva, Latn",
translit = {
Takr = "Takr-translit",
Deva = "Deva-translit",
},
}
m["kfl"] = {
"Kung",
6444510,
"nic-rnc",
"Latn",
}
m["kfn"] = {
"Kuk",
6442398,
"nic-rnc",
"Latn",
}
m["kfo"] = {
"Koro (West Africa)",
11160588,
"dmn-mnk",
"Latn, Nkoo",
}
m["kfp"] = {
"Korwa",
6432786,
"mun",
}
m["kfq"] = {
"Korku",
33715,
"mun",
"Deva",
}
m["kfr"] = {
"กัจฉ์",
56487,
"inc-snd",
"Gujr, sd-Arab, Sind, Khoj",
translit = {
Gujr = "Gujr-translit",
Sind = "Sind-translit",
["sd-Arab"] = "sd-Arab-translit",
},
strip_diacritics = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {u(0x0671)},
to = {u(0x0627)}
},
}
m["kfs"] = {
"Bilaspuri",
12953397,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
Takr = "Takr-translit",
},
}
m["kft"] = {
"Kanjari",
12953610,
"inc-pan",
ancestors = "pa",
}
m["kfu"] = {
"Katkari",
6377671,
"inc-sou",
}
m["kfv"] = {
"Kurmukar",
6446193,
"inc-eas",
}
m["kfw"] = {
"Kharam Naga",
12952906,
"tbq-kuk",
}
m["kfx"] = {
"Kullu Pahari",
6443148,
"him",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["kfy"] = {
"Kumaoni",
33529,
"inc-pah",
"Deva, Shrd, Takr",
translit = {
Deva = "Deva-translit",
Takr = "Takr-translit",
},
-- Shrd translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["kfz"] = {
"Koromfé",
35701,
"nic-gur",
"Latn",
}
m["kga"] = {
"Koyaga",
11155632,
"dmn-mnk",
}
m["kgb"] = {
"Kawe",
12952750,
"poz-hce",
"Latn",
}
m["kgd"] = {
"Kataang",
12953622,
"mkh",
}
m["kge"] = {
"Komering",
49224,
"poz-lgx",
"Latn, Arab",
}
m["kgf"] = {
"Kube",
11732359,
"ngf-huo",
"Latn",
}
m["kgg"] = {
"Kusunda",
33630,
"qfa-iso", -- central Nepal
"Latn",
}
m["kgi"] = {
"Selangor Sign Language",
33731,
"sgn",
}
m["kgj"] = {
"Gamale Kham",
22236996,
"sit-kha",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["kgk"] = {
"Kaiwá",
3111883,
"gn",
"Latn",
}
m["kgl"] = {
"Kunggari",
10550184,
"aus-pam",
}
m["kgn"] = {
"Karingani",
6371041,
"xme-ttc",
"fa-Arab, Latn",
ancestors = "xme-ttc-nor",
}
m["kgo"] = {
"Krongo",
6438927,
"qfa-kad",
"Latn",
}
m["kgp"] = {
"Kaingang",
2665734,
"sai-sje",
"Latn",
}
m["kgq"] = {
"Kamoro",
6359001,
"ngf-ask",
"Latn",
}
m["kgr"] = {
"Abun",
56657,
"qfa-iso", -- Papuan; isolate in Ethnologue, Glottolog and Palmer (2018); grouped with West Papuan by Ross (2005)
"Latn",
}
m["kgs"] = {
"Kumbainggar",
3915412,
"aus-pam",
}
m["kgt"] = {
"Somyev",
3913354,
"nic-mmb",
"Latn",
}
m["kgu"] = {
"Kobol",
11732325,
"ngf-omo",
"Latn",
}
m["kgv"] = {
"Karas",
6368621,
"qfa-dis", -- Divergent Papuan language; grouped with Mbaham-Iha by Glottolog to form a (mainland) West Bomberai
-- family, but with Mbaham-Iha and Timor-Alor-Pantar by Wikipedia (following Usher and Schapper 2022)
-- into a (Greater) West Bomberai family.
"Latn",
}
m["kgw"] = {
"Karon Dori",
56817,
"paa-mbr",
"Latn",
}
m["kgx"] = {
"Kamaru",
12953604,
"poz",
}
m["kgy"] = {
"Kyerung",
12952691,
"sit-kyk",
}
m["kha"] = {
"คาซี",
33584,
"aav-pkl",
"Latn, as-Beng",
}
m["khb"] = {
"ไทลื้อ",
36948,
"tai-swe",
"Talu, Lana",
translit = {
Talu = "Talu-translit",
Lana = "Lana-translit",
},
strip_diacritics = {remove_diacritics = c.ZWNJ},
sort_key = "khb-sortkey",
}
m["khc"] = {
"Tukang Besi North",
18611555,
"poz",
}
m["khd"] = {
"Bädi Kanum",
20888004,
"paa-yam",
"Latn",
}
m["khe"] = {
"Korowai",
6432598,
"ngf-gaw",
"Latn",
}
m["khf"] = {
"Khuen",
27144893,
"mkh",
}
m["khh"] = {
"Kehu",
10994953,
}
m["khj"] = {
"Kuturmi",
3914490,
"nic-plc",
"Latn",
}
m["khl"] = {
"Lusi",
3267788,
"poz-ocw",
"Latn",
}
m["khn"] = {
"Khandeshi",
33726,
"inc-sou",
}
m["kho"] = {
"โคตาน",
6583551,
"xsc-sak",
"Brah, Khar",
-- Brah translit in [[Module:scripts/data]]
}
m["khp"] = {
"Kapauri",
3502575,
"paa-tkw",
}
m["khq"] = {
"Koyra Chiini",
33600,
"son",
"Latn, Arab",
}
m["khr"] = {
"Kharia",
3915562,
"mun",
}
m["khs"] = {
"Kasua",
6374863,
"ngf-bos",
"Latn",
}
m["kht"] = {
"คำตี้",
3915502,
"tai-swe",
"Mymr",
display_text = s["kht-displaytext"],
strip_diacritics = s["kht-stripdiacritics"],
}
m["khu"] = {
"Nkhumbi",
11019169,
"bnt-swb",
}
m["khv"] = {
"Khvarshi",
56425,
"cau-wts",
"Cyrl",
translit = "khv-translit",
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["khw"] = {
"Khowar",
938216,
"inc-chi",
"Arab",
strip_diacritics = {
-- "هٔ" (U+0647 + U+0654) and "ۂ" (U+06C2) are both normalized to "ہ"; hamzatu l-waṣli ("ٱ") becomes a regular alif
from = {"هٔ", "ۂ", "ٱ"},
to = {"ہ", "ہ", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
}
m["khx"] = {
"Kanu",
12952571,
"bnt-lgb",
}
m["khy"] = {
"Ekele",
6385549,
"bnt-ske",
"Latn",
}
m["khz"] = {
"Keapara",
12952603,
"poz-ocw",
"Latn",
}
m["kia"] = {
"Kim",
35685,
"alv-kim",
}
m["kib"] = {
"Koalib",
35859,
"alv-hei",
}
m["kic"] = {
"Kickapoo",
20162127,
"alg-sfk",
"Latn",
}
m["kid"] = {
"Koshin",
35632,
"nic-beb",
"Latn",
}
m["kie"] = {
"Kibet",
56893,
}
m["kif"] = {
"Eastern Parbate Kham",
12953022,
"sit-kha",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["kig"] = {
"Kimaama",
11732321,
"paa-kol",
}
m["kih"] = {
"Kilmeri",
6408020,
"paa-brd",
"Latn",
}
m["kii"] = {
"Kitsai",
56627,
"cdd",
"Latn",
}
m["kij"] = {
"Kilivila",
3196601,
"poz-ocw",
"Latn",
}
m["kil"] = {
"Kariya",
3438708,
"cdc-wst",
}
m["kim"] = {
"โตฟา",
36848,
"trk-ssb",
"Cyrl",
}
m["kio"] = {
"Kiowa",
56631,
"nai-kta",
"Latn",
}
m["kip"] = {
"Sheshi Kham",
12952622,
"sit-kha",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["kiq"] = {
"Kosadle",
6432994,
"paa-kko",
"Latn",
}
m["kis"] = {
"Kis",
6416362,
"poz-ocw",
"Latn",
}
m["kit"] = {
"Agob",
3332143,
"paa-pht",
"Latn",
}
m["kiv"] = {
"Kimbu",
10997740,
"bnt-tkm",
}
m["kiw"] = {
"Northeast Kiwai",
11732324,
"paa-kiw",
"Latn",
}
m["kix"] = {
"Khiamniungan Naga",
6401546,
"sit-kch",
"Latn",
}
m["kiy"] = {
"Kirikiri",
6415159,
"paa-lkp",
"Latn",
}
m["kiz"] = {
"Kisi",
3912772,
"bnt-bki",
}
m["kja"] = {
"Mlap",
6885683,
"paa-nim",
"Latn",
}
m["kjb"] = {
"Q'anjob'al",
35551,
"myn",
"Latn",
}
m["kjc"] = {
"Coastal Konjo",
3198689,
"poz",
"Latn",
}
m["kjd"] = {
"Southern Kiwai",
11732322,
"paa-kiw",
"Latn",
}
m["kje"] = {
"Kisar",
3197441,
"poz",
"Latn",
}
m["kjg"] = {
"ขมุ",
33335,
"mkh",
"Laoo",
translit = "Laoo-translit",
--sort_key = "Laoo-sortkey",
}
m["kjh"] = {
"คาคัส",
33575,
"trk-ssb",
"Cyrl",
translit = "kjh-translit",
override_translit = true,
}
m["kji"] = {
"Zabana",
379130,
"poz-ocw",
"Latn",
}
m["kjj"] = {
"Khinalug",
35278,
"cau-nec",
"Cyrl, Latn",
translit = "kjj-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
}
m["kjk"] = {
"Highland Konjo",
3198688,
"poz",
}
m["kjl"] = {
"Western Parbate Kham",
22237017,
"sit-kha",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["kjm"] = {
"Kháng",
6403501,
"mkh-pal",
}
m["kjn"] = {
"Kunjen",
3200468,
"aus-pmn",
"Latn",
}
m["kjo"] = {
"Harijan Kinnauri",
5657463,
"him",
"Takr, Deva",
}
m["kjp"] = {
"กะเหรี่ยงโปตะวันออก",
5330390,
"kar",
"Mymr, Leke, Thai",
translit = "kjp-translit",
override_translit = true,
}
m["kjq"] = {
"Western Keres",
12645568,
"nai-ker",
"Latn",
}
m["kjr"] = {
"Kurudu",
12952678,
"poz-hce",
"Latn",
}
m["kjs"] = {
"East Kewa",
20050949,
"ngf-eng",
"Latn",
}
m["kjt"] = {
"กะเหรี่ยงโปแพร่",
7187991,
"kar",
"Thai",
}
m["kju"] = {
"Kashaya",
3193689,
"nai-pom",
"Latn",
}
m["kjx"] = {
"Ramopa",
56830,
"paa-nbo",
"Latn",
}
m["kjy"] = {
"Erave",
12952416,
"ngf-eng",
"Latn",
}
m["kjz"] = {
"Bumthangkha",
2786408,
"sit-ebo",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["kka"] = {
"Kakanda",
3915342,
"alv-ngb",
}
m["kkb"] = {
"Kwerisa",
56881,
"paa-lkp",
"Latn",
}
m["kkc"] = {
"Odoodee",
12952987,
"ngf-est",
"Latn",
}
m["kkd"] = {
"Kinuku",
6414422,
"nic-kau",
}
m["kke"] = {
"Kakabe",
3913966,
"dmn-mok",
"Latn",
}
m["kkf"] = {
"Kalaktang Monpa",
63257089,
"sit-tsk",
"Tibt, Latn, Deva",
translit = {
Deva = "Deva-translit",
},
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["kkg"] = {
"Mabaka Valley Kalinga",
18753304,
"phi",
}
m["kkh"] = {
"เขิน",
3545044,
"tai-swe",
"Lana, Thai",
translit = {
Lana = "Lana-translit",
},
sort_key = "nod-sortkey",
}
m["kki"] = {
"Kagulu",
12952537,
"bnt-ruv",
"Latn",
}
m["kkj"] = {
"Kako",
35755,
"bnt-kak",
}
m["kkk"] = {
"Kokota",
3198399,
"poz-ocw",
"Latn",
}
m["kkl"] = {
"Kosarek Yale",
6432995,
"ngf-mek",
"Latn",
}
m["kkm"] = {
"Kiong",
6414512,
"nic-ucr",
"Latn",
}
m["kkn"] = {
"Kon Keu",
6428686,
"mkh-pal",
}
m["kko"] = {
"Karko",
35529,
"nub-hil",
}
m["kkp"] = {
"Koko-Bera",
6426699,
"aus-pmn",
"Latn",
}
m["kkq"] = {
"Kaiku",
6347840,
"bnt-kbi",
"Latn",
}
m["kkr"] = {
"Kir-Balar",
3440527,
"cdc-wst",
"Latn",
}
m["kks"] = {
"Kirfi",
56242,
"cdc-wst",
"Latn",
}
m["kkt"] = {
"Koi",
6426194,
"sit-kiw",
}
m["kku"] = {
"Tumi",
3913934,
"nic-kau",
}
m["kkv"] = {
"Kangean",
2071325,
"poz-msa",
"Latn",
}
m["kkw"] = {
"Teke-Kukuya",
36560,
"bnt-tek",
}
m["kkx"] = {
"Kohin",
6425997,
"poz-brw",
}
m["kky"] = {
"Guugu Yimidhirr",
56543,
"aus-pam",
"Latn",
}
m["kkz"] = {
"Kaska",
20823,
"ath-nor",
"Latn",
}
m["kla"] = {
"Klamath-Modoc",
2669248,
"nai-plp",
"Latn",
}
m["klb"] = {
"Kiliwa",
3182593,
"nai-yuc",
"Latn",
}
m["klc"] = {
"Kolbila",
6427122,
"alv-lek",
}
m["kld"] = {
"Gamilaraay",
3111818,
"aus-cww",
"Latn",
}
m["kle"] = {
"Kulung",
6443304,
"sit-kic",
}
m["klf"] = {
"Kendeje",
56895,
}
m["klg"] = {
"กาลากันแบบตากาเกาลู",
18756514,
"phi",
"Latn",
}
m["klh"] = {
"Weliki",
7981017,
"ngf-fin",
"Latn",
}
m["kli"] = {
"Kalumpang",
13561407,
"poz",
}
m["klj"] = {
"Khalaj",
33455,
"trk",
"fa-Arab, Latn",
ancestors = "klj-arg",
strip_diacritics = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun,
},
}
m["klk"] = {
"Kono (Nigeria)",
6429589,
"nic-kau",
"Latn",
}
m["kll"] = {
"กาลากันแบบกากัน",
18748913,
"phi",
}
m["klm"] = {
"Kolom",
6844970,
"ngf-rai",
"Latn",
}
m["kln"] = {
"Kalenjin",
637228,
"sdv-nma",
"Latn",
}
m["klo"] = {
"Kapya",
6367410,
"nic-ykb",
}
m["klp"] = {
"Kamasa",
6356107,
"ngf-ang",
"Latn",
}
m["klq"] = {
"Rumu",
7379420,
"paa-tuk",
"Latn",
}
m["klr"] = {
"Khaling",
56381,
"sit-kiw",
"Deva",
}
m["kls"] = {
"Kalasha",
33416,
"inc-chi",
"Latn, ks-Arab",
}
m["klt"] = {
"Nukna",
7068874,
"ngf-fin",
"Latn",
}
m["klu"] = {
"Klao",
3914866,
"kro-wkr",
}
m["klv"] = {
"Maskelynes",
3297282,
"poz-vnc",
"Latn",
}
m["klw"] = {
"ลินดู",
18390055,
"poz-kal",
"Latn",
}
m["klx"] = {
"Koluwawa",
6427954,
"poz-ocw",
"Latn",
}
m["kly"] = {
"Kalao",
6350643,
"poz",
}
m["klz"] = {
"Kabola",
11732258,
"paa-tap",
"Latn",
}
m["kma"] = {
"Konni",
35680,
"nic-buk",
}
m["kmb"] = {
"Kimbundu",
35891,
"bnt-kmb",
"Latn",
}
m["kmc"] = {
"ต้งใต้",
35379,
"qfa-kms",
"Latn",
}
m["kmd"] = {
"Madukayang Kalinga",
18753305,
"phi",
}
m["kme"] = {
"Bakole",
35068,
"bnt-kpw",
"Latn",
}
m["kmf"] = {
"Kare (New Guinea)",
11732286,
"ngf-mab",
"Latn",
}
m["kmg"] = {
"Kâte",
3201059,
"ngf-huo",
"Latn",
}
m["kmh"] = {
"Kalam",
12952550,
"ngf-kak",
"Latn",
}
m["kmi"] = {
"Kami",
3915372,
"alv-ngb",
"Latn",
}
m["kmj"] = {
"Kumarbhag Paharia",
3130374,
"dra-mlo",
"Beng, Deva",
translit = {
Beng = "Beng-translit",
Deva = "Deva-translit",
},
}
m["kmk"] = {
"Limos Kalinga",
18753303,
"phi",
"Latn",
}
m["kml"] = {
"Tanudan Kalinga",
18753307,
"phi",
"Latn",
}
m["kmm"] = {
"Kom (India)",
12952647,
"tbq-kuk",
}
m["kmn"] = {
"Awtuw",
3504217,
"paa-spk",
"Latn",
}
m["kmo"] = {
"Kwoma",
11732376,
"paa-spk",
"Latn",
}
m["kmp"] = {
"Gimme",
11152236,
"alv-dur",
}
m["kmq"] = {
"Kwama",
2591184,
"ssa-kom",
}
m["kmr"] = {
"เคิร์ดเหนือ",
36163,
"ku",
"Latn, Cyrl, Armn, ku-Arab, Yezi",
translit = {
Cyrl = "kmr-translit",
-- Armn translit in [[Module:scripts/data]]
["ku-Arab"] = "ckb-translit",
},
strip_diacritics = {
Latn = {
remove_diacritics = "'’",
from = {"r̄", "R̄", "ẍ", "Ẍ"},
to = {"rr", "Rr", "x", "X"}
},
},
wikimedia_codes = "ku",
}
m["kms"] = {
"Kamasau",
6356117,
"paa-tor",
"Latn",
}
m["kmt"] = {
"Kemtuik",
6387179,
"paa-nim",
"Latn",
}
m["kmu"] = {
"Kanite",
12952567,
"ngf-kag",
"Latn",
}
m["kmv"] = {
"Karipúna Creole French",
2523999,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["kmw"] = {
"Kumu",
6428450,
"bnt-kbi",
"Latn",
}
m["kmx"] = {
"Waboda",
7958705,
"paa-kiw",
"Latn",
}
m["kmy"] = {
"Koma",
35634,
"alv-dur",
}
m["kmz"] = {
"Khorasani Turkish",
35373,
"trk-ogz",
"Arab",
ancestors = "trk-oat",
}
m["kna"] = {
"Kanakuru",
56811,
"cdc-wst",
"Latn",
}
m["knb"] = {
"Lubuagan Kalinga",
12953602,
"phi",
"Latn",
}
m["knd"] = {
"Konda",
11732340,
"ngf-sbh",
"Latn",
}
m["kne"] = {
"กันกานาอือ",
18753329,
"phi",
"Latn",
strip_diacritics = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer,
}
},
sort_key = {
Latn = "tl-sortkey",
},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. c.punc,
},
}
m["knf"] = {
"Mankanya",
35789,
"alv-pap",
"Latn",
}
m["kni"] = {
"Kanufi",
3913297,
"nic-nin",
"Latn",
}
m["knj"] = {
"Akatek",
34923,
"myn",
"Latn",
}
m["knk"] = {
"Kuranko",
3198896,
"dmn-mok",
"Latn",
}
m["knl"] = {
"Keninjal",
6389309,
"poz-mly",
"Latn",
}
m["knm"] = { -- two unrelated lects have this name; this is the Katukinian one
"Kanamari",
3438373,
"sai-ktk",
"Latn",
}
m["kno"] = {
"Kono (Sierra Leone)",
35675,
"dmn-vak",
"Latn",
}
m["knp"] = {
"Kwanja",
35641,
"nic-mmb",
"Latn",
}
m["knq"] = {
"Kintaq",
6414335,
"mkh-asl",
}
m["knr"] = {
"Kaningra",
6363253,
"paa-spk",
"Latn",
}
m["kns"] = {
"Kensiu",
6391529,
"mkh-asl",
}
m["knt"] = {
"Katukina",
3194265,
"sai-pan",
"Latn",
}
m["knu"] = { -- a dialect of 'kpe'
"Kono (Guinea)",
3198703,
"dmn-msw",
"Latn, Kpel",
ancestors = "kpe",
}
m["knv"] = {
"Tabo",
7959888,
"aav",
}
m["knx"] = {
"Kendayan",
6388963,
"poz-mly",
"Latn",
}
m["kny"] = {
"Kanyok",
11110766,
"bnt-lub",
"Latn",
}
m["knz"] = {
"Kalamsé",
3914000,
"nic-gnn",
}
m["koa"] = {
"Konomala",
3198732,
"poz-ocw",
"Latn",
}
m["koc"] = {
"Kpati",
3913279,
"nic-nge",
"Latn",
}
m["kod"] = {
"Kodi",
4577633,
"poz-cet",
"Latn",
}
m["koe"] = {
"Kacipo-Balesi",
5364424,
"sdv",
}
m["kof"] = {
"Kubi",
3438718,
"cdc-wst",
"Latn",
}
m["kog"] = {
"Cogui",
3198286,
"cba",
"Latn",
}
m["koh"] = {
"Koyo",
35649,
"bnt-mbo",
"Latn",
}
m["koi"] = {
"Komi-Permyak",
56318,
"kv",
"Cyrl",
translit = "kv-translit",
strip_diacritics = {remove_diacritics = c.acute},
override_translit = true,
}
m["kok"] = {
"กงกัณ",
34239,
"inc-sou",
"Deva, Knda, Mlym, fa-Arab, Latn",
translit = {
Deva = "Deva-translit",
},
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
strip_diacritics = {
-- FIXME: Separate out the scripts
from = {"च़", "ज़", "झ़", "ಚ಼", "ಜ಼", "ಝ಼"},
to = {"च", "ज", "झ", "ಚ", "ಜ", "ಝ"}
},
}
m["kol"] = {
"Kol (New Guinea)",
4227542,
}
m["koo"] = {
"Konzo",
2361829,
"bnt-glb",
}
m["kop"] = {
"Waube",
11732373,
"ngf-nur",
"Latn",
}
m["koq"] = {
"Kota (Gabon)",
35607,
"bnt-kel",
"Latn",
}
m["kos"] = {
"Kosraean",
33464,
"poz-mic",
"Latn",
}
m["kot"] = {
"Lagwan",
3502264,
"cdc-cbm",
"Latn",
}
m["kou"] = {
"Koke",
797249,
"alv-bua",
}
m["kov"] = {
"Kudu-Camo",
3915850,
"nic-jer",
}
m["kow"] = {
"Kugama",
3913307,
"alv-mye",
}
m["koy"] = {
"Koyukon",
28304,
"ath-nor",
"Latn",
}
m["koz"] = {
"Korak",
6431365,
"ngf-kow",
"Latn",
}
m["kpa"] = {
"Kutto",
3437656,
"cdc-wst",
}
m["kpb"] = {
"Mullu Kurumba",
19573111,
"dra-mal",
}
m["kpc"] = {
"Curripaco",
2882543,
"awd-nwk",
"Latn",
}
m["kpd"] = {
"Koba",
6424249,
"poz",
}
m["kpe"] = {
"Kpelle",
35673,
"dmn-msw",
"Latn, Kpel",
}
m["kpf"] = {
"Komba",
6428239,
"ngf-huo",
"Latn",
}
m["kpg"] = {
"Kapingamarangi",
35771,
"poz-pnp",
"Latn",
}
m["kph"] = {
"Kplang",
35628,
"alv-gng",
}
m["kpi"] = {
"Kofei",
6425665,
"paa-egb",
"Latn",
}
m["kpj"] = {
"Karajá",
10322066,
"sai-mje",
"Latn",
}
m["kpk"] = {
"Kpan",
3915380,
"nic-jkn",
"Latn",
}
m["kpl"] = {
"Kpala",
11154769,
"nic-nkk",
"Latn",
}
m["kpm"] = {
"เกอฮอ",
3511919,
"mkh-ban",
"Latn",
}
m["kpn"] = {
"Kepkiriwát",
3195366,
"tup",
"Latn",
}
m["kpo"] = {
"Ikposo",
35029,
"alv-ktg",
"Latn",
}
m["kpq"] = {
"Korupun-Sela",
6432769,
"ngf-mek",
}
m["kpr"] = {
"Korafe-Yegha",
11732347,
"paa-bin",
"Latn",
}
m["kps"] = {
"Tehit",
7694851,
"paa-wbh",
"Latn",
}
m["kpt"] = {
"Karata",
56636,
"cau-and",
"Cyrl",
translit = "kpt-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["kpu"] = {
"Kafoa",
6346151,
"paa-tap",
"Latn",
}
m["kpv"] = {
"Komi-Zyrian",
34114,
"kv",
"Cyrl",
translit = "kv-translit",
override_translit = true,
wikimedia_codes = "kv",
}
m["kpw"] = {
"Kobon",
11732326,
"ngf-kak",
"Latn",
}
m["kpx"] = {
"Mountain Koiari",
6925030,
"ngf-koi",
"Latn",
}
m["kpy"] = {
"Koryak",
36199,
"qfa-ckn",
"Cyrl",
strip_diacritics = {
from = {"['’]"},
to = {"ʼ"}
},
sort_key = {
from = {"вʼ", "гʼ", "ё", "ӄ", "ӈ"},
to = {"в" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1]}
},
translit = "kpy-translit",
}
m["kpz"] = {
"Kupsabiny",
56445,
"sdv-kln",
}
m["kqa"] = {
"Mum",
6935252,
"ngf-nso",
"Latn",
}
m["kqb"] = {
"Kovai",
6434822,
"ngf-huo",
"Latn",
}
m["kqc"] = {
"Doromu-Koki",
5298175,
"paa-man",
"Latn",
}
m["kqd"] = {
"Koy Sanjaq Surat",
33463,
"sem-nna",
}
m["kqe"] = {
"กาลากัน",
18748906,
"phi",
"Latn",
}
m["kqf"] = {
"Kakabai",
6349119,
"poz-ocw",
"Latn",
}
m["kqg"] = {
"Khe",
3914015,
"nic-gur",
}
m["kqh"] = {
"Kisankasa",
6416409,
"sdv",
}
m["kqi"] = {
"Koitabu",
6426363,
"ngf-koi",
"Latn",
}
m["kqj"] = {
"Koromira",
6432520,
"paa-sbo",
"Latn",
}
m["kqk"] = {
"Kotafon Gbe",
12952447,
"alv-pph",
}
m["kql"] = {
"Kyenele",
11732453,
"paa-yua",
"Latn",
}
m["kqm"] = {
"Khisa",
3913955,
"nic-gur",
}
m["kqn"] = {
"Kaonde",
33601,
"bnt-lub",
"Latn",
}
m["kqo"] = {
"Eastern Krahn",
3915374,
"kro-wee",
}
m["kqp"] = {
"Kimré",
3441210,
"cdc-est",
}
m["kqq"] = {
"Krenak",
6436747,
"sai-cer",
}
m["kqr"] = {
"Kimaragang",
3196845,
"poz-san",
"Latn",
}
m["kqs"] = {
"Northern Kissi",
19921576,
"alv-kis",
}
m["kqt"] = {
"Klias River Kadazan",
12953594,
"poz-san",
}
m["kqu"] = {
"Seroa",
33127766,
"khi-tuu",
}
m["kqv"] = {
"Okolod",
7082487,
"poz-san",
}
m["kqw"] = {
"Kandas",
3192590,
"poz-ocw",
"Latn",
}
m["kqx"] = {
"Mser",
3502347,
"cdc-cbm",
}
m["kqy"] = {
"Koorete",
6430753,
"omv-eom",
"Ethi, Latn",
}
m["kqz"] = {
"Korana",
2756709,
"khi-khk",
"Latn",
}
m["kra"] = {
"Kumhali",
13580783,
"inc-eas",
ancestors = "bh",
}
m["krb"] = {
"Karkin",
3193345,
"nai-utn",
"Latn",
}
m["krc"] = {
"Karachay-Balkar",
33714,
"trk-kcu",
"Cyrl",
translit = "krc-translit",
sort_key = {
from = {"гъ", "дж", "ё", "къ", "нг"},
to = {"г" .. p[1], "д" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1]}
},
}
m["krd"] = {
"Kairui-Midiki",
12953277,
"poz-tim",
}
m["kre"] = {
"Panará",
3361895,
"sai-cer",
"Latn",
}
m["krf"] = {
"Koro (Vanuatu)",
3198995,
"poz-vnn",
"Latn",
}
m["krh"] = {
"Kurama",
35593,
"nic-kau",
}
m["kri"] = {
"Krio",
35744,
"crp",
"Latn",
ancestors = "en",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ},
sort_key = {
from = {"ɛ", "gb", "kp", "ɔ"},
to = {"e" .. p[1], "g" .. p[1], "k" .. p[1], "o" .. p[1]}
},
}
m["krj"] = {
"Kinaray-a",
33720,
"phi",
"Latn",
}
m["krk"] = {
"Kerek",
332792,
"qfa-ckn",
"Cyrl",
}
m["krl"] = {
"คาเรเลีย",
33557,
"urj-fin",
"Latn",
sort_key = {
from = {
"č", "š", "ž", "ü", "ä", "ö", -- 2 chars
"z", "'" -- 1 char
},
to = {
"c" .. p[1], "s" .. p[1], "s" .. p[3], "y" .. p[1], "y" .. p[2], "y" .. p[3],
"s" .. p[2], "y" .. p[4],
}
},
}
m["krm"] = {
"Krim",
35713,
"alv",
}
m["krn"] = {
"Sapo",
3915386,
"kro-wee",
}
m["krp"] = {
"Korop",
35626,
"nic-ucr",
"Latn",
}
m["krr"] = {
"Kru'ng",
12953650,
"mkh-ban",
}
m["krs"] = {
"Kresh",
56674,
"csu-bkr",
}
m["kru"] = {
"กุรุข",
33492,
"dra-kml",
"Deva, Tols",
translit = {
Deva = "Deva-translit",
},
}
m["krv"] = {
"Kavet",
12953649,
"sai-ktk",
"Latn",
}
m["krw"] = {
"Western Krahn",
10975611,
"kro-wee",
}
m["krx"] = {
"Karon",
35704,
"alv-jol",
}
m["kry"] = {
"Kryts",
35861,
"cau-ssm",
"Latn, Cyrl",
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {
Latn = s["cau-Latn-stripdiacritics"],
Cyrl = s["cau-Cyrl-stripdiacritics"],
},
}
m["krz"] = {
"Sota Kanum",
12952568,
"paa-yam",
"Latn",
}
m["ksa"] = {
"Shuwa-Zamani",
3913929,
"nic-kau",
}
m["ksb"] = {
"Shambala",
3788739,
"bnt-seu",
"Latn",
}
m["ksc"] = {
"Southern Kalinga",
18753301,
"phi",
}
m["ksd"] = {
"Tolai",
35870,
"poz-ocw",
"Latn",
}
m["kse"] = {
"Kuni",
6444619,
"poz-ocw",
"Latn",
}
m["ksf"] = {
"Bafia",
34930,
"bnt-baf",
"Latn",
}
m["ksg"] = {
"Kusaghe",
3200638,
"poz-ocw",
"Latn",
}
m["ksi"] = {
"Krisa",
841704,
"paa-msk",
"Latn",
}
m["ksj"] = {
"Uare",
6450052,
"paa-kwa",
"Latn",
}
m["ksk"] = {
"Kansa",
3192772,
"sio-dhe",
"Latn",
}
m["ksl"] = {
"Kumalu",
17584381,
"poz-ocw",
"Latn",
}
m["ksm"] = {
"Kumba",
3913972,
"alv-mye",
}
m["ksn"] = {
"Kasiguranin",
6374525,
"phi",
}
m["kso"] = {
"Kofa",
56278,
"cdc-cbm",
}
m["ksp"] = {
"Kaba",
3915316,
"csu-sar",
}
m["ksq"] = {
"Kwaami",
3440525,
"cdc-wst",
}
m["ksr"] = {
"Borong",
4946263,
"ngf-huo",
"Latn",
}
m["kss"] = {
"Southern Kissi",
11028974,
"alv-kis",
}
m["kst"] = {
"Winyé",
3913360,
"nic-gnw",
}
m["ksu"] = {
"Khamyang",
6583541,
"tai-swe",
}
m["ksv"] = {
"Kusu",
6448199,
"bnt-tet",
}
m["ksw"] = {
"กะเหรี่ยงสะกอ",
56410,
"kar",
"Mymr",
translit = "ksw-translit",
}
m["ksx"] = {
"Kedang",
6382520,
"poz",
"Latn",
}
m["ksy"] = {
"Kharia Thar",
6400661,
"inc-eas",
}
m["ksz"] = {
"Kodaku",
21179986,
"mun",
}
m["kta"] = {
"Katua",
6378404,
"mkh-ban",
}
m["ktb"] = {
"Kambaata",
35664,
"cus-hec",
"Latn",
}
m["ktc"] = {
"Kholok",
3440464,
"cdc-wst",
}
m["ktd"] = {
"Kokata",
10547021,
"aus-pam",
}
m["ktf"] = {
"Kwami",
12952687,
"bnt-lgb",
}
m["ktg"] = {
"Kalkatungu",
3914057,
"aus-pam",
"Latn",
}
m["kth"] = {
"Karanga",
713643,
}
m["kti"] = {
"North Muyu",
20857698,
"ngf-okk",
"Latn",
}
m["ktj"] = {
"Plapo Krumen",
10975356,
"kro-grb",
}
m["ktk"] = {
"Kaniet",
3399050,
"poz-aay",
"Latn",
}
m["ktl"] = {
"Koroshi",
3775265,
"ira-nwi",
ancestors = "bal",
}
m["ktm"] = {
"Kurti",
3200615,
"poz-aay",
"Latn",
}
m["ktn"] = {
"Karitiâna",
3112184,
"tup",
"Latn",
}
m["kto"] = {
"Kuot",
56537,
}
m["ktp"] = {
"Kaduo",
769809,
"tbq-bka",
}
m["ktq"] = {
"Katabaga",
3193895,
}
m["ktr"] = {
"Kota Marudu Tinagas",
18642280,
}
m["kts"] = {
"South Muyu",
42308820,
"ngf-okk",
"Latn",
}
m["ktt"] = {
"Ketum",
12952616,
"ngf-gaw",
"Latn",
}
m["ktu"] = {
"Kituba",
35746,
"crp",
"Latn",
ancestors = "kg",
}
m["ktv"] = {
"กะตูตะวันออก",
22808951,
"mkh-kat",
"Latn",
}
m["ktw"] = {
"Kato",
20831,
"ath-pco",
"Latn",
}
m["ktx"] = {
"Kaxararí",
6380124,
"sai-pan",
"Latn",
}
m["kty"] = {
"Kango",
6362818,
"bnt-bta",
"Latn",
}
m["ktz"] = {
"Juǀ'hoan",
1192295,
"khi-kxa",
"Latn",
}
m["kub"] = {
"Kutep",
35645,
"nic-jkn",
}
m["kuc"] = {
"Kwinsu",
6450460,
"paa-tkw",
}
m["kud"] = {
"Auhelawa",
5166,
"poz-ocw",
"Latn",
}
m["kue"] = {
"Kuman",
137525,
"ngf-chw",
"Latn",
}
m["kuf"] = {
"กะตูตะวันตก",
6378400,
"mkh-kat",
"Laoo, Tale, Latn",
}
m["kug"] = {
"Kupa",
3915336,
"alv-ngb",
}
m["kuh"] = {
"Kushi",
3438747,
"cdc-wst",
}
m["kui"] = {
"Kuikúro",
3915522,
"sai-kui",
"Latn",
}
m["kuj"] = {
"Kuria",
6445968,
"bnt-lok",
"Latn",
}
m["kuk"] = {
"Kepo'",
6393217,
"poz",
}
m["kul"] = {
"Kulere",
3440506,
"cdc-wst",
}
m["kum"] = {
"คูมุก",
36209,
"trk-kcu",
"Cyrl",
translit = "kum-translit",
sort_key = {
from = {"гъ", "гь", "ё", "къ", "нг", "оь", "уь"},
to = {"г" .. p[1], "г" .. p[2], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]}
},
}
m["kun"] = {
"Kunama",
36041,
}
m["kuo"] = {
"Kumukio",
11732362,
"ngf-huo",
"Latn",
}
m["kup"] = {
"Kunimaipa",
6444696,
"paa-kun",
"Latn",
}
m["kuq"] = {
"Karipuna",
6371071,
"tup-gua",
"Latn",
}
m["kus"] = {
"Kusaal",
35708,
"nic-dag",
"Latn",
}
m["kut"] = {
"Ktunaxa",
33434,
"qfa-iso",
"Latn",
}
m["kuu"] = {
"Upper Kuskokwim",
28062,
"ath-nor",
"Latn",
}
m["kuv"] = {
"Kur",
12635082,
"poz-cma",
"Latn",
}
m["kuw"] = {
"Kpagua",
11137573,
"bad-cnt",
}
m["kux"] = {
"Kukatja",
10549839,
"aus-pam",
}
m["kuy"] = {
"Kuuku-Ya'u",
10550697,
"aus-pmn",
}
m["kuz"] = {
"Kunza",
2669181,
"qfa-iso",
"Latn",
}
m["kva"] = {
"Bagvalal",
56638,
"cau-and",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["kvb"] = {
"Kubu",
6441341,
"poz-mly",
}
m["kvc"] = {
"Kove",
3199402,
"poz-ocw",
"Latn",
}
m["kvd"] = {
"Kui (Indonesia)",
6442230,
"paa-tap",
"Latn",
}
m["kve"] = {
"Kalabakan",
6350003,
"poz-san",
"Latn",
}
m["kvf"] = {
"Kabalai",
3440427,
"cdc-est",
}
m["kvg"] = {
"Kuni-Boazi",
2907551,
"paa-ani",
"Latn",
}
m["kvh"] = {
"Komodo",
3198565,
"poz-cet",
"Latn",
}
m["kvi"] = {
"Kwang",
3440398,
"cdc-est",
"Latn",
}
m["kvj"] = {
"Psikye",
56304,
"cdc-cbm",
}
m["kvk"] = {
"Korean Sign Language",
3073428,
"sgn-jsl",
}
m["kvl"] = {
"Brek Karen",
12952577,
"kar",
}
m["kvm"] = {
"Kendem",
35751,
"nic-mam",
"Latn",
}
m["kvn"] = {
"Border Kuna",
31777873,
"cba",
}
m["kvo"] = {
"Dobel",
5286559,
"poz",
"Latn",
}
m["kvp"] = {
"Kompane",
18343041,
"poz",
}
m["kvq"] = {
"Geba Karen",
12952581,
"kar",
"Latn, Mymr",
}
m["kvr"] = {
"Kerinci",
3195442,
"poz-mly",
"Latn, Arab", -- Also Incung, which we don't have
}
m["kvt"] = {
"Lahta Karen",
12952582,
"kar",
}
m["kvu"] = {
"Yinbaw Karen",
14426328,
"kar",
}
m["kvv"] = {
"Kola",
6426967,
"poz",
"Latn",
}
m["kvw"] = {
"Wersing",
7983599,
"paa-tap",
"Latn",
}
m["kvx"] = {
"Parkari Koli",
3244176,
"inc-wes",
}
m["kvy"] = {
"Yintale Karen",
14426329,
"kar",
}
m["kvz"] = {
"Tsakwambo",
7849438,
"ngf-gaw",
"Latn",
}
m["kwa"] = {
"Dâw",
3042278,
"sai-nad",
"Latn",
}
m["kwb"] = {
"Baa",
34842,
"alv-ada",
}
m["kwc"] = {
"Likwala",
35597,
"bnt-mbo",
}
m["kwd"] = {
"Kwaio",
3200796,
"poz-sls",
"Latn",
}
m["kwe"] = {
"Kwerba",
6450328,
"paa-tkw",
}
m["kwf"] = {
"Kwara'ae",
3200829,
"poz-sls",
"Latn",
}
m["kwg"] = {
"Sara Kaba Deme",
3915384,
"csu-kab",
}
m["kwh"] = {
"Kowiai",
6435028,
"poz",
"Latn",
}
m["kwi"] = {
"Awa-Cuaiquer",
2603103,
"sai-bar",
"Latn",
}
m["kwj"] = {
"Kwanga",
3438383,
"paa-spk",
"Latn",
}
m["kwk"] = {
"Kwak'wala",
2640628,
"wak",
"Latn",
}
m["kwl"] = {
"Kofyar",
3441382,
"cdc-wst",
"Latn",
}
m["kwm"] = {
"Kwambi",
3487165,
"bnt-ova",
}
m["kwn"] = {
"Kwangali",
36334,
"bnt-kav",
"Latn",
}
m["kwo"] = {
"Kwomtari",
3508116,
"paa-kwm",
"Latn",
}
m["kwp"] = {
"Kodia",
3914867,
"kro-ekr",
}
m["kwq"] = {
"Kwak",
11014183,
"nic-nka",
ancestors = "yam",
}
m["kwr"] = {
"Kwer",
12635137,
"ngf-okk",
"Latn",
}
m["kws"] = {
"Kwese",
3200846,
"bnt-pen",
}
m["kwt"] = {
"Kwesten",
6450354,
"paa-tkw",
}
m["kwu"] = {
"Kwakum",
35624,
"bnt-kak",
}
m["kwv"] = {
"Sara Kaba Náà",
3915361,
"csu-kab",
"Latn",
}
m["kww"] = {
"Kwinti",
721182,
"crp",
"Latn",
ancestors = "en",
}
m["kwx"] = {
"Khirwar",
12976968,
"dra",
}
m["kwz"] = {
"Kwadi",
2364661,
"khi-kkw",
"Latn",
}
m["kxa"] = {
"Kairiru",
3398785,
"poz-ocw",
"Latn",
}
m["kxb"] = {
"Krobu",
35586,
"alv-ptn",
"Latn",
}
m["kxc"] = {
"Konso",
56624,
"cus-eas",
"Ethi, Latn",
}
m["kxd"] = {
"มลายูแบบบรูไน",
3182878,
"poz-mly",
"Latn, ms-Arab",
}
m["kxe"] = {
"Kakihum",
3914433,
"nic-kam",
ancestors = "tvd",
}
m["kxf"] = {
"Manumanaw Karen",
12952592,
"kar",
"Mymr, Latn",
}
m["kxh"] = {
"Karo",
3447116,
"omv-aro",
}
m["kxi"] = {
"Keningau Murut",
6389308,
"poz-san",
"Latn",
}
m["kxj"] = {
"Kulfa",
713654,
"csu-kab",
}
m["kxk"] = {
"Zayein Karen",
14352960,
"kar",
}
m["kxl"] = {
"Nepali Kurux",
3200624,
"dra-kml",
"Deva",
ancestors = "kru",
translit = {
Deva = "Deva-translit",
},
}
m["kxm"] = {
"เขมรเหนือ",
3502234,
"mkh-kmr",
"Thai, Khmr",
ancestors = "xhm",
sort_key = {
from = {"[%pๆ]", "[็-๎]", "([เแโใไ])([ก-ฮ])"},
to = {"", "", "%2%1"}
},
}
m["kxn"] = {
"Kanowit",
6364300,
"poz-bnn",
"Latn",
}
m["kxo"] = {
"Kanoé",
4356223,
"qfa-iso",
"Latn",
}
m["kxp"] = {
"Wadiyara Koli",
12953645,
"inc-wes",
}
m["kxq"] = {
"Smärky Kanum",
12952569,
"paa-yam",
"Latn",
}
m["kxr"] = {
"Manus Koro",
3198994,
"poz-aay",
"Latn",
}
m["kxs"] = {
"Kangjia",
3182570,
"xgn-shr",
"Latn",
}
m["kxt"] = {
"Koiwat",
6426388,
"paa-ndu",
"Latn",
}
m["kxu"] = {
"Kui (India)",
33919,
"dra-kki",
"Orya",
translit = "Orya-translit",
strip_diacritics = {
remove_diacritics = "୕",
from = {"ଆଆ", "ଇଇ", "ଉଉ", "ଏଏ", "ଓଓ", "ିଇ", "ୁଉ", "େଏ", "ୋଓ"},
to = {"ଆ", "ଈ", "ଊ", "ଏ", "ଓ", "ୀ", "ୂ", "େ", "ୋ"},
},
}
m["kxv"] = {
"Kuvi",
3200721,
"dra-kki",
"Orya",
translit = "Orya-translit",
strip_diacritics = {
remove_diacritics = "୕",
from = {"ଆଆ", "ଇଇ", "ଉଉ", "ଏଏ", "ଓଓ", "([କ-ହ])ଆ", "ିଇ", "ୁଉ", "େଏ", "ୋଓ"},
to = {"ଆ", "ଈ", "ଊ", "ଏ", "ଓ", "%1ା", "ୀ", "ୂ", "େ", "ୋ"},
},
}
m["kxw"] = {
"Konai",
11732339,
"ngf-est",
"Latn",
}
m["kxx"] = {
"Likuba",
35646,
"bnt-bmo",
}
m["kxy"] = {
"Kayong",
6380673,
"mkh",
}
m["kxz"] = {
"Kerewo",
6393847,
"paa-kiw",
"Latn",
}
m["kya"] = {
"Kwaya",
6450276,
"bnt-haj",
"Latn",
}
m["kyb"] = {
"Butbut Kalinga",
18753300,
"phi",
"Latn",
}
m["kyc"] = {
"Kyaka",
12952690,
"ngf-eng",
"Latn",
}
m["kyd"] = {
"Karey",
6370196,
"poz",
}
m["kye"] = {
"Krache",
35658,
"alv-gng",
}
m["kyf"] = {
"Kouya",
35595,
"kro-bet",
}
m["kyg"] = {
"Keyagana",
6398208,
"ngf-kag",
"Latn",
}
m["kyh"] = {
"Karok",
1288440,
"qfa-iso", -- or Hokan?
"Latn",
}
m["kyi"] = {
"Kiput",
3038653,
"poz-swa",
"Latn",
}
m["kyj"] = {
"กาเรา",
3192950,
"phi",
"Latn",
}
m["kyk"] = {
"Kamayo",
3192339,
"phi",
"Latn",
}
m["kyl"] = {
"Kalapuya",
3192120,
"nai-klp",
}
m["kym"] = {
"Kpatili",
3913982,
"znd",
}
m["kyn"] = {
"Karolanos",
6373093,
"phi",
}
m["kyo"] = {
"Kelon",
6386414,
"paa-tap",
"Latn",
}
m["kyp"] = {
"Kang",
25559558,
"tai",
}
m["kyq"] = {
"Kenga",
35707,
"csu-bgr",
}
m["kyr"] = {
"Kuruáya",
3200633,
"tup",
"Latn",
}
m["kys"] = {
"Baram Kayan",
2883794,
"poz",
"Latn",
}
m["kyt"] = {
"Kayagar",
6380394,
"paa-kay",
"Latn",
}
m["kyu"] = {
"กะยาตะวันตก",
12952596,
"kar",
"Kali, Mymr, Latn",
translit = {Kali = "Kali-translit"},
}
m["kyv"] = {
"Kayort",
6380675,
"inc-krd",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["kyw"] = {
"Kudmali",
6446173,
"inc-bih",
"Deva, as-Beng, Orya, Chis",
translit = {
Deva = "Deva-translit",
["as-Beng"] = "Beng-translit",
Orya = "Orya-translit",
},
}
m["kyx"] = {
"Rapoisi",
7294279,
"paa-nbo",
"Latn",
}
m["kyy"] = {
"Kambaira",
6356254,
"ngf-kag",
"Latn",
}
m["kyz"] = {
"Kayabí",
6380372,
"tup-gua",
"Latn",
}
m["kza"] = {
"Western Karaboro",
36601,
"alv-krb",
}
m["kzb"] = {
"Kaibobo",
6347565,
"poz-cma",
}
m["kzc"] = {
"Bondoukou Kulango",
11031321,
"alv-kul",
"Latn",
}
m["kzd"] = {
"Kadai",
7679471,
"poz-cma",
"Latn",
}
--kze (Kosena) made an etym-only child of auy (Auyana) per [[Wiktionary:Language_treatment_requests#merge_Kosena_[kze]_into_Auyana_[auy]]]
m["kzf"] = {
"Da'a Kaili",
33103997,
"poz-kal",
"Latn",
}
m["kzg"] = {
"Kikai",
3196527,
"jpx-nry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["kzh"] = {
"Dongolawi",
5295991,
"nub",
"Latn",
}
m["kzi"] = {
"Kelabit",
6385445,
"poz-swa",
"Latn",
}
m["kzj"] = {
"กาดาซันชายฝั่ง",
3307195,
"poz-san",
"Latn",
}
m["kzk"] = {
"Kazukuru",
1089069,
"poz-ocw",
}
m["kzl"] = {
"Kayeli",
4207444,
"poz-cma",
"Latn",
}
m["kzm"] = {
"Kais",
6348319,
"ngf-sbh",
"Latn",
}
m["kzn"] = {
"Kokola",
11128329,
"bnt-mak",
"Latn",
ancestors = "vmw",
}
m["kzo"] = {
"Kaningi",
35683,
"bnt-mbt",
}
m["kzp"] = {
"Kaidipang",
6347611,
"phi",
"Latn",
}
m["kzq"] = {
"Kaike",
10951226,
"sit-tam",
}
m["kzr"] = {
"Karang",
35681,
"alv-mbm",
"Latn",
}
m["kzs"] = {
"Sugut Dusun",
12953510,
"poz-san",
"Latn",
}
m["kzt"] = {
"Tambunan Dusun",
12953514,
"poz-san",
"Latn",
}
m["kzu"] = {
"Kayupulau",
6380723,
"poz-ocw",
}
m["kzv"] = {
"Komyandaret",
6428671,
"ngf-gaw",
"Latn",
}
m["kzw"] = { -- contrast xoo, sai-kat, sai-xoc, the last of which the ISO conflated into this code
"Kariri",
12953620,
"sai-mje",
"Latn",
}
m["kzx"] = {
"Kamarian",
6356040,
"poz-cma",
"Latn",
}
m["kzy"] = {
"Kango-Sua",
11008360,
"bnt-kbi",
"Latn",
ancestors = "bip",
}
m["kzz"] = {
"Kalabra",
6350038,
"paa-wbh",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
nt7c1900gwwpyvpyc4ho1fh5pyoe3ut
มอดูล:languages/data/3/i
828
36378
5720759
5684157
2026-04-21T07:01:01Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720759
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["iai"] = {
"Iaai",
282888,
"poz-cln",
"Latn",
}
m["ian"] = {
"Iatmul",
5983460,
"paa-ndu",
"Latn",
}
m["iar"] = {
"Purari",
3499934,
"qfa-dis", -- Papuan; isolate in Glottolog; unclassified by Pawley and Hammarström; proposed to be in a putative Binanderean-Goilalan family by Usher
}
m["iba"] = {
"อีบัน",
33424,
"poz-mly",
"Latn",
}
m["ibb"] = {
"Ibibio",
33792,
"nic-ief",
"Latn",
}
m["ibd"] = {
"Iwaidja",
1977429,
"aus-wdj",
"Latn",
}
m["ibe"] = {
"Akpes",
35457,
"alv-von",
"Latn",
}
m["ibg"] = {
"Ibanag",
1775596,
"phi",
"Latn",
}
m["ibh"] = {
"Bih",
51955140,
"cmc",
"Latn",
}
m["ibl"] = {
"Ibaloi",
3147383,
"phi",
"Latn",
}
m["ibm"] = {
"Agoi",
34727,
"nic-ucr",
"Latn",
}
m["ibn"] = {
"Ibino",
3813281,
"nic-lcr",
"Latn",
}
m["ibr"] = {
"Ibuoro",
3813306,
"nic-ief",
"Latn",
}
m["ibu"] = {
"Ibu",
11732235,
"paa-nha",
"Latn",
}
m["iby"] = {
"Ibani",
11280479,
"ijo",
"Latn",
}
m["ica"] = {
"Ede Ica",
12952405,
"alv-ede",
"Latn",
}
m["ich"] = {
"Etkywan",
3914462,
"nic-jkn",
"Latn",
}
m["icl"] = {
"Icelandic Sign Language",
3436654,
"sgn",
"Latn", -- when documented
}
m["icr"] = {
"Islander Creole English",
2044587,
"crp",
"Latn",
ancestors = "en",
}
m["ida"] = {
"Idakho-Isukha-Tiriki",
12952512,
"bnt-lok",
"Latn",
}
m["idb"] = {
"Indo-Portuguese",
6025550,
"crp",
"Latn",
ancestors = "pt",
}
m["idc"] = {
"Idon",
3913366,
"nic-plc",
}
m["idd"] = {
"Ede Idaca",
13123376,
"alv-ede",
"Latn",
}
m["ide"] = {
"Idere",
3813288,
"nic-ief",
}
m["idi"] = {
"Idi",
5988630,
"paa-pht",
"Latn",
}
m["idr"] = {
"Indri",
35662,
"nic-ser",
}
m["ids"] = {
"Idesa",
3913979,
"alv-swd",
"Latn",
ancestors = "oke",
}
m["idt"] = {
"Idaté",
12952511,
"poz-tim",
"Latn",
}
m["idu"] = {
"Idoma",
35478,
"alv-ido",
"Latn",
}
m["ifa"] = {
"Amganad Ifugao",
18748222,
"phi",
"Latn",
}
m["ifb"] = {
"Batad Ifugao",
12953578,
"phi",
"Latn",
}
m["ife"] = {
"Ifè",
33606,
"alv-ede",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
sort_key = {
remove_diacritics = c.tilde,
from = {"ɖ", "dz", "ɛ", "gb", "kp", "ny", "ŋ", "ɔ", "ts"},
to = {"d" .. p[1], "d" .. p[2], "e" .. p[1], "g" .. p[1], "k" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "t" .. p[1]}
},
}
m["iff"] = {
"Ifo",
7902545,
"poz-vns",
"Latn",
}
m["ifk"] = {
"Tuwali Ifugao",
7857158,
"phi",
"Latn",
}
m["ifm"] = {
"Teke-Fuumu",
36603,
"bnt-tek",
}
m["ifu"] = {
"Mayoyao Ifugao",
12953579,
"phi",
"Latn",
}
m["ify"] = {
"Keley-I Kallahan",
3192221,
"phi",
"Latn",
}
m["igb"] = {
"Ebira",
35363,
"alv-nup",
"Latn",
}
m["ige"] = {
"Igede",
35420,
"alv-ido",
"Latn",
}
m["igg"] = {
"Igana",
5991454,
"paa-ram",
"Latn",
}
m["igl"] = {
"Igala",
35513,
"alv-yrd",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.dotabove .. c.caron .. c.lineabove},
sort_key = {
from = {
"ñm", "ñw", -- 3 chars
"ch", "ẹ", "gb", "gw", "kp", "kw", "ny", "ñ", "ọ" -- 2 chars
},
to = {
"n" .. p[3], "n" .. p[4],
"c" .. p[1], "e" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "k" .. p[2], "n" .. p[1], "n" .. p[2], "o" .. p[1]
}
},
}
m["igm"] = {
"Kanggape",
6362743,
"paa-ram",
"Latn",
}
m["ign"] = {
"Ignaciano",
3148190,
"awd",
"Latn",
}
m["igo"] = {
"Isebe",
11732248,
"ngf-gum",
"Latn",
}
m["igs"] = {
"Glosa",
1138529,
"art",
"Latn",
type = "appendix-constructed",
}
m["igw"] = {
"Igwe",
3913985,
"alv-yek",
"Latn",
}
m["ihb"] = {
"Pidgin Iha",
12639686,
"crp",
ancestors = "ihp",
}
m["ihi"] = {
"Ihievbe",
3441193,
"alv-eeo",
"Latn",
ancestors = "ema",
}
m["ihp"] = {
"Iha",
5994495,
"paa-mbi",
"Latn",
}
m["ijc"] = {
"Izon",
35483,
"ijo",
"Latn",
}
m["ije"] = {
"Biseni",
35010,
"ijo",
"Latn",
}
m["ijj"] = {
"Ede Ije",
12952406,
"alv-ede",
"Latn",
}
m["ijn"] = {
"Kalabari",
35697,
"ijo",
"Latn",
}
m["ijs"] = {
"Southeast Ijo",
3915854,
"ijo",
"Latn",
}
m["ike"] = {
"Eastern Canadian Inuktitut",
4126517,
"esx-inu",
"Cans, Latn",
}
m["iki"] = {
"Iko",
3813290,
"nic-lcr",
"Latn",
}
m["ikk"] = {
"Ika",
35406,
"alv-igb",
"Latn",
}
m["ikl"] = {
"Ikulu",
425973,
"nic-plc",
"Latn",
}
m["iko"] = {
"Olulumo-Ikom",
3914402,
"nic-uce",
"Latn",
}
m["ikp"] = {
"Ikpeshi",
3912777,
"alv-yek",
"Latn",
}
m["ikr"] = {
"Ikaranggal",
5995402,
"aus-pam",
}
m["iks"] = {
"Inuit Sign Language",
13360244,
"sgn",
"Latn", -- when documented
}
m["ikt"] = {
"Inuvialuktun",
27990,
"esx-inu",
"Cans, Latn",
}
m["ikv"] = {
"Iku-Gora-Ankwa",
3913940,
"nic-plc",
}
m["ikw"] = {
"Ikwere",
35399,
"alv-igb",
"Latn",
}
m["ikx"] = {
"Ik",
35472,
"ssa-klk",
"Latn",
}
m["ikz"] = {
"Ikizu",
10977626,
"bnt-lok",
"Latn",
}
m["ila"] = {
"Ile Ape",
12473380,
"poz-cet",
}
m["ilb"] = {
"Ila",
10962725,
"bnt-bot",
"Latn",
}
m["ilg"] = {
"Ilgar",
5997810,
"aus-wdj",
"Latn",
}
m["ili"] = {
"Ili Turki",
33627,
"trk-kar",
"Cyrl",
}
m["ilk"] = {
"Ilongot",
3148787,
"phi",
"Latn",
}
m["ill"] = {
"อีรานุน",
12953581,
"phi",
"Latn, Arab",
}
m["ilo"] = {
"อีโลกาโน",
35936,
"phi",
"Latn, Tglg",
translit = {
Tglg = "ilo-translit",
},
override_translit = true,
strip_diacritics = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer,
}
},
sort_key = {
Latn = "tl-sortkey",
},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. c.punc,
},
}
m["ils"] = {
"International Sign",
35754,
"sgn",
}
m["ilu"] = {
"Ili'uun",
12632888,
"poz-tim",
"Latn",
}
m["ilv"] = {
"Ilue",
3813301,
"nic-lcr",
"Latn",
}
m["ima"] = {
"Mala Malasar",
6740693,
"dra-tam",
}
m["imi"] = {
"Anamgura",
3501881,
"ngf-pom",
"Latn",
}
m["iml"] = {
"Miluk",
3314550,
"nai-coo",
"Latn",
}
m["imn"] = {
"Imonda",
6005721,
"paa-brd",
"Latn",
}
m["imo"] = {
"Imbongu",
12632895,
"ngf-chw",
"Latn",
}
m["imr"] = {
"Imroing",
6008394,
"poz-tim",
}
m["ims"] = {
"Marsian",
1265446,
"itc-sbl",
"Ital, Latn",
-- Ital translit in [[Module:scripts/data]]
display_text = {
Latn = s["itc-Latn-displaytext"]
},
strip_diacritics = {
Latn = s["itc-Latn-stripdiacritics"]
},
sort_key = {
Latn = s["itc-Latn-sortkey"]
},
}
m["imy"] = {
"Milyan",
3832946,
"ine-luw",
"Lyci",
}
m["inb"] = {
"Inga",
35491,
"qwe",
ancestors = "qwe-kch",
}
m["ing"] = {
"Deg Xinag",
27782,
"ath-nor",
"Latn",
}
m["inh"] = {
"อิงกุช",
33509,
"cau-vay",
"Cyrl, Latn, Arab",
translit = {
Cyrl = "cau-nec-translit",
Arab = "ar-translit",
},
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {"аь", "гӏ", "ё", "кх", "къ", "кӏ", "пӏ", "тӏ", "хь", "хӏ", "цӏ", "чӏ", "яь"},
to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "п" .. p[1], "т" .. p[1], "х" .. p[1], "х" .. p[2], "ц" .. p[1], "ч" .. p[1], "я" .. p[1]}
},
},
}
m["inj"] = {
"Jungle Inga",
16115012,
"qwe",
ancestors = "qwe-kch",
}
m["inl"] = {
"มืออินโดนีเซีย",
3915477,
"sgn",
"Latn", -- when documented
}
m["inm"] = {
"Minaean",
737784,
"sem-osa",
"Sarb",
-- Sarb translit in [[Module:scripts/data]]
}
m["inn"] = {
"Isinai",
6081098,
"phi",
"Latn",
}
m["ino"] = {
"Inoke-Yate",
6036531,
"ngf-kag",
"Latn",
}
m["inp"] = {
"Iñapari",
15338035,
"awd",
"Latn",
}
m["ins"] = {
"มืออินเดีย",
12953486,
"sgn",
}
m["int"] = {
"Intha",
6057507,
"tbq-brm",
ancestors = "obr",
}
m["inz"] = {
"Ineseño",
35443,
"nai-chu",
"Latn",
}
m["ior"] = {
"Inor",
35763,
"sem-eth",
"Ethi",
}
m["iou"] = {
"Tuma-Irumu",
7852460,
"ngf-fin",
"Latn",
}
m["iow"] = {
"Chiwere",
56737,
"sio-msv",
"Latn",
}
m["ipi"] = {
"Ipili",
6065141,
"ngf-eng",
"Latn",
}
m["ipo"] = {
"Ipiko",
10566515,
"paa-ani",
"Latn",
}
m["iqu"] = {
"Iquito",
2669184,
"sai-zap",
"Latn",
}
m["iqw"] = {
"Ikwo",
11926474,
"alv-igb",
"Latn",
ancestors = "izi",
}
m["ire"] = {
"Iresim",
6069398,
"poz-hce",
"Latn",
}
m["irh"] = {
"Irarutu",
3027928,
"poz-cet",
"Latn",
}
m["iri"] = {
"Rigwe",
3912756,
"nic-plc",
"Latn",
}
m["irk"] = {
"Iraqw",
33595,
"cus-sou",
"Latn",
}
m["irn"] = {
"Irantxe",
3409301,
nil,
"Latn",
}
m["irr"] = {
"Ir",
3071880,
"mkh-kat",
}
m["iru"] = {
"Irula",
33363,
"dra-imd",
"Taml",
translit = "Taml-translit",
}
m["irx"] = {
-- Wikipedia and Glottolog say that North and South Kamberau are different languages but ISO 639-3 has not (yet?)
-- split them.
"Kamberau",
6356317,
"ngf-ask",
"Latn",
}
m["iry"] = {
"Iraya",
6068356,
"phi",
"Latn",
}
m["isa"] = {
"Isabi",
11732247,
"ngf-kag",
"Latn",
}
m["isc"] = {
"Isconahua",
3052971,
"sai-pan",
"Latn",
}
m["isd"] = {
"Isnag",
6085162,
"phi",
"Latn",
}
m["ise"] = {
"Italian Sign Language",
375619,
"sgn",
"Latn", -- when documented
}
m["isg"] = {
"Irish Sign Language",
14183,
"sgn",
"Latn", -- when documented
}
m["ish"] = {
"Esan",
35268,
"alv-eeo",
"Latn",
}
m["isi"] = {
"Nkem-Nkum",
36261,
"nic-eko",
"Latn",
}
m["isk"] = {
"Ishkashimi",
33419,
"ira-sgi",
"Cyrl, Arab",
}
m["ism"] = {
"Masimasi",
6783273,
"poz-ocw",
"Latn",
}
m["isn"] = {
"Isanzu",
6078891,
"bnt-tkm",
"Latn",
}
m["iso"] = {
"Isoko",
35414,
"alv-swd",
"Latn",
}
m["isr"] = {
"Israeli Sign Language",
2911863,
"sgn",
"Sgnw",
}
m["ist"] = {
"อิสเตรีย",
35845,
"roa-dal",
"Latn",
}
m["isu"] = {
"Isu",
6089423,
"nic-rnw",
"Latn",
}
m["isv"] = {
"Interslavic",
148971,
"art",
"Latn, Cyrl",
type = "appendix-constructed",
ancestors = "sla-pro",
}
m["itb"] = {
"Binongan Itneg",
12953584,
"phi",
"Latn",
}
m["itd"] = {
"Southern Tidung",
63214959,
"poz-san",
"Latn",
}
m["ite"] = {
"Itene",
3038640,
"sai-cpc",
"Latn",
}
m["iti"] = {
"Inlaod Itneg",
12953585,
"phi",
}
m["itk"] = {
"Judeo-Italian",
1145414,
"roa-itd",
"Hebr, Latn",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["itl"] = {
"Itelmen",
33624,
"qfa-cka",
"Cyrl, Latn",
strip_diacritics = {
Cyrl = {
from = {"['’]", "[ӅԮ]", "[ӆԯ]", "Ҳ", "ҳ"},
to = {"ʼ", "Ԓ", "ԓ", "Ӽ", "ӽ"}
},
},
sort_key = {
Cyrl = {
from = {
"ӑ", "ё", "кʼ", "ӄʼ", "о̆", "пʼ", "тʼ", "ў", "чʼ", -- 2 chars
"ӄ", "љ", "ԓ", "њ", "ӈ", "ӽ", "ә" -- 1 char
},
to = {
"а" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[3], "о" .. p[1], "п" .. p[1], "т" .. p[1], "у" .. p[1], "ч" .. p[1],
"к" .. p[2], "л" .. p[1], "л" .. p[2], "н" .. p[1], "н" .. p[2], "х" .. p[1], "ь" .. p[1]
}
},
},
}
m["itm"] = {
"Itu Mbon Uzo",
10977737,
"nic-ief",
"Latn",
ancestors = "ibr",
}
m["ito"] = {
"Itonama",
950585,
"qfa-iso",
}
m["itr"] = {
"Iteri",
2083185,
"paa-lem",
"Latn",
}
m["its"] = {
"Itsekiri",
36045,
"alv-edk",
"Latn",
strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. c.macron}},
sort_key = {
remove_diacritics = c.tilde,
from = {"ẹ", "gb", "gh", "kp", "ọ", "ts", "ṣ"},
to = {"e" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "o" .. p[1], "t" .. p[1], "t" .. p[1]}
},
}
m["itt"] = {
"Maeng Itneg",
18748761,
"phi",
}
m["itv"] = {
"Itawit",
3915527,
"phi",
"Latn",
}
m["itw"] = {
"Ito",
11128810,
"nic-ief",
ancestors = "ibr",
}
m["itx"] = {
"Itik",
6094713,
"paa-tkw",
}
m["ity"] = {
"Moyadan Itneg",
12953583,
"phi",
}
m["itz"] = {
"Itza'",
35537,
"myn",
"Latn",
}
m["ium"] = {
"เมี่ยน",
2498808,
"hmx-mie",
"Latn",
}
m["ivb"] = {
"Ibatan",
18748212,
"phi",
"Latn",
}
m["ivv"] = {
"Ivatan",
3547080,
"phi",
"Latn",
}
m["iwk"] = {
"I-Wak",
12632789,
"phi",
}
m["iwm"] = {
"Iwam",
3915215,
"paa-iwm",
"Latn",
}
m["iwo"] = {
"Iwur",
6101006,
"ngf-okk",
"Latn",
}
m["iws"] = {
"Sepik Iwam",
16893603,
"paa-iwm",
"Latn",
}
m["ixc"] = {
"Ixcatec",
56706,
"omq",
}
m["ixl"] = {
"Ixil",
35528,
"myn",
"Latn",
}
m["iya"] = {
"Iyayu",
3913390,
"alv-nwd",
"Latn",
}
m["iyo"] = {
"Mesaka",
36080,
"nic-tiv",
"Latn",
}
m["iyx"] = {
"Yaa",
36909,
"bnt-nze",
"Latn",
}
m["izh"] = {
"อิงเกรีย",
33559,
"urj-fin",
"Latn",
sort_key = {
from = {
"š", "ž",
},
to = {
"s" .. p[1], "z" .. p[1],
}
},
}
m["izi"] = {
"Izi-Ezaa-Ikwo-Mgbo",
11927027,
"alv-igb",
}
m["izr"] = {
"Izere",
6101921,
"nic-plc",
"Latn",
}
m["izz"] = {
"Izi",
3914387,
"alv-igb",
"Latn",
ancestors = "izi",
}
return require("Module:languages").finalizeData(m, "language")
j9de089eppkcf5v3dfhi0hp4mj41cy6
มอดูล:languages/data/3/h
828
36379
5720758
5684156
2026-04-21T07:00:59Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720758
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["haa"] = {
"Hän",
28272,
"ath-nor",
"Latn",
}
m["hab"] = {
"Hanoi Sign Language",
12632107,
"sgn",
"Latn", -- when documented
}
m["hac"] = {
"Gurani",
33733,
"ira-zgr",
"ku-Arab",
translit = "ckb-translit",
}
m["had"] = {
"Hatam",
56825,
"qfa-iso", -- Would form paa-ham with [[w:Mansim language]] if that language were added
"Latn",
}
m["haf"] = {
"Haiphong Sign Language",
39868240,
"sgn",
}
m["hag"] = {
"Hanga",
35426,
"nic-dag",
"Latn",
}
m["hah"] = {
"Hahon",
3125730,
"poz-ocw",
"Latn",
}
m["hai"] = {
"Haida",
33303,
"qfa-iso",
"Latn",
}
m["haj"] = {
"ฮาชอง",
3350576,
"qfa-mix",
"as-Beng, Latn",
ancestors = "tbq-pro, inc-oas, inc-obn",
}
m["hak"] = {
"แคะ",
33375,
"zhx",
"Hants",
ancestors = "ltc",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["hal"] = {
"Halang",
56307,
"mkh",
"Latn",
}
m["ham"] = {
"Hewa",
5748345,
"paa-spk",
"Latn",
}
m["hao"] = {
"Hakö",
3125871,
"poz-ocw",
"Latn",
}
m["hap"] = {
"Hupla",
5946223,
"ngf-dan",
"Latn",
}
m["har"] = {
"Harari",
33626,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["has"] = {
"Haisla",
3107399,
"wak",
"Latn",
}
m["hav"] = {
"Havu",
5684097,
"bnt-shh",
"Latn",
}
m["haw"] = {
"ฮาวาย",
33569,
"poz-pep",
"Latn",
display_text = {
from = {"‘"},
to = {"ʻ"}
},
sort_key = {remove_diacritics = c.macron},
standard_chars = "AaĀāEeĒēIiĪīOoŌōUuŪūHhKkLlMmNnPpWwʻ" .. c.punc,
}
m["hax"] = {
"Southern Haida",
12953543,
"qfa-iso",
"Latn",
ancestors = "hai",
}
m["hay"] = {
"Haya",
35756,
"bnt-haj",
"Latn",
}
m["hba"] = {
"Hamba",
11028905,
"bnt-tet",
"Latn",
}
m["hbb"] = {
"Huba",
56290,
"cdc-cbm",
"Latn",
}
m["hbn"] = {
"Heiban",
35523,
"alv-hei",
}
m["hbu"] = {
"Habu",
1567033,
"poz-cet",
"Latn",
}
m["hca"] = {
"Andaman Creole Hindi",
7599417,
"crp",
ancestors = "hi, bn, ta",
}
m["hch"] = {
"Huichol",
35575,
"azc",
"Latn",
}
m["hdn"] = {
"Northern Haida",
20054484,
"qfa-iso",
"Latn",
ancestors = "hai",
}
m["hds"] = {
"Honduras Sign Language",
3915496,
"sgn",
"Latn", -- when documented
}
m["hdy"] = {
"Hadiyya",
56613,
"cus-hec",
"Latn, Ethi",
}
m["hea"] = {
"Northern Qiandong Miao",
3138832,
"hmn",
"Latn, Bopo",
}
m["hed"] = {
"Herdé",
56253,
"cdc-mas",
"Latn",
}
m["heg"] = {
"Helong",
35432,
"poz-tim",
"Latn",
}
m["heh"] = {
"Hehe",
3129390,
"bnt-bki",
"Latn",
}
m["hei"] = {
"Heiltsuk",
5699507,
"wak",
"Latn",
}
m["hem"] = {
"Hemba",
5711209,
"bnt-lbn",
}
m["hgm"] = {
"Haiǁom",
4494781,
"khi-khk",
"Latn",
}
m["hgw"] = {
"Haigwai",
5639108,
"poz-ocw",
"Latn",
}
m["hhi"] = {
"Hoia Hoia",
5877767,
"paa-ani",
}
m["hhr"] = {
"Kerak",
11010783,
"alv-jfe",
}
m["hhy"] = {
"Hoyahoya",
15633149,
"paa-ani",
"Latn",
}
m["hia"] = {
"Lamang",
35700,
"cdc-cbm",
"Latn",
}
m["hib"] = {
"Hibito",
3135164,
"qfa-unc", -- poorly attested; possibly in a Hibito-Cholon or Cholonan family
}
m["hid"] = {
"Hidatsa",
3135234,
"sio-mor",
"Latn",
}
m["hif"] = {
"ฮินดีแบบฟีจี",
46728,
"inc-hie",
"Latn",
ancestors = "awa",
}
m["hig"] = {
"Kamwe",
56271,
"cdc-cbm",
}
m["hih"] = {
"Pamosu",
12953011,
"ngf-tib",
"Latn",
}
m["hii"] = {
"Hinduri",
5766763,
"him",
"Deva, Takr",
}
m["hij"] = {
"Hijuk",
35274,
"bnt-bsa",
}
m["hik"] = {
"Seit-Kaitetu",
7446989,
"poz-cma",
}
m["hil"] = {
"ฮีลีไกโนน",
35978,
"phi",
"Latn",
strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {
Latn = "tl-sortkey"
},
}
m["hio"] = {
"Tshwa",
963636,
"khi-kal",
}
m["hir"] = {
"Himarimã",
5765127,
"qfa-unc", -- language of uncontacted group; word list lost; believed Arawan
}
m["hit"] = {
"Hittite",
35668,
"ine-ana",
"Xsux, Latn",
}
m["hiw"] = {
"Hiw",
3138713,
"poz-vnn",
"Latn",
}
m["hix"] = {
"Hixkaryana",
56522,
"sai-prk",
"Latn",
}
m["hji"] = {
"Haji",
5639933,
"poz-mly",
"Latn",
}
m["hka"] = {
"Kahe",
3892562,
"bnt-chg",
"Latn",
}
m["hke"] = {
"Hunde",
3065432,
"bnt-shh",
"Latn",
}
m["hkh"] = {
"Pogali",
105198619,
"inc-kas",
}
m["hkk"] = {
"Hunjara-Kaina Ke",
63213931,
"paa-bin",
"Latn",
}
m["hkn"] = {
"Mel-Khaonh",
19059577,
"mkh-ban",
}
m["hks"] = {
"Hong Kong Sign Language",
17038844,
"sgn",
}
m["hla"] = {
"Halia",
3125959,
"poz-ocw",
"Latn",
}
m["hlb"] = {
"Halbi",
3695692,
"inc-hal",
"Deva, Orya",
translit = {
Deva = "Deva-translit",
Orya = "Orya-translit",
},
}
m["hld"] = {
"Halang Doan",
3914632,
"mkh-ban",
}
m["hle"] = {
"Hlersu",
5873537,
"tbq-llo",
}
m["hlt"] = {
"Nga La",
12952942,
"tbq-kuk",
"Latn",
}
m["hma"] = {
"Southern Mashan Hmong",
12953560,
"hmn",
"Latn",
}
m["hmb"] = {
"Humburi Senni",
35486,
"son",
"Latn, Arab",
}
m["hmc"] = {
"Central Huishui Hmong",
12953558,
"hmn",
}
m["hmd"] = {
"A-Hmao",
1108934,
"hmn",
"Latn, Plrd",
}
m["hme"] = {
"Eastern Huishui Hmong",
12953559,
"hmn",
}
m["hmf"] = {
"Hmong Don",
22911602,
"hmn",
}
m["hmg"] = {
"Southwestern Guiyang Hmong",
27478542,
"hmn",
}
m["hmh"] = {
"Southwestern Huishui Hmong",
12953565,
"hmn",
}
m["hmi"] = {
"Northern Huishui Hmong",
27434946,
"hmn",
}
m["hmj"] = {
"Ge",
11251864,
"hmn",
"Bopo",
}
m["hmk"] = {
"Yemaek",
8050724,
"qfa-kor",
"Hani",
sort_key = "Hani-sortkey",
}
m["hml"] = {
"Luopohe Hmong",
14468943,
"hmn",
}
m["hmm"] = {
"Central Mashan Hmong",
12953561,
"hmn",
}
m["hmp"] = {
"Northern Mashan Hmong",
12953564,
"hmn",
}
m["hmq"] = {
"Eastern Qiandong Miao",
27431369,
"hmn",
}
m["hmr"] = {
"Hmar",
2992841,
"tbq-kuk",
"Latn",
ancestors = "lus",
}
m["hms"] = {
"Southern Qiandong Miao",
12953562,
"hmn",
}
m["hmt"] = {
"Hamtai",
5646436,
"ngf-ang",
"Latn",
}
m["hmu"] = {
"Hamap",
12952484,
"paa-tap",
"Latn",
}
m["hmv"] = {
"Hmong Dô",
22911598,
"hmn",
"Latn", -- Probably also Hmng
}
m["hmw"] = {
"Western Mashan Hmong",
12953563,
"hmn",
}
m["hmy"] = {
"Southern Guiyang Hmong",
12953553,
"hmn",
}
m["hmz"] = {
"Hmong Shua",
25559603,
"hmn",
}
m["hna"] = {
"Mina",
56532,
"cdc-cbm",
}
m["hnd"] = {
"Southern Hindko",
382273,
"inc-pan",
"pa-Arab",
ancestors = "lah",
}
m["hne"] = {
"Chhattisgarhi",
33158,
"inc-hie",
"Deva",
ancestors = "inc-oaw",
translit = {
Deva = "Deva-translit",
},
}
m["hnh"] = {
"ǁAni",
3832982,
"khi-kal",
"Latn",
}
m["hni"] = {
"Hani",
56516,
"tbq-han",
"Latn",
}
m["hnj"] = {
"ม้งเขียว",
3138831,
"hmn",
"Latn, Hmng, Hmnp",
}
m["hnm"] = {
"ไหหลำ",
934541,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["hnn"] = {
"ฮานูโนโอ",
35435,
"phi",
"Hano, Latn",
translit = {Hano = "hnn-translit"},
override_translit = true,
strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {
Latn = "tl-sortkey",
},
}
m["hno"] = {
"Northern Hindko",
6346358,
"inc-pan",
"Arab",
ancestors = "lah",
}
m["hns"] = {
"ฮินดูสตานีแบบแคริบเบียน",
1843468,
"inc", -- "crp"?
"Arab, Deva, Kthi, Latn",
ancestors = "bho, awa",
}
m["hnu"] = {
"Hung",
12632753,
"mkh-vie",
}
m["hoa"] = {
"Hoava",
3138887,
"poz-ocw",
"Latn",
}
m["hob"] = {
"Austronesian Mari",
6760941,
"poz-ocw",
"Latn",
}
m["hoc"] = {
"Ho",
33270,
"mun",
"Wara, Orya, Deva, Beng, Latn",
translit = {
Orya = "Orya-translit",
Deva = "Deva-translit",
Beng = "Beng-translit",
},
}
m["hod"] = {
"Holma",
56331,
"cdc-cbm",
"Latn",
}
m["hoe"] = {
"Horom",
3914008,
"nic-ple",
"Latn",
}
m["hoh"] = {
"Hobyót",
33299,
"sem-sar",
"Arab, Latn",
}
m["hoi"] = {
"Holikachuk",
28508,
"ath-nor",
"Latn",
}
m["hoj"] = {
"Hadoti",
33227,
"raj",
"Deva",
translit = "Deva-translit",
}
m["hol"] = {
"Holu",
4121133,
"bnt-pen",
"Latn",
}
m["hom"] = {
"Homa",
3449953,
"bnt-boa",
"Latn",
}
m["hoo"] = {
"Holoholo",
3139484,
"bnt-tkm",
"Latn",
}
m["hop"] = {
"Hopi",
56421,
"azc",
"Latn",
}
m["hor"] = {
"Horo",
641748,
"csu-sar",
}
m["hos"] = {
"Ho Chi Minh City Sign Language",
16111971,
"sgn",
"Latn", -- when documented
}
m["hot"] = {
"Hote",
12632404,
"poz-ocw",
"Latn",
}
m["hov"] = {
"Hovongan",
5917269,
"poz",
"Latn",
}
m["how"] = {
"Honi",
56842,
"tbq-han",
}
m["hoy"] = {
"Holiya",
5880707,
"dra-kan",
}
m["hoz"] = {
"Hozo",
5923010,
"omv-mao",
}
m["hpo"] = {
"Hpon",
5923277,
"tbq-brm",
"Latn",
}
m["hps"] = {
"Hawai'i Pidgin Sign Language",
33358,
"sgn",
"Latn", -- when documented
}
m["hra"] = {
"Hrangkhol",
5923435,
"tbq-kuk",
"Latn",
}
m["hrc"] = {
"Niwer Mil",
30323994,
"poz-oce",
"Latn",
}
m["hre"] = {
"Hrê",
3915794,
"mkh-nbn",
"Latn",
}
m["hrk"] = {
"Haruku",
5675762,
"poz-cma",
}
m["hrm"] = {
"Horned Miao",
63213949,
"hmn",
}
m["hro"] = {
"Haroi",
3127568,
"cmc",
"Latn",
}
m["hrp"] = {
"Nhirrpi",
32571318,
"aus-kar",
}
m["hrt"] = {
"Hértevin",
33290,
"sem-nna",
"Latn",
}
m["hru"] = {
"Hruso",
5923933,
"sit-hrs",
"Latn",
}
m["hrw"] = {
"Warwar Feni",
56704265,
"poz-oce",
"Latn",
}
m["hrx"] = {
"ฮุนสริก",
304049,
"gmw-hgm",
"Latn",
ancestors = "gmw-cfr",
}
m["hrz"] = {
"Harzani",
56464,
"xme-ttc",
"fa-Arab, Latn",
ancestors = "xme-ttc-nor",
}
m["hsb"] = {
"ซอร์บตอนบน",
13248,
"wen",
"Latn",
sort_key = s["wen-sortkey"],
standard_chars = "AaBbCcČčĆćDdEeĚěFfGgHhIiJjKkŁłLlMmNnŃńOoÓóPpRrŘřSsŠšTtUuWwYyZzŽžŹź" .. c.punc,
}
m["hsh"] = {
"Hungarian Sign Language",
13636869,
"sgn",
"Latn", -- when documented
}
m["hsl"] = {
"Hausa Sign Language",
3915462,
"sgn",
"Latn", -- when documented
}
m["hsn"] = {
"เซียง",
13220,
"zhx",
"Hants",
ancestors = "ltc",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["hss"] = {
"Harsusi",
33423,
"sem-sar",
"Arab, Latn",
}
m["hti"] = {
"Hoti",
5912372,
"poz-cma",
"Latn",
}
m["hto"] = {
"Minica Huitoto",
948514,
"sai-wit",
"Latn",
}
m["hts"] = {
"Hadza",
33411,
"qfa-iso",
"Latn",
}
m["htu"] = {
"Hitu",
5872700,
"poz-cma",
"Latn",
}
m["hub"] = {
"Huambisa",
1526037,
"sai-jiv",
"Latn",
}
m["huc"] = {
"ǂHoan",
2053913,
"khi-kxa",
"Latn",
}
m["hud"] = {
"Huaulu",
12952504,
"poz-cma",
"Latn",
}
m["huf"] = {
"Humene",
11732231,
"paa-kwa",
"Latn",
}
m["hug"] = {
"Huachipaeri",
3446617,
"sai-har",
"Latn",
}
m["huh"] = {
"Huilliche",
35531,
"sai-ara",
"Latn",
}
m["hui"] = {
"Huli",
3125121,
"ngf-eng",
"Latn",
}
m["huj"] = {
"Northern Guiyang Hmong",
12953554,
"hmn",
}
m["huk"] = {
"Hulung",
12952505,
"poz-cet",
}
m["hul"] = {
"Hula",
6382179,
"poz-ocw",
"Latn",
}
m["hum"] = {
"Hungana",
10975396,
"bnt-yak",
}
m["huo"] = {
"Hu",
3141783,
"mkh-pal",
}
m["hup"] = {
"Hupa",
28058,
"ath-pco",
"Latn",
}
m["huq"] = {
"Tsat",
34133,
"cmc",
}
m["hur"] = {
"Halkomelem",
35388,
"sal",
"Latn",
}
m["hus"] = {
"Wastek",
35573,
"myn",
"Latn",
}
m["huu"] = {
"Murui Huitoto",
2640935,
"sai-wit",
"Latn",
}
m["huv"] = {
"Huave",
12954031,
"qfa-iso",
"Latn",
}
m["huw"] = {
"Hukumina",
3142988,
"poz-cma",
"Latn",
}
m["hux"] = {
"Nüpode Huitoto",
56333,
"sai-wit",
"Latn",
}
m["huy"] = {
"Hulaulá",
33426,
"sem-nna",
"Hebr",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["huz"] = {
"Hunzib",
56564,
"cau-ets",
"Cyrl",
translit = "huz-translit",
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["hvc"] = {
"Haitian Vodoun Culture Language",
3504239,
"crp",
"Latn",
}
m["hvk"] = {
"Haveke",
5683513,
"poz-cln",
"Latn",
}
m["hvn"] = {
"Sabu",
3128792,
"poz-cet",
"Latn",
}
m["hwa"] = {
"Wané",
3914887,
"kro-ekr",
"Latn",
}
m["hwc"] = {
"Hawaiian Creole",
35602,
"crp",
"Latn",
}
m["hwo"] = {
"Hwana",
56498,
"cdc-cbm",
"Latn",
}
m["hya"] = {
"Hya",
56798,
"cdc-cbm",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
qhhob2c9rjmybektliq0fbeijwr90s9
มอดูล:languages/data/3/g
828
36380
5720757
5719157
2026-04-21T07:00:57Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720757
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["gaa"] = {
"Ga",
33287,
"alv-gda",
"Latn",
}
m["gab"] = {
"Gabri",
3441237,
"cdc-est",
"Latn",
}
m["gac"] = {
"Mixed Great Andamanese",
56329630,
"qfa-adn",
"Latn",
}
m["gad"] = { -- not to be confused with gdk, gdg
"Gaddang",
3438830,
"phi",
"Latn",
}
m["gae"] = {
"Warekena",
1091095,
"awd-nwk",
"Latn",
}
m["gaf"] = {
"Gende",
3100425,
"ngf-kag",
"Latn",
}
m["gag"] = {
"กากาอุซ",
33457,
"trk-ogz",
"Latn, Cyrl",
ancestors = "trk-oat",
dotted_dotless_i = true,
sort_key = {
Latn = {
from = {
"i", -- Ensure "i" comes after "ı".
"ä", "ç", "ê", "ı", "ö", "ş", "ţ", "ü"
},
to = {
"i" .. p[1],
"a" .. p[1], "c" .. p[1], "e" .. p[1], "i", "o" .. p[1], "s" .. p[1], "t" .. p[1], "u" .. p[1]
}
},
},
}
m["gah"] = {
"Alekano",
3441595,
"ngf-kag",
"Latn",
}
m["gai"] = {
"Borei",
6799756,
"paa-ram",
"Latn",
}
m["gaj"] = {
"Gadsup",
5516467,
"ngf-kag",
"Latn",
}
m["gak"] = {
"Gamkonora",
5520226,
"paa-nha",
"Latn",
}
m["gal"] = {
"Galoli",
35322,
"poz-tim",
"Latn",
}
m["gam"] = {
"Kandawo",
6361369,
"ngf-chw",
"Latn",
}
m["gan"] = {
"กั้น",
33475,
"zhx",
"Hants",
ancestors = "ltc",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["gao"] = {
"Gants",
5521529,
"ngf-eso",
"Latn",
}
m["gap"] = {
"Gal",
5517742,
"ngf-han",
"Latn",
}
m["gaq"] = {
"Gata'",
3501920,
"mun",
"Orya",
}
m["gar"] = {
"Galeya",
5518509,
"poz-ocw",
"Latn",
}
m["gas"] = {
"Adiwasi Garasia",
12953522,
"inc-bhi",
"Deva, Gujr",
ancestors = "bhb",
}
m["gat"] = {
"Kenati",
4219330,
"ngf-kag",
"Latn",
}
m["gau"] = {
"Kondekor",
12952433,
"dra-pgd",
"Telu",
}
m["gaw"] = {
"Nobonob",
11732205,
"ngf-han",
"Latn",
}
m["gay"] = {
"Gayo",
33286,
"poz-nws",
"Latn",
}
m["gbb"] = {
"Kaytetye",
6380709,
"aus-rnd",
"Latn",
}
m["gbd"] = {
"Karadjeri",
3913837,
"aus-pam",
"Latn",
}
m["gbe"] = {
"Niksek",
56375,
"paa-spk",
"Latn",
}
m["gbf"] = {
"Gaikundi",
5517032,
"paa-ndu",
"Latn",
}
m["gbg"] = {
"Gbanziri",
35306,
"nic-nkg",
"Latn",
}
m["gbh"] = {
"Defi Gbe",
12952446,
"alv-gbe",
"Latn",
}
m["gbi"] = {
"Galela",
3094570,
"paa-nha",
"Latn",
}
m["gbj"] = {
"Bodo Gadaba",
3347070,
"mun",
"Orya",
}
m["gbk"] = {
"Gaddi",
17455500,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
},
}
m["gbl"] = {
"Gamit",
2731717,
"inc-bhi",
"Deva, Gujr",
translit = {
Deva = "Deva-translit",
Gujr = "Gujr-translit",
},
}
m["gbm"] = {
"Garhwali",
33459,
"inc-pah",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["gbn"] = {
"Mo'da",
12755683,
"csu-bbk",
"Latn",
}
m["gbo"] = {
"Northern Grebo",
11157042,
"grb",
"Latn",
}
m["gbp"] = {
"Gbaya-Bossangoa",
11011295,
"gba-wes",
"Latn",
}
m["gbq"] = {
"Gbaya-Bozoum",
4952879,
"gba-wes",
"Latn",
}
m["gbr"] = {
"Gbagyi",
11015105,
"alv-ngb",
"Latn",
}
m["gbs"] = {
"Gbesi Gbe",
12952448,
"alv-pph",
"Latn",
}
m["gbu"] = {
"Gagadu",
35677,
"aus-arn",
"Latn",
}
m["gbv"] = {
"Gbanu",
3914945,
"gba-eas",
"Latn",
}
m["gbw"] = {
"Gabi",
5515391,
"aus-pam",
"Latn",
}
m["gbx"] = {
"Eastern Xwla Gbe",
18379975,
"alv-pph",
"Latn",
}
m["gby"] = {
"Gbari",
3915451,
"alv-ngb",
"Latn",
}
m["gcc"] = {
"Mali",
6743338,
"paa-bng",
"Latn",
}
m["gcd"] = {
"Ganggalida",
3913765,
"aus-tnk",
"Latn",
}
m["gce"] = {
"Galice",
20711,
"ath-pco",
"Latn",
}
m["gcf"] = {
"Antillean Creole",
3006280,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["gcl"] = {
"Grenadian Creole English",
4252500,
"crp",
"Latn",
ancestors = "en",
}
m["gcn"] = {
"Gaina",
11732195,
"paa-bin",
"Latn",
}
m["gcr"] = {
"Guianese Creole",
1363072,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["gct"] = {
"Colonia Tovar German",
1138351,
"gmw-hgm",
"Latn",
ancestors = "gsw",
}
m["gdb"] = {
"Ollari",
33906,
"dra-pgd",
"Orya, Telu",
translit = {
Orya = "Orya-translit",
Telu = "Telu-translit"
},
}
m["gdc"] = {
"Gugu Badhun",
10510360,
"aus-pam",
"Latn",
}
m["gdd"] = {
"Gedaged",
35292,
"poz-ocw",
"Latn",
}
m["gde"] = {
"Gude",
3441230,
"cdc-cbm",
"Latn",
}
m["gdf"] = {
"Guduf-Gava",
3441350,
"cdc-cbm",
"Latn",
}
m["gdg"] = { -- not to be confused with gad, gdk
"Ga'dang",
5515189,
"phi",
"Latn",
}
m["gdh"] = {
"Gadjerawang",
3913817,
"aus-jar",
"Latn",
}
m["gdi"] = {
"Gundi",
11137851,
"nic-nkb",
"Latn",
}
m["gdj"] = {
"Kurtjar",
5619931,
"aus-pmn",
"Latn",
}
m["gdk"] = { -- not to be confused with gad, gdg
"Gadang",
56256,
"cdc-est",
"Latn",
}
m["gdl"] = {
"Dirasha",
56809,
"cus-eas",
"Ethi",
}
m["gdm"] = {
"Laal",
33436,
"qfa-dis", -- Chad; unclassified, isolate or grouped with Adamawa or Chadic languages
"Latn",
}
m["gdn"] = {
"Umanakaina",
7881084,
"ngf-dag",
"Latn",
}
m["gdo"] = {
"Godoberi",
56515,
"cau-and",
"Cyrl",
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["gdq"] = {
"Mehri",
13361,
"sem-sar",
"Arab, Latn",
}
m["gdr"] = {
"Wipi",
8026711,
"paa-etf",
"Latn",
}
m["gds"] = {
"Ghandruk Sign Language",
15971577,
"sgn",
}
m["gdt"] = {
"Kungardutyi",
6444517,
"aus-kar",
"Latn",
}
m["gdu"] = {
"Gudu",
3441172,
"cdc-cbm",
"Latn",
}
m["gdx"] = {
"Godwari",
3540922,
"raj",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["gea"] = {
"Geruma",
3438789,
"cdc-wst",
"Latn",
}
m["geb"] = {
"Kire",
11129733,
"paa-ram",
"Latn",
}
m["gec"] = {
"Gboloo Grebo",
11019342,
"grb",
"Latn",
}
m["ged"] = {
"Gade",
3914459,
"alv-nup",
"Latn",
}
m["geg"] = {
"Gengle",
3438345,
"alv-mye",
"Latn",
ancestors = "kow",
}
m["geh"] = {
"Hutterisch",
33385,
"gmw-hgm",
"Latn",
ancestors = "bar",
}
m["gei"] = {
"Gebe",
3100032,
"poz-hce",
"Latn",
}
m["gej"] = {
"Gen",
33450,
"alv-gbe",
"Latn",
}
m["gek"] = {
"Gerka",
3441277,
"cdc-wst",
"Latn",
}
m["gel"] = {
"Fakkanci",
36627,
"nic-knn",
"Latn",
}
m["geq"] = {
"Geme",
3915851,
"znd",
"Latn",
}
m["ges"] = {
"Geser-Gorom",
5553579,
"poz-cma",
"Latn",
}
m["gev"] = {
"Viya",
7937974,
"bnt-tso",
"Latn",
}
m["gew"] = {
"Gera",
3438725,
"cdc-wst",
"Latn",
}
m["gex"] = {
"Garre",
56618,
"cus-som",
"Latn",
}
m["gey"] = {
"Enya",
5381452,
"bnt-mbe",
"Latn",
}
m["gez"] = {
"กืออึซ",
35667,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["gfk"] = {
"Patpatar",
3368846,
"poz-ocw",
"Latn",
}
m["gft"] = {
"Gafat",
56910,
"sem-eth",
"Ethi, Latn",
}
m["gga"] = {
"Gao",
3095228,
"poz-ocw",
"Latn",
}
m["ggb"] = {
"Gbii",
3914390,
"kro-wkr",
"Latn",
}
m["ggd"] = {
"Gugadj",
5615186,
"aus-pmn",
"Latn",
}
m["gge"] = {
"Guragone",
5619801,
"aus-arn",
"Latn",
}
m["ggg"] = {
"Gurgula",
5620032,
"raj",
"Arab",
}
m["ggk"] = {
"Kungarakany",
6444516,
"aus-arn",
"Latn",
}
m["ggl"] = {
"Ganglau",
5521140,
"ngf-yag",
"Latn",
}
m["ggn"] = {
"Eastern Gurung",
12952472,
"sit-tam",
"Gukh, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["ggt"] = {
"Gitua",
3107865,
"poz-ocw",
"Latn",
}
m["ggu"] = {
"Gban",
3913317,
"dmn-nbe",
"Latn",
}
m["ggw"] = {
"Gogodala",
3512161,
"ngf-gsu",
"Latn",
}
m["gha"] = {
"Ghadames",
56747,
"ber",
"Latn", -- and other scripts?
}
m["ghc"] = {
"แกลิกคลาสสิก",
5128278,
"cel-gae",
"Latn, Latg",
ancestors = "mga",
}
m["ghe"] = {
"Southern Ghale",
12952453,
"sit-tam",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["ghh"] = {
"Northern Ghale",
22662104,
"sit-tam",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["ghk"] = {
"Geko Karen",
5530317,
"kar",
}
m["ghl"] = {
"Ghulfan",
16885737,
"nub-hil",
"Latn", -- and others?
}
m["ghn"] = {
"Ghanongga",
3104772,
"poz-ocw",
"Latn",
}
m["gho"] = {
"Ghomara",
35315,
"ber",
"Tfng, Latn",
translit = {Tfng = "Tfng-translit"},
}
m["ghr"] = {
"Ghera",
22808992,
"inc-hiw",
}
m["ghs"] = {
"Guhu-Samane",
11732219,
"paa-gbi",
"Latn",
}
m["ght"] = {
"Kutang Ghale",
6448337,
"sit-tam",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["gia"] = {
"Kitja",
1284877,
"aus-jar",
"Latn",
}
m["gib"] = {
"Gibanawa",
12953530,
"crp",
"Latn",
ancestors = "ha",
}
m["gid"] = {
"Gidar",
35265,
"cdc-cbm",
"Latn",
}
m["gie"] = {
"Guébie",
63140714,
"kro-did",
"Latn",
}
m["gig"] = {
"Goaria",
33269,
"raj",
"Arab",
}
m["gih"] = {
"Githabul",
48987680,
"aus-pam",
"Latn",
}
m["gii"] = {
"Girirra",
5564288,
"cus-som",
}
m["gil"] = {
"กิลเบิร์ต",
30898,
"poz-mic",
"Latn",
}
m["gim"] = {
"Gimi (Papuan)",
11732209,
"ngf-kag",
"Latn",
}
m["gin"] = {
"Hinukh",
33283,
"cau-wts",
"Cyrl",
translit = "gin-translit",
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["gip"] = {
"Gimi (Austronesian)",
12952457,
"poz-ocw",
}
m["giq"] = {
"Green Gelao",
12953525,
"gio",
"Latn",
}
m["gir"] = {
"Red Gelao",
3100264,
"gio",
}
m["gis"] = {
"North Giziga",
3515084,
"cdc-cbm",
}
m["git"] = {
"Gitxsan",
3107862,
"nai-tsi",
"Latn",
}
m["giu"] = {
"Mulao",
11092831,
"gio",
}
m["giw"] = {
"White Gelao",
8843040,
"gio",
}
m["gix"] = {
"Gilima",
10977716,
"nic-nkm",
"Latn",
}
m["giy"] = {
"Giyug",
5565906,
}
m["giz"] = {
"South Giziga",
3502232,
"cdc-cbm",
}
m["gji"] = {
"Geji",
3914890,
"cdc-wst",
"Latn",
}
m["gjk"] = {
"Kachi Koli",
12953646,
"inc-wes",
}
m["gjm"] = {
"Gunditjmara",
6448731,
"aus-pam",
"Latn",
}
m["gjn"] = {
"Gonja",
35267,
"alv-gng",
"Latn",
}
m["gjr"] = {
"Gurindji Kriol",
5620091,
"qfa-mix",
"Latn",
ancestors = "gue, rop",
}
m["gju"] = {
"Gojri",
3241731,
"raj",
"ur-Arab, Deva, Takr",
strip_diacritics = {
["ur-Arab"] = {
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.smallv,
from = {"ڵ", "ݩ"},
to = {"ل", "ن"}
},
},
translit = {
--["ur-Arab"] = "ur-translit",
Deva = "Deva-translit",
},
}
m["gka"] = {
"Guya",
11732221,
"ngf-fin",
"Latn",
}
m["gkd"] = {
"Magɨ",
55621742,
"ngf-ais",
"Latn",
}
m["gke"] = {
"Ndai",
6983667,
"alv-mbm",
}
m["gkn"] = {
"Gokana",
3075137,
"nic-ogo",
"Latn",
}
m["gko"] = {
"Kok-Nar",
6426526,
"aus-pmn",
"Latn",
}
m["gkp"] = {
"Guinea Kpelle",
11052867,
"dmn-msw",
"Latn, Kpel",
ancestors = "kpe",
}
m["glc"] = {
"Bon Gula",
289816,
"alv-bua",
}
m["gld"] = {
"Nanai",
13303,
"tuw-nan",
"Cyrl",
translit = "gld-translit",
strip_diacritics = {remove_diacritics = c.macron},
sort_key = {
from = {"ё", "ӈ"},
to = {"е" .. p[1], "н" .. p[1]}
},
}
m["glh"] = {
"Northwest Pashayi",
23713532,
"inc-pas",
"fa-Arab",
}
m["glj"] = {
"Kulaal",
33360,
"alv-bua",
}
m["glk"] = {
"Gilaki",
33657,
"ira-csp",
"fa-Arab",
}
m["glo"] = {
"Galambu",
2598797,
"cdc-wst",
"Latn",
}
m["glr"] = {
"Glaro-Twabo",
3915313,
"kro-wee",
}
m["glu"] = {
"Gula",
5617176,
"csu-bgr",
"Latn",
}
m["glw"] = {
"Glavda",
3441285,
"cdc-cbm",
"Latn",
}
m["gly"] = {
"Gule",
3120736,
"ssa-kom",
}
m["gma"] = {
"Gambera",
10502327,
"aus-wor",
"Latn",
}
m["gmb"] = {
"Gula'alaa",
3120733,
"poz-sls",
"Latn",
}
m["gmd"] = {
"Mághdì",
3914475,
"alv-bwj",
}
m["gmg"] = {
"Magiyi",
16926155,
"ngf-sog",
"Latn",
}
m["gmh"] = {
"เยอรมันสูงกลาง",
837985,
"gmw-hgm",
"Latn",
strip_diacritics = {
remove_diacritics = c.circ .. c.macron,
from = {"Ë", "ë", "[ƷȤ]", "[ʒȥ]"},
to = {"E", "e", "Z", "z"}
},
}
m["gml"] = {
"เยอรมันต่ำกลาง",
505674,
"gmw-lgm",
"Latn",
strip_diacritics = {remove_diacritics = c.circ .. c.macron .. c.diaer},
}
m["gmm"] = {
"Gbaya-Mbodomo",
6799713,
"gba-eas",
"Latn",
}
m["gmn"] = {
"Gimnime",
11016905,
"alv-dur",
"Latn",
}
m["gmr"] = {
"Mirning",
6873793,
"aus-pam",
"Latn",
}
m["gmu"] = {
"Gumalu",
5618027,
"ngf-gum",
"Latn",
}
m["gmv"] = {
"กาโม",
16116386,
"omv-nom",
"Latn, Ethi",
}
m["gmx"] = {
"Magoma",
16939552,
"bnt-bki",
}
m["gmy"] = {
"กรีกแบบไมซีนี",
668366,
"grk",
"Linb",
translit = "Linb-translit",
}
m["gmz"] = {
"Mgbo",
6826835,
"alv-igb",
ancestors = "izi",
}
m["gna"] = {
"Kaansa",
56802,
"nic-gur",
}
m["gnb"] = {
"Gangte",
12952442,
"tbq-kuk",
}
m["gnc"] = {
"Guanche",
35762,
"ber",
}
m["gnd"] = {
"Zulgo-Gemzek",
56800,
"cdc-cbm",
"Latn",
}
m["gne"] = {
"Ganang",
63163361,
"nic-plc",
ancestors = "izr",
}
m["gng"] = {
"Ngangam",
35888,
"nic-grm",
}
m["gnh"] = {
"Lere",
3915319,
"nic-jer",
}
m["gni"] = {
"กูนียันดี",
2669219,
"aus-bub",
"Latn",
}
m["gnj"] = {
"Ngen of Djonkro",
63170838,
"dmn-nbe",
"Latn",
}
m["gnk"] = {
"ǁGana",
1975199,
"khi-kal",
"Latn",
}
m["gnl"] = {
"Gangulu",
4916329,
"aus-pam",
"Latn",
}
m["gnm"] = {
"Ginuman",
11732210,
"ngf-dag",
"Latn",
}
m["gnn"] = {
"Gumatj",
10510745,
"aus-yol",
"Latn",
}
m["gnq"] = {
"Gana",
5520523,
"poz-san",
"Latn",
}
m["gnr"] = {
"Gureng Gureng",
5619998,
"aus-pam",
"Latn",
}
m["gnt"] = {
"Guntai",
12952475,
"paa-yam",
"Latn",
}
m["gnu"] = {
"Gnau",
3915810,
"paa-tor",
"Latn",
}
m["gnw"] = {
"Western Bolivian Guarani",
3775037,
"gn",
"Latn",
}
m["gnz"] = {
"Ganzi",
11137942,
"nic-nkb",
"Latn",
}
m["goa"] = {
"Guro",
35251,
"dmn-mda",
"Latn",
}
m["gob"] = {
"Playero",
3027923,
"sai-guh",
}
m["goc"] = {
"Gorakor",
12952463,
"poz-ocw",
"Latn",
}
m["god"] = {
"Godié",
3914412,
"kro-bet",
}
m["goe"] = {
"Gongduk",
2669221,
"sit",
}
m["gof"] = {
"โกฟา",
12631584,
"omv-nom",
"Latn, Ethi",
}
m["gog"] = {
"Gogo",
3272630,
"bnt-ruv",
"Latn",
}
m["goh"] = {
"เยอรมันสูงเก่า",
35218,
"gmw-hgm",
"Latn, Runr",
strip_diacritics = {
remove_diacritics = c.circ .. c.macron .. c.diaer,
from = {"[ƷȤ]", "[ʒȥ]"},
to = {"Z", "z"}
},
translit = {
Runr = "Runr-translit",
},
}
m["goi"] = {
"Gobasi",
5575414,
"ngf-est",
"Latn",
}
m["goj"] = {
"Gowlan",
12953532,
"inc-sou",
}
-- gok is a spurious language, see [[w:Spurious languages]]
m["gol"] = {
"Gola",
35482,
"alv",
"Latn, Vaii",
}
m["gon"] = {
"โคณฑี",
1775361,
"dra-gon",
"Telu, Gonm, Gong, Deva, Orya",
translit = {
Telu = "Telu-translit",
Gong = "gon-Gong-translit",
Gonm = "gon-Gonm-translit",
},
}
m["goo"] = {
"Gone Dau",
3110470,
"poz-pcc",
"Latn",
}
m["gop"] = {
"Yeretuar",
8052565,
"poz-hce",
"Latn",
}
m["goq"] = {
"Gorap",
3110816,
"crp",
"Latn",
ancestors = "ms",
}
m["gor"] = {
"Gorontalo",
2501174,
"phi",
"Latn",
}
m["got"] = {
"กอท",
35722,
"gme",
"Goth, Runr, Latn",
translit = {Goth = "Goth-translit"},
link_tr = true,
strip_diacritics = {Latn = {remove_diacritics = c.macron}},
}
m["gou"] = {
"Gavar",
3441180,
"cdc-cbm",
}
m["gov"] = {
"Goo",
16927208,
"dmn",
"Latn",
}
m["gow"] = {
"Gorwaa",
3437626,
"cus-sou",
"Latn",
}
m["gox"] = {
"Gobu",
7194986,
"bad-cnt",
}
m["goy"] = {
"Goundo",
317636,
"alv-kim",
}
m["goz"] = {
"Gozarkhani",
5590235,
"xme-ttc",
ancestors = "xme-ttc-eas",
}
m["gpa"] = {
"Gupa-Abawa",
3915352,
"alv-ngb",
"Latn",
}
m["gpn"] = {
"Taiap",
56237,
"qfa-dis", -- Papuan; isolate in Glottolog; relationship with Torricelli proposed by Usher
"Latn",
}
m["gqa"] = {
"Ga'anda",
56245,
"cdc-cbm",
"Latn",
}
m["gqi"] = {
"Guiqiong",
3120647,
"sit-qia",
}
m["gqn"] = { -- a variety of 'ter'
"Kinikinao",
53386731,
"awd",
"Latn",
}
m["gqr"] = {
"Gor",
759992,
"csu-sar",
"Latn",
}
m["gqu"] = {
"Qau",
17284874,
"gio",
}
m["gra"] = {
"Rajput Garasia",
21041529,
"inc-bhi",
"Deva, Gujr",
ancestors = "bhb",
translit = {
Deva = "Deva-translit",
Gujr = "Gujr-translit",
},
}
m["grc"] = {
"กรีกโบราณ",
35497,
"grk",
"Polyt, Cprt",
translit = {
Cprt = "Cprt-translit",
},
override_translit = true,
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
standard_chars = {
Polyt = "ΑΆἈἉἊἋἌἍἎἏᾈᾉᾊᾋᾌᾍᾎᾏᾸᾹᾺᾼΒΓΔΕΈἘἙἚἛἜἝῈΖΗΉἨἩἪἫἬἭἮἯᾘᾙᾚᾛᾜᾝᾞᾟῊῌΘΙΊΪἸἹἺἻἼἽἾἿῘῙῚΚΛΜΝΞΟΌὈὉὊὋὌὍΠΡῬΡ̓ΣΤΥΎΫὙὛὝὟῨῩῪΦΧΨΩΏὨὩὪὫὬὭὮὯᾨᾩᾪᾫᾬᾭᾮᾯῸῺῼαάἀἁἂἃἄἅἆἇὰᾀᾁᾂᾃᾄᾅᾆᾇᾰᾱᾲᾳᾴᾶᾷβγδεέἐἑἒἓἔἕὲζηήἠἡἢἣἤἥἦἧὴᾐᾑᾒᾓᾔᾕᾖᾗῂῃῄῆῇθιίϊΐἰἱἲἳἴἵἶἷὶῐῑῒῖῗκλμνξοόὀὁὂὃὄὅὸπρῤῥςστυύϋΰὐὑὒὓὔὕὖὗὺῠῡῢῦῧφχψωώὠὡὢὣὤὥὦὧὼᾠᾡᾢᾣᾤᾥᾦᾧῲῳῴῶῷ·ͺ΄΅᾽᾿῀῁῍῎῏῝῞῟῭`´῾",
Cprt = "𐠀𐠁𐠂𐠃𐠄𐠅𐠈𐠊𐠋𐠌𐠍𐠎𐠏𐠐𐠑𐠒𐠓𐠔𐠕𐠖𐠗𐠘𐠙𐠚𐠛𐠜𐠝𐠞𐠟𐠠𐠡𐠢𐠣𐠤𐠥𐠦𐠧𐠨𐠩𐠪𐠫𐠬𐠭𐠮𐠯𐠰𐠱𐠲𐠳𐠴𐠵𐠷𐠸𐠼𐠿",
c.punc
},
}
m["grd"] = {
"Guruntum",
3441272,
"cdc-wst",
"Latn",
}
m["grg"] = {
"Madi",
6727664,
"ngf-fin",
"Latn",
}
m["grh"] = {
"Gbiri-Niragu",
3913936,
"nic-kau",
"Latn",
}
m["gri"] = {
"Ghari",
3104782,
"poz-sls",
"Latn",
}
m["grj"] = {
"Southern Grebo",
3914444,
"grb",
"Latn",
}
m["grm"] = {
"Kota Marudu Talantang",
6433808,
"poz-san",
"Latn",
}
m["gro"] = {
"Groma",
56551,
"sit-tib",
}
m["grq"] = {
"Gorovu",
56355,
"paa-ram",
"Latn",
}
m["grs"] = {
"Gresi",
5607612,
"paa-nim",
"Latn",
}
m["grt"] = {
"กาโร",
36137,
"tbq-bdg",
"Latn, Beng, Brai",
}
m["gru"] = {
"Kistane",
13273,
"sem-eth",
"Latn, Ethi",
}
m["grv"] = {
"Central Grebo",
18385114,
"grb",
"Latn",
}
m["grw"] = {
"Gweda",
5623387,
"poz-ocw",
"Latn",
}
m["grx"] = {
"Guriaso",
12631954,
"qfa-unc", -- no consensus; may be Kwomtari per Baron (1983) and Usher (2020), but no connections accepted by
-- Glottolog.
"Latn",
}
m["gry"] = {
"Barclayville Grebo",
11157342,
"grb",
"Latn",
}
m["grz"] = {
"Guramalum",
3120935,
"poz-ocw",
"Latn",
}
m["gse"] = {
"Ghanaian Sign Language",
35289,
"sgn",
"Latn", -- when documented
}
m["gsg"] = {
"German Sign Language",
33282,
"sgn-gsl",
"Sgnw",
}
m["gsl"] = {
"Gusilay",
35439,
"alv-jol",
"Latn",
}
m["gsm"] = {
"Guatemalan Sign Language",
2886781,
"sgn",
"Latn", -- when documented
}
m["gsn"] = {
"Gusan",
11732224,
"ngf-fin",
"Latn",
}
m["gso"] = {
"Southwest Gbaya",
4919322,
"gba-sou",
"Latn",
}
m["gsp"] = {
"Wasembo",
7971402,
"ngf-mad", -- placed under Rai Coast by Glottolog (under Greater Yaganon) and Pawley-Hammarström
"Latn",
}
m["gss"] = {
"Greek Sign Language",
3565084,
"sgn",
}
m["gsw"] = {
"เยอรมันแบบอลามันเนีย",
131339,
"gmw-hgm",
"Latn",
wikimedia_codes = "als",
ancestors = "gmh",
}
m["gta"] = {
"Guató",
3027940,
"qfa-dis", -- isolate or Macro-Jê
"Latn",
}
m["gtu"] = {
"Aghu Tharrnggala",
16825981,
"aus-pmn",
"Latn",
}
m["gua"] = {
"Shiki",
3913946,
"nic-jrn",
"Latn",
}
m["gub"] = {
"Guajajára",
7699720,
"tup-gua",
"Latn",
}
m["guc"] = {
"Wayuu",
891085,
"awd-taa",
"Latn",
}
m["gud"] = {
"Yocoboué Dida",
21074781,
"kro-did",
"Latn",
}
m["gue"] = {
"Gurindji",
10511016,
"aus-pam",
"Latn",
}
m["guf"] = {
"Gupapuyngu",
10511004,
"aus-yol",
"Latn",
}
m["gug"] = {
"กัวรานีแบบปารากวัย",
17478066,
"gn",
"Latn",
wikimedia_codes = "gn",
ancestors = "gn-cls",
}
m["guh"] = {
"Guahibo",
2669193,
"sai-guh",
"Latn",
}
m["gui"] = {
"Eastern Bolivian Guarani",
2963912,
"gn",
"Latn",
}
m["guk"] = {
"Gumuz",
2396970,
"ssa",
"Latn, Ethi",
}
m["gul"] = {
"Gullah",
33395,
"crp",
"Latn",
ancestors = "en",
}
m["gum"] = {
"Guambiano",
2744745,
"sai-bar",
"Latn",
}
m["gun"] = {
"Mbya Guarani",
3915584,
"gn",
"Latn",
}
m["guo"] = {
"Guayabero",
2980375,
"sai-guh",
"Latn",
}
m["gup"] = {
"Gunwinggu",
1406574,
"aus-gun",
"Latn",
}
m["guq"] = {
"Aché",
383701,
"tup",
"Latn",
}
m["gur"] = {
"Farefare",
35331,
"nic-mre",
"Latn",
}
m["gus"] = {
"Guinean Sign Language",
15983937,
"sgn",
"Latn", -- when documented
}
m["gut"] = {
"Maléku Jaíka",
3915782,
"cba",
"Latn",
}
m["guu"] = {
"Yanomamö",
8048928,
"sai-ynm",
"Latn",
}
m["guv"] = {
"Gey",
11137816,
"alv-sav",
"Latn",
}
m["guw"] = {
"Gun",
3111668,
"alv-gbe",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron},
}
m["gux"] = {
"Gourmanchéma",
35474,
"nic-grm",
"Latn",
}
m["guz"] = {
"Gusii",
33603,
"bnt-lok",
"Latn",
}
m["gva"] = {
"Kaskihá",
3033534,
"sai-mas",
"Latn",
}
m["gvc"] = {
"Guanano",
3566001,
"sai-tuc",
"Latn",
}
m["gve"] = {
"Duwet",
5317647,
"poz-ocw",
"Latn",
}
m["gvf"] = {
"Golin",
3110291,
"ngf-chw",
"Latn",
}
m["gvj"] = {
"Guajá",
3915506,
"tup",
"Latn",
}
m["gvl"] = {
"Gulay",
641737,
"csu-sar",
"Latn",
}
m["gvm"] = {
"Gurmana",
3913363,
"nic-shi",
"Latn",
}
m["gvn"] = {
"Kuku-Yalanji",
5621973,
"aus-pam",
"Latn",
}
m["gvo"] = {
"Gavião do Jiparaná",
5528335,
"tup",
"Latn",
}
m["gvp"] = {
"Pará Gavião",
3365443,
"sai-nje",
"Latn",
}
m["gvr"] = {
"Western Gurung",
2392342,
"sit-tam",
"Gukh, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["gvs"] = {
"Gumawana",
5618041,
"poz-ocw",
"Latn",
}
m["gvy"] = {
"Guyani",
10511230,
"aus-pam",
"Latn",
}
m["gwa"] = {
"Mbato",
3914941,
"alv-ptn",
"Latn",
}
m["gwb"] = {
"Gwa",
5623219,
"nic-jrn",
"Latn",
}
m["gwc"] = {
"Kalami",
1675961,
"inc-koh",
"Arab",
strip_diacritics = {
["Arab"] = {
-- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif
from = {"هٔ", "ۂ", "ٱ"},
to = {"ہ", "ہ", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef .. u(0x065e)
},
},
}
m["gwd"] = {
"Gawwada",
3032135,
"cus-eas",
"Latn, Ethi",
}
m["gwe"] = {
"Gweno",
3358211,
"bnt-chg",
"Latn",
}
m["gwf"] = {
"Gowro",
3812403,
"inc-koh",
"Arab",
}
m["gwg"] = {
"Moo",
6907057,
"alv-bwj",
"Latn",
}
m["gwi"] = {
"Gwich'in",
21057,
"ath-nor",
"Latn",
}
m["gwj"] = {
"Gcwi",
12631978,
"khi-kal",
"Latn",
}
m["gwm"] = {
"Awngthim",
4830109,
"aus-pmn",
"Latn",
}
m["gwn"] = {
"Gwandara",
56521,
"cdc-wst",
"Latn",
}
m["gwr"] = {
"Gwere",
5623559,
"bnt-nyg",
"Latn",
}
m["gwt"] = {
"Gawar-Bati",
33894,
"inc-kun",
"Arab",
}
m["gwu"] = {
"Guwamu",
10511225,
"aus-pam",
"Latn",
}
m["gww"] = {
"Kwini",
10551249,
"aus-wor",
"Latn",
}
m["gwx"] = {
"Gua",
35422,
"alv-gng",
"Latn",
}
m["gxx"] = {
"Wè Southern",
19921582,
"kro-wee",
"Latn",
}
m["gya"] = {
"Northwest Gbaya",
36594,
"gba-wes",
"Latn",
}
m["gyb"] = {
"Garus",
5524492,
"ngf-han",
"Latn",
}
m["gyd"] = {
"Kayardild",
3913770,
"aus-tnk",
"Latn",
}
m["gye"] = {
"Gyem",
5624046,
"nic-jer",
"Latn",
}
m["gyf"] = {
"Gungabula",
10510783,
"aus-pam",
"Latn",
}
m["gyg"] = {
"Gbayi",
11137618,
"nic-ngd",
"Latn",
}
m["gyi"] = {
"Gyele",
35434,
"bnt-mnj",
"Latn",
}
m["gyl"] = {
"Gayil",
5528771,
"omv-aro",
"Latn",
}
m["gym"] = {
"Ngäbere",
3915581,
"cba",
"Latn",
}
m["gyn"] = {
"Guyanese Creole English",
3305477,
"crp",
"Latn",
ancestors = "en",
}
m["gyo"] = {
"Gyalsumdo",
53575940,
"sit-kyk",
}
m["gyr"] = {
"Guarayu",
3118779,
"tup-gua",
"Latn",
}
m["gyy"] = {
"Gunya",
10511001,
"aus-pam",
"Latn",
}
m["gza"] = {
"Ganza",
5521556,
"omv-mao",
"Latn",
}
m["gzn"] = {
"Gane",
3095108,
"poz-hce",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
99nrkp1ua81hq4ilwcpa9k8h969l4hz
มอดูล:languages/data/3/e
828
36382
5720756
5684153
2026-04-21T07:00:55Z
OctraBot
3198
Bot: automated text replacement (-standardChars +standard_chars)
5720756
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["ebg"] = {
"Ebughu",
35294,
"nic-lcr",
"Latn",
}
m["ebk"] = {
"Eastern Bontoc",
62664215,
"phi",
"Latn",
}
m["ebr"] = {
"Ebrié",
36644,
"alv-ptn",
"Latn",
}
m["ebu"] = {
"Embu",
35318,
"bnt-kka",
"Latn",
}
m["ecr"] = {
"Eteocretan",
35461,
"qfa-unc",
"Polyt",
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["ecs"] = {
"Ecuadorian Sign Language",
3436769,
"sgn",
"Latn", -- when documented
}
m["ecy"] = {
"Eteocypriot",
35309,
"qfa-unc",
"Cprt",
}
m["eee"] = {
"E",
35386,
"qfa-mix",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["efa"] = {
"Efai",
3813297,
"nic-ief",
"Latn",
}
m["efe"] = {
"Efe",
56354,
"csu-mle",
"Latn",
}
m["efi"] = {
"Efik",
35377,
"nic-ief",
"Latn",
}
m["ega"] = {
"Ega",
3914927,
"alv",
"Latn",
}
m["egl"] = {
"เอมีเลีย",
1057898,
"roa-emr",
"Latn",
wikimedia_codes = "eml",
}
m["ego"] = {
"Eggon",
35300,
"nic-pls",
"Latn",
}
m["egy"] = {
"อียิปต์",
50868,
"egx",
"Latn, Egyp, Egyh",
sort_key = {
remove_diacritics = "'%-%s",
from = {"ꜣ", "j", "y", "ꜥ", "w", "b", "p", "f", "m", "n", "r", "ḥ", "ḫ", "ẖ", "h", "z", "š", "s", "q", "k", "g", "ṯ", "t", "ḏ", "d", "%."},
to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[13], p[14], p[15], p[12], p[16], p[18], p[17], p[19], p[20], p[21], p[23], p[22], p[25], p[24], p[26]}
},
}
m["ehu"] = {
"Ehueun",
3441392,
"alv-nwd",
"Latn",
}
m["eip"] = {
"Eipomek",
5349839,
"ngf-mek",
"Latn",
}
m["eit"] = {
"Eitiep",
5350030,
"paa-tor",
"Latn",
}
m["eiv"] = {
"Askopan",
56324,
"paa-nbo",
"Latn",
}
m["eja"] = {
"Ejamat",
6269820,
"alv-jfe",
"Latn",
}
m["eka"] = {
"Ekajuk",
35250,
"nic-eko",
"Latn",
}
m["eke"] = {
"Ekit",
3509628,
"nic-ief",
"Latn",
}
m["ekg"] = {
"Ekari",
5350305,
"ngf-pan",
"Latn",
}
m["eki"] = {
"Eki",
5350418,
"nic-ief",
"Latn",
}
m["ekl"] = {
"Kolhe",
6426945,
"mun",
"Latn",
}
m["ekm"] = {
"Elip",
12952414,
"nic-ymb",
"Latn",
}
m["eko"] = {
"Koti",
29930,
"bnt-mak",
"Latn",
}
m["ekp"] = {
"Ekpeye",
35254,
"alv-igb",
"Latn",
}
m["ekr"] = {
"Yace",
36901,
"alv-ido",
"Latn",
}
m["eky"] = {
"กะยาตะวันออก",
25559417,
"kar",
"Kali",
}
m["ele"] = {
"Elepi",
5359444,
"paa-tor",
"Latn",
}
m["elh"] = {
"El Hugeirat",
5351410,
"nub-hil",
"Latn",
}
m["eli"] = {
"Nding",
36176,
"alv-tal",
"Latn",
}
m["elk"] = {
"Elkei",
5364210,
"paa-tor",
"Latn",
}
m["elm"] = {
"Eleme",
3914427,
"nic-ogo",
"Latn",
}
m["elo"] = {
"El Molo",
56719,
"cus-eas",
"Latn",
}
m["elu"] = {
"Elu",
3364594,
"poz-aay",
"Latn",
}
m["elx"] = {
"Elamite",
35470,
"qfa-iso", -- ancient language of Iran
"Xsux",
}
m["ema"] = {
"Emai",
35428,
"alv-eeo",
"Latn",
}
m["emb"] = {
"Embaloh",
5369424,
"poz",
"Latn",
}
m["eme"] = {
"Emerillon",
3588942,
"tup-gua",
"Latn",
}
m["emg"] = {
"Eastern Meohang",
12952840,
"sit-kie",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["emi"] = {
"Mussau-Emira",
6943093,
"poz-stm",
"Latn",
}
m["emk"] = {
"Eastern Maninkakan",
11002130,
"dmn-mnk",
"Latn, Arab, Nkoo",
}
m["emm"] = {
"Mamulique",
3285082,
"nai-pak",
"Latn",
}
m["emn"] = {
"Eman",
5368975,
"nic-tvc",
"Latn",
}
m["emp"] = {
"Northern Emberá",
2391297,
"sai-chc",
"Latn",
}
m["ems"] = {
"อาลูตีก",
27992,
"ypk",
"Latn",
}
m["emu"] = {
"Eastern Muria",
12952883,
"dra-mur",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["emw"] = {
"Emplawas",
5374265,
"poz-tim",
"Latn",
}
m["emx"] = {
"Erromintxela",
1122188,
"qfa-mix",
"Latn",
ancestors = "rom, eu",
}
m["emy"] = {
"Epigraphic Mayan",
301355,
"myn",
"Latn, Maya",
}
m["ena"] = {
"Apali",
3504201,
"ngf-sog",
"Latn",
}
m["enb"] = {
"Markweeta",
56874,
"sdv-nma",
"Latn",
}
m["enc"] = {
"En",
3504110,
"qfa-buy",
"Latn",
}
m["end"] = {
"Ende",
2067656,
"poz-cet",
"Latn",
}
m["enf"] = {
"Forest Enets",
30249597,
"syd-ene",
"Cyrl",
}
m["enh"] = {
"Tundra Enets",
25559411,
"syd-ene",
"Cyrl",
}
m["enl"] = {
"Enlhet",
15462671,
"sai-mas",
"Latn",
}
m["enm"] = {
"อังกฤษกลาง",
36395,
"gmw-ang",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dacute .. c.dotbelow .. c.tacute},
sort_key = {
remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dacute .. c.dotbelow .. c.tacute,
from = {"[ꟓꟕ]", "[æðᵹꟑȝœẜþꟃƿ]"},
to = {
{
["ꟓ"] = "þþ", ["ꟕ"] = "ƿƿ", -- finalized by the next substitution
},
{
["æ"] = "ae", ["ð"] = "d" .. p[1], ["ᵹ"] = "g", ["ꟑ"] = "g", ["ȝ"] = "g" .. p[1],
["œ"] = "oe", ["ẜ"] = "s", ["þ"] = "t" .. p[1], ["ꟃ"] = "w", ["ƿ"] = "w",
},
},
},
standard_chars = {
Latn = "AaÆæBbCcDdÐðEeFfGgȜȝHhIiJjKkLlMmNnOoPpQqRrSsTtÞþUuVvWwXxYyZz",
c.punc,
},
}
m["enn"] = {
"Engenni",
3915365,
"alv-dlt",
"Latn",
}
m["eno"] = {
"Enggano",
2669164,
"poz",
"Latn",
}
m["enq"] = {
"Enga",
1143040,
"ngf-eng",
"Latn",
}
m["enr"] = {
"Emem",
5370369,
"paa-pau",
}
m["enu"] = {
"Enu",
5380858,
"tbq-bka",
}
m["env"] = {
"Enwan",
3438334,
"alv-yek",
"Latn",
}
m["enw"] = {
"Enwang",
11134434,
"nic-lcr",
"Latn",
}
m["enx"] = {
"Enxet",
15462609,
"sai-mas",
"Latn",
}
m["eot"] = {
"Eotile",
3915347,
"alv-ptn",
"Latn",
}
m["epi"] = {
"Epie",
35291,
"alv-dlt",
"Latn",
}
m["era"] = {
"Eravallan",
5385061,
"dra-tam",
"Taml",
}
m["erg"] = {
"Sie",
426254,
"poz-vns",
"Latn",
}
m["erh"] = {
"Eruwa",
3441244,
"alv-swd",
"Latn",
}
m["eri"] = {
"Ogea",
7079984,
"ngf-nur",
"Latn",
}
m["erk"] = {
"South Efate",
3449070,
"poz-vnc",
"Latn",
}
m["err"] = {
"Erre",
10488401,
"qfa-iso", -- Evans (1997) put it in an Arnhem Land family
"Latn",
}
m["ers"] = {
"Ersu",
12952417,
"sit-ers",
"Latn", -- also Ersu Shaba
}
m["ert"] = {
"Eritai",
56376,
"paa-lkp",
"Latn",
}
m["erw"] = {
"Erokwanas",
5395296,
"poz-hce",
"Latn",
}
m["ese"] = {
"Ese Ejja",
2980381,
"sai-tac",
"Latn",
}
m["esh"] = {
"เอชเตฮาร์ด",
12952418,
"xme-ttc",
"fa-Arab, Latn",
ancestors = "xme-ttc-sou",
}
-- "esi" and "esk" moved to etymology-only per [[WT:LT]] and [[Wiktionary:Beer_parlour/2023/August#Issues_regarding_the_Inuit_languages]]
m["esl"] = {
"Egyptian Sign Language",
5348443,
"sgn",
}
m["esm"] = {
"Esuma",
16927555,
"alv-kwa",
"Latn",
}
m["esn"] = {
"Salvadoran Sign Language",
7406492,
"sgn",
"Latn", -- when documented
}
m["eso"] = {
"Estonian Sign Language",
3196221,
"sgn",
"Latn", -- when documented
}
m["esq"] = {
"Esselen",
1294243,
"qfa-dis", -- isolate or Hokan
"Latn",
}
m["ess"] = {
"Central Siberian Yupik",
27993,
"ypk",
"Cyrl, Latn",
}
m["esu"] = {
"ยุปปิก",
21117,
"ypk",
"Latn",
}
m["esy"] = {
"Eskayan",
867086,
"art",
"Latn", -- also its own native script
}
m["etb"] = {
"Etebi",
11002851,
"nic-ief",
"Latn",
}
m["etc"] = {
"Etchemin",
5402493,
"alg-eas",
"Latn",
}
m["eth"] = {
"Ethiopian Sign Language",
3501903,
"sgn",
}
m["etn"] = {
"Eton (Vanuatu)",
3059362,
"poz-vnc",
"Latn",
}
m["eto"] = {
"Eton (Cameroon)",
35317,
"bnt-btb",
"Latn",
}
m["etr"] = {
"Edolo",
5340184,
"ngf-bos",
"Latn",
}
m["ets"] = {
"Yekhee",
3915848,
"alv-yek",
"Latn",
}
m["ett"] = {
"อีทรัสคัน",
35726,
"qfa-tyn",
"Ital",
-- Ital translit in [[Module:scripts/data]]
}
m["etu"] = {
"Ejagham",
35296,
"nic-eko",
"Latn",
}
m["etx"] = {
"Eten",
3915392,
"nic-beo",
"Latn",
}
m["etz"] = {
"Semimi",
10950308,
"paa-mai",
"Latn",
}
m["eve"] = {
"เอเว็น",
29960,
"tuw-ewe",
"Cyrl, Latn",
translit = {Cyrl = "eve-translit"},
strip_diacritics = {remove_diacritics = c.macron .. c.dotabove .. c.dotbelow},
sort_key = {
Cyrl = {
from = {
"ӫ", -- 2 chars
"ё", "ӈ", "ө" -- 1 char
},
to = {
"о" .. p[2],
"е" .. p[1], "н" .. p[1], "о" .. p[1]
},
},
},
}
m["evh"] = {
"Uvbie",
3441344,
"alv-swd",
"Latn",
}
m["evn"] = {
"เอเวนค์",
30004,
"tuw-ewe",
"Cyrl",
translit = "evn-translit",
strip_diacritics = {remove_diacritics = c.macron .. c.dotabove .. c.dotbelow},
sort_key = {
from = {"ё", "ӈ"},
to = {"е" .. p[1], "н" .. p[1]}
},
}
m["ewo"] = {
"Ewondo",
35459,
"bnt-btb",
"Latn",
}
m["ext"] = {
"เอซเตรมาดูรา",
30007,
"roa-asl",
"Latn",
}
m["eya"] = {
"Eyak",
27480,
"xnd",
"Latn",
}
m["eyo"] = {
"Keiyo",
56856,
"sdv-nma",
"Latn",
}
m["eza"] = {
"Ezaa",
11921436,
"alv-igb",
"Latn",
ancestors = "izi",
}
m["eze"] = {
"Uzekwe",
3502244,
"nic-ucn",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
5u6epxcieeit1v0aothwfcow04mnqtd
มอดูล:languages/data/3/d
828
36383
5720755
5684152
2026-04-21T07:00:53Z
OctraBot
3198
Bot: automated text replacement (-standardChars +standard_chars)
5720755
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["daa"] = {
"Dangaléat",
942591,
"cdc-est",
"Latn",
}
m["dac"] = {
"Dambi",
12629491,
"poz-ocw",
"Latn",
}
m["dad"] = {
"Marik",
6763404,
"poz-ocw",
"Latn",
}
m["dae"] = {
"Duupa",
35263,
"alv-dur",
"Latn",
}
m["dag"] = {
"Dagbani",
32238,
"nic-dag",
"Latn",
}
m["dah"] = {
"Gwahatike",
5623246,
"ngf-fin",
"Latn",
}
m["dai"] = {
"Day",
35163,
"alv-mbd",
"Latn",
}
m["daj"] = {
"Dar Fur Daju",
56370,
"sdv-daj",
"Latn",
}
m["dak"] = {
"ดาโคตา",
530384,
"sio-dkt",
"Latn",
}
m["dal"] = {
"Dahalo",
35143,
"cus",
"Latn",
}
m["dam"] = {
"Damakawa",
1158134,
"nic-knn",
"Latn",
}
m["dao"] = {
"Daai Chin",
860029,
"tbq-kuk",
"Latn",
}
m["daq"] = {
"Dandami Maria",
12952805,
"dra-mdy",
"Deva",
}
m["dar"] = {
"Dargwa",
32332,
"cau-drg",
"Cyrl, Latn, Arab",
translit = {Cyrl = "dar-translit"},
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {
"къкъ", "хьхь", -- 4 chars
"гъ", "гь", "гӏ", "ё", "къ", "кь", "кӏ", "пп", "пӏ", "сс", "тт", "тӏ", "хх", "хъ", "хь", "хӏ", "цц", "цӏ", "чч", "чӏ" -- 2 chars
},
to = {
"к" .. p[2], "х" .. p[4],
"г" .. p[1], "г" .. p[2], "г" .. p[3], "е" .. p[1], "к" .. p[1], "к" .. p[3], "к" .. p[4], "п" .. p[1], "п" .. p[2], "с" .. p[1], "т" .. p[1], "т" .. p[2], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[5], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2]
}
},
},
}
m["das"] = {
"Daho-Doo",
3915369,
"kro-wee",
"Latn",
}
m["dau"] = {
"Dar Sila Daju",
7514020,
"sdv-daj",
"Latn",
}
m["dav"] = {
"Taita",
2387274,
"bnt-cht",
"Latn",
}
m["daw"] = {
"Davawenyo",
5228174,
"phi",
"Latn",
}
m["dax"] = {
"Dayi",
10467281,
"aus-yol",
"Latn",
}
m["daz"] = {
"Dao",
5221513,
"ngf-pan",
"Latn",
}
m["dba"] = {
"Bangime",
1982696,
"qfa-iso", -- southern Mali
"Latn",
}
m["dbb"] = {
"Deno",
56275,
"cdc-wst",
"Latn",
}
m["dbd"] = {
"Dadiya",
3914436,
"alv-wjk",
"Latn",
}
m["dbe"] = {
"Dabe",
5207451,
"paa-tkw",
"Latn",
}
m["dbf"] = {
"Edopi",
12953516,
"paa-lkp",
"Latn",
}
m["dbg"] = {
"Dogul Dom",
3912880,
"nic-npd",
"Latn",
}
m["dbi"] = {
"Doka",
3913293,
"nic-plc",
"Latn",
}
m["dbj"] = {
"อีดาอัน",
3041552,
"poz-san",
"Latn",
}
m["dbl"] = {
"Dyirbal",
35465,
"aus-dyb",
"Latn",
}
m["dbm"] = {
"Duguri",
7194057,
"nic-jrw",
"Latn",
}
m["dbn"] = {
"Duriankere",
5316627,
"ngf-sbh",
"Latn",
}
m["dbo"] = {
"Dulbu",
5313310,
"nic-jrn",
"Latn",
}
m["dbp"] = {
"Duwai",
56301,
"cdc-wst",
"Latn",
}
m["dbq"] = {
"Daba",
3913342,
"cdc-cbm",
"Latn",
}
m["dbr"] = {
"Dabarre",
3447286,
"cus-som",
}
m["dbt"] = {
"Ben Tey",
4886561,
"nic-nwa",
"Latn",
}
m["dbu"] = {
"Bondum Dom Dogon",
3912758,
"nic-npd",
"Latn",
}
m["dbv"] = {
"Dungu",
5315230,
"nic-kau",
"Latn",
}
m["dbw"] = {
"Bankan Tey Dogon",
4856243,
"nic-nwa",
"Latn",
}
m["dby"] = {
"Dibiyaso",
5272268,
"qfa-dis", -- Papuan; isolate per Glottolog, unclassified per Pawley and Hammarström (2018), sometimes classified with Bosavi languages
"Latn",
}
m["dcc"] = {
"Deccani",
669431,
"inc-hnd",
"ur-Arab",
ancestors = "ur",
}
m["dcr"] = {
"เนเกอร์ฮอลันดส์",
1815830,
"crp",
"Latn",
ancestors = "nl",
}
m["dda"] = {
"Dadi Dadi",
50207890,
"aus-pam",
"Latn",
}
m["ddd"] = {
"Dongotono",
56676,
"sdv-lma",
"Latn",
}
m["dde"] = {
"Doondo",
11003401,
"bnt-kng",
"Latn",
}
m["ddg"] = {
"Fataluku",
35353,
"paa-tap",
"Latn",
}
m["ddi"] = {
"Diodio",
3028668,
"poz-ocw",
"Latn",
}
m["ddj"] = {
"Jaru",
3162806,
"aus-pam",
"Latn",
}
m["ddn"] = {
"Dendi",
35164,
"son",
"Latn",
}
m["ddo"] = {
"Tsez",
34033,
"cau-wts",
"Cyrl",
translit = "ddo-translit",
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["ddr"] = {
"Dhudhuroa",
5269842,
"aus-pam",
"Latn",
}
m["dds"] = {
"Donno So Dogon",
1234776,
"nic-dge",
"Latn",
}
m["ddw"] = {
"Dawera-Daweloor",
5242304,
"poz-tim",
"Latn",
}
m["dec"] = {
"Dagik",
35125,
"alv-tal",
"Latn",
}
m["ded"] = {
"Dedua",
5249850,
"ngf-huo",
"Latn",
}
m["dee"] = {
"Dewoin",
3914892,
"kro-wkr",
"Latn",
}
m["def"] = {
"Dezfuli",
4115412,
"ira-swi",
"Arab",
}
m["deg"] = {
"Degema",
35182,
"alv-dlt",
"Latn",
}
m["deh"] = {
"Dehwari",
5704314,
"ira-swi",
"fa-Arab",
ancestors = "fa",
}
m["dei"] = {
"Demisa",
56380,
"paa-egb",
"Latn",
}
-- "dek" is no longer an ISO code; spurious
m["dem"] = {
"Dem",
5254989,
"qfa-dis", -- Papuan; isolate in Glottolog; unclassified in Palmer (2018); grouped with Amung by Usher, ultimately in TNG
"Latn",
}
m["dep"] = {
"Pidgin Delaware",
1183938,
"crp",
"Latn",
ancestors = "unm",
}
-- deq is not included, see [[WT:LT]]
m["der"] = {
"Deori",
56478,
"tbq-bdg",
"Beng, Latn",
}
m["des"] = {
"Desano",
962392,
"sai-tuc",
"Latn",
}
m["dev"] = {
"Domung",
5291378,
"ngf-fin",
"Latn",
}
m["dez"] = {
"Dengese",
2909984,
"bnt-tet",
"Latn",
}
m["dga"] = {
"Southern Dagaare",
35159,
"nic-mre",
"Latn",
}
m["dgb"] = {
"Bunoge",
4985178,
"nic-dgw",
"Latn",
}
m["dgc"] = {
"Casiguran Dumagat Agta",
5313599,
"phi",
"Latn",
}
m["dgd"] = {
"Dagaari Dioula",
11153465,
"nic-mre",
"Latn",
}
m["dge"] = {
"Degenan",
5251770,
"ngf-fin",
"Latn",
}
m["dgg"] = {
"Doga",
3033726,
"poz-ocw",
"Latn",
}
m["dgh"] = {
"Dghwede",
56293,
"cdc-cbm",
"Latn",
}
m["dgi"] = {
"Northern Dagara",
11004218,
"nic-mre",
"Latn",
}
m["dgk"] = {
"Dagba",
12952357,
"csu-sar",
"Latn",
}
m["dgn"] = {
"Dagoman",
10465931,
"aus-yng",
"Latn",
}
m["dgo"] = {
"Hindi Dogri",
nil,
"him",
"Deva, Arab, Takr",
ancestors = "doi",
translit = {
Deva = "Deva-translit",
},
}
m["dgr"] = {
"Dogrib",
20979,
"ath-nor",
"Latn",
}
m["dgs"] = {
"Dogoso",
35343,
"nic-gur",
}
m["dgt"] = {
"Ntra'ngith",
6983809,
"aus-pam",
"Latn",
}
-- dgu is not a language; see [[w:Dhekaru]]
m["dgw"] = {
"Daungwurrung",
5228050,
"aus-pam",
"Latn",
}
m["dgx"] = {
"Doghoro",
12952392,
"paa-bin",
"Latn",
}
m["dgz"] = {
"Daga",
5208442,
"ngf-dag",
"Latn",
}
m["dhd"] = {
"Dhundhari",
633359,
"raj",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["dhg"] = {
"Dhangu",
5268960,
"aus-yol",
"Latn",
}
m["dhi"] = {
"Dhimal",
35229,
"sit-dhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["dhl"] = {
"Dhalandji",
5268787,
"aus-psw",
"Latn",
}
m["dhm"] = {
"Zemba",
3502283,
"bnt-swb",
"Latn",
ancestors = "hz",
}
m["dhn"] = {
"Dhanki",
5268992,
"inc-bhi",
}
m["dho"] = {
"Dhodia",
5269658,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["dhr"] = {
"Tharrgari",
10470289,
"aus-psw",
"Latn",
}
m["dhs"] = {
"Dhaiso",
11001788,
"bnt-kka",
"Latn",
}
m["dhu"] = {
"Dhurga",
1285318,
"aus-yuk",
"Latn",
}
m["dhv"] = {
"Drehu",
3039319,
"poz-cln",
"Latn",
}
m["dhw"] = {
"Danuwar",
3522797,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["dhx"] = {
"Dhungaloo",
16960599,
"aus-pam",
"Latn",
}
m["dia"] = {
"Dia",
3446591,
"paa-tor",
"Latn",
}
m["dib"] = {
"South Central Dinka",
35154,
"sdv-dnu",
"Latn",
ancestors = "din",
}
m["dic"] = {
"Lakota Dida",
11001730,
"kro-did",
"Latn",
}
m["did"] = {
"Didinga",
56365,
"sdv",
"Latn",
}
m["dif"] = {
"Dieri",
25559563,
"aus-kar",
"Latn",
}
m["dig"] = {
"Digo",
3362072,
"bnt-mij",
"Latn",
}
-- "dih" is split into nai-ipa, nai-kum, nai-tip, see [[WT:LT]]
m["dii"] = {
"Dimbong",
35196,
"bnt-baf",
"Latn",
}
m["dij"] = {
"Dai",
5209056,
"poz-tim",
"Latn",
}
m["dik"] = {
"Southwestern Dinka",
36540,
"sdv-dnu",
"Latn",
ancestors = "din",
}
m["dil"] = {
"Dilling",
35152,
"nub-hil",
"Latn",
}
m["dim"] = {
"Dime",
35311,
"omv-aro",
}
m["din"] = {
"Dinka",
56466,
"sdv-dnu",
"Latn",
}
m["dio"] = {
"Dibo",
3914891,
"alv-ngb",
"Latn",
}
m["dip"] = {
"Northeastern Dinka",
36246,
"sdv-dnu",
"Latn",
ancestors = "din",
}
m["dir"] = {
"Dirim",
11130804,
"nic-dak",
"Latn",
}
m["dis"] = {
"Dimasa",
56664,
"tbq-bdg",
"Latn, Beng",
}
m["diu"] = {
"Gciriku",
3780954,
"bnt-kav",
"Latn",
}
m["diw"] = {
"Northwestern Dinka",
36249,
"sdv-dnu",
"Latn",
ancestors = "din",
}
m["dix"] = {
"Dixon Reef",
5284967,
"poz-vnc",
"Latn",
}
m["diy"] = {
"Diuwe",
5283765,
"ngf-ask",
"Latn",
}
m["diz"] = {
"Ding",
35202,
"bnt-bdz",
"Latn",
}
m["dja"] = {
"Djadjawurrung",
5285190,
"aus-pam",
"Latn",
}
m["djb"] = {
"Djinba",
5285351,
"aus-yol",
"Latn",
}
m["djc"] = {
"Dar Daju Daju",
5209890,
"sdv-daj",
"Latn",
}
m["djd"] = {
"Jaminjung",
6147825,
"aus-mir",
"Latn",
}
m["dje"] = {
"Zarma",
36990,
"son",
"Latn, Arab, Brai",
}
m["djf"] = {
"Djangun",
10474818,
"aus-pmn",
"Latn",
}
m["dji"] = {
"Djinang",
5285350,
"aus-yol",
"Latn",
}
m["djj"] = {
"Ndjébbana",
5285274,
"aus-arn",
"Latn",
}
m["djk"] = {
"Aukan",
2659044,
"crp",
"Latn, Afak",
ancestors = "en",
}
m["djl"] = {
"Djiwarli",
2669569,
"aus-psw",
"Latn",
}
m["djm"] = {
"Jamsay",
3913290,
"nic-pld",
"Latn",
}
m["djn"] = {
"Djauan",
13553748,
"aus-gun",
"Latn",
}
m["djo"] = {
"Jangkang",
12952388,
"day",
}
m["djr"] = {
"Djambarrpuyngu",
3915679,
"aus-yol",
"Latn",
}
m["dju"] = {
"Kapriman",
6367199,
"paa-spk",
"Latn",
}
m["djw"] = {
"Djawi",
3913844,
"aus-nyu",
"Latn",
ancestors = "bcj",
}
m["dka"] = {
"Dakpa",
3695189,
"sit-ebo",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["dkk"] = {
"Dakka",
5209962,
"poz-ssw",
}
m["dkr"] = {
"Kuijau",
13580777,
"poz-bnn",
}
m["dks"] = {
"Southeastern Dinka",
36538,
"sdv-dnu",
"Latn",
ancestors = "din",
}
m["dkx"] = {
"Mazagway",
6798209,
"cdc-cbm",
"Latn",
}
m["dlg"] = {
"Dolgan",
32878,
"trk-nsb",
"Cyrl",
sort_key = {
from = {"ё", "һ", "ӈ", "ө", "ү"},
to = {"е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]}
},
}
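--[==[
Illustrative note, not part of the data. The sketch below shows one way a `from`/`to` sort_key pair like the one for "dlg" above could be applied, assuming each `from` pattern is replaced, in order, by the `to` entry at the same index. The actual logic lives in [[Module:languages]] and may differ. `apply_substitutions` and `PUA1` are hypothetical names, and U+F000 is only an assumed stand-in for `p[1]`.

```lua
-- Assumed stand-in for p[1]: the UTF-8 bytes of U+F000 (a private-use character).
local PUA1 = "\239\128\128"

-- Same shape as the "dlg" sort_key table.
local sort_key = {
	from = {"ё", "һ", "ӈ", "ө", "ү"},
	to = {"е" .. PUA1, "к" .. PUA1, "н" .. PUA1, "о" .. PUA1, "у" .. PUA1},
}

-- Hypothetical helper: replace each `from` pattern, in order, with its `to` counterpart.
local function apply_substitutions(text, subs)
	for i, pattern in ipairs(subs.from) do
		text = text:gsub(pattern, subs.to[i] or "")
	end
	return text
end

-- "ё" becomes "е" plus the private-use character, so its key sorts after
-- every plain "е…" key and before any "ж…" key (bytewise comparison, "C" locale).
assert(apply_substitutions("ея", sort_key) < apply_substitutions("ёж", sort_key))
assert(apply_substitutions("ёж", sort_key) < apply_substitutions("жа", sort_key))
```

Appending a private-use character rather than mapping "ё" straight to "е" keeps the two letters distinct in the key while still grouping them together.
]==]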
m["dlk"] = {
"Dahalik",
32260,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["dlm"] = {
"แดลเมเชีย",
35527,
"roa-dal",
"Latn",
}
m["dln"] = {
"Darlong",
5224029,
"tbq-kuk",
"Latn",
}
m["dma"] = {
"Duma",
35319,
"bnt-nze",
"Latn",
}
m["dmb"] = {
"Mombo Dogon",
6897074,
"nic-dgw",
"Latn",
}
m["dmc"] = {
"Gavak",
5277406,
"ngf-nad",
"Latn",
}
m["dmd"] = {
"Madhi Madhi",
6727353,
"aus-pam",
"Latn",
}
m["dme"] = {
"Dugwor",
56313,
"cdc-cbm",
"Latn",
}
m["dmf"] = {
"Medefaidrin",
1519764,
"art",
"Medf",
type = "appendix-constructed",
}
m["dmg"] = {
"กีนาบาตางันตอนบน",
16109975,
"poz-san",
"Latn",
}
m["dmk"] = {
"Domaaki",
32900,
"inc-wes",
"Arab",
}
m["dml"] = {
"Dameli",
32288,
"inc-kun",
}
m["dmm"] = {
"Dama (Nigeria)",
5211865,
"alv-mbm",
"Latn",
}
m["dmo"] = {
"Kemezung",
35562,
"nic-bbe",
"Latn",
}
m["dmr"] = {
"East Damar",
5328200,
"poz-cet",
"Latn",
}
m["dms"] = {
"Dampelas",
5212928,
"poz-tot",
"Latn",
}
m["dmu"] = {
"Dubu",
7692059,
"paa-pau",
"Latn",
}
m["dmv"] = {
"Dumpas",
12953512,
"poz-san",
"Latn",
}
m["dmw"] = {
"Mudburra",
6931573,
"aus-pam",
"Latn",
}
m["dmx"] = {
"Dema",
3553423,
"bnt-sho",
"Latn",
}
m["dmy"] = {
"Demta",
14466283,
"paa-sen",
"Latn",
}
m["dna"] = {
"Upper Grand Valley Dani",
12952361,
"ngf-dan",
"Latn",
}
m["dnd"] = {
"Daonda",
5221528,
"paa-brd",
"Latn",
}
m["dne"] = {
"Ndendeule",
6983725,
"bnt-mbi",
"Latn",
}
m["dng"] = {
"ดุงกาน",
33050,
"zhx-man",
"Cyrl, Hants, Arab",
generate_forms = "zh-generateforms",
translit = {Cyrl = "dng-translit"},
sort_key = {
Cyrl = {
from = {"ё", "ә", "җ", "ң", "ў", "ү"},
to = {"е" .. p[1], "е" .. p[2], "ж" .. p[1], "н" .. p[1], "у" .. p[1], "у" .. p[2]}
},
Hani = "Hani-sortkey",
},
}
m["dni"] = {
"Lower Grand Valley Dani",
12635807,
"ngf-dan",
"Latn",
}
m["dnj"] = {
"Dan",
1158971,
"dmn-mda",
"Latn",
}
m["dnk"] = {
"Dengka",
5256954,
"poz-tim",
"Latn",
}
m["dnn"] = {
"Dzuun",
10973260,
"dmn-smg",
"Latn",
}
m["dno"] = {
"Ndrulo",
60785094,
"csu-lnd",
}
m["dnr"] = {
"Danaru",
5214932,
"ngf-pek",
"Latn",
}
m["dnt"] = {
"Mid Grand Valley Dani",
12952359,
"ngf-dan",
"Latn",
}
m["dnu"] = {
"Danau",
5013745,
"mkh-pal",
"Mymr",
}
m["dnv"] = {
"Danu",
5221251,
"tbq-brm",
"Mymr",
ancestors = "obr",
}
m["dnw"] = {
"Western Dani",
7987774,
"ngf-dan",
"Latn",
}
m["dny"] = {
"Dení",
56562,
"auf",
"Latn",
}
m["doa"] = {
"Dom",
5289770,
"ngf-chw",
"Latn",
}
m["dob"] = {
"Dobu",
952133,
"poz-ocw",
"Latn",
}
m["doc"] = {
"ต้งเหนือ",
17195499,
"qfa-tak",
"Latn",
}
m["doe"] = {
"Doe",
5288055,
"bnt-ruv",
"Latn",
}
m["dof"] = {
"Domu",
5291375,
"paa-mal",
"Latn",
}
m["doh"] = {
"Dong",
3438405,
"nic-dak",
"Latn",
}
m["doi"] = {
"Dogri",
32730,
"him",
"Deva, Takr, fa-Arab, Dogr",
translit = {
Deva = "Deva-translit",
Dogr = "Dogr-translit",
},
}
m["dok"] = {
"Dondo",
5295571,
"poz-tot",
"Latn",
}
m["dol"] = {
"Doso",
4167202,
"paa-dot",
"Latn",
}
m["don"] = {
"Doura",
7829037,
"poz-ocw",
"Latn",
}
m["doo"] = {
"Dongo",
35303,
"nic-mbc",
"Latn",
}
m["dop"] = {
"Lukpa",
3258739,
"nic-gne",
"Latn",
}
m["doq"] = {
"Dominican Sign Language",
5290820,
"sgn",
"Latn", -- when documented
}
m["dor"] = {
"Dori'o",
3037084,
"poz-sls",
"Latn",
}
m["dos"] = {
"Dogosé",
3913314,
"nic-gur",
"Latn",
}
m["dot"] = {
"Dass",
3441293,
"cdc-wst",
"Latn",
}
m["dov"] = {
"Toka-Leya",
11001779,
"bnt-bot",
"Latn",
ancestors = "toi",
}
m["dow"] = {
"Doyayo",
35299,
"alv-dur",
"Latn",
}
m["dox"] = {
"Bussa",
35123,
"cus-eas",
"Latn",
}
m["doy"] = {
"Dompo",
35270,
"alv-gng",
"Latn",
}
m["doz"] = {
"Dorze",
56336,
"omv-nom",
"Latn",
}
m["dpp"] = {
"Papar",
7132487,
"poz-san",
"Latn",
}
m["drb"] = {
"Dair",
12952360,
"nub-hil",
"Latn",
}
m["drc"] = {
"Minderico",
6863806,
"roa-gap",
"Latn",
ancestors = "pt",
}
m["drd"] = {
"Darmiya",
5224058,
"sit-alm",
}
m["drg"] = {
"Rungus",
6897407,
"poz-san",
"Latn",
}
m["dri"] = {
"Lela",
3914004,
"nic-knn",
"Latn",
}
m["drl"] = {
"Baagandji",
5223941,
"aus-pam",
"Latn",
}
m["drn"] = {
"West Damar",
3450459,
"poz-tim",
"Latn",
}
m["dro"] = {
"Daro-Matu Melanau",
5224156,
"poz-bnn",
"Latn",
}
m["drq"] = {
"Dura",
3449842,
"sit-gma",
"Deva",
}
m["drs"] = {
"Gedeo",
56622,
"cus-hec",
"Ethi",
}
m["dru"] = {
"Rukai",
49232,
"map",
"Latn",
ancestors = "dru-pro",
}
m["dry"] = {
"Darai",
46995026,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["dsb"] = {
"ซอร์บตอนล่าง",
13286,
"wen",
"Latn",
sort_key = s["wen-sortkey"],
standard_chars = "AaBbCcČčĆćDdEeĚěFfGgHhIiJjKkŁłLlMmNnŃńOoÓóPpRrŔŕSsŠšŚśTtUuWwYyZzŽžŹź" .. c.punc,
}
m["dse"] = {
"Dutch Sign Language",
2201099,
"sgn",
"Latn", -- when documented
}
m["dsh"] = {
"Daasanach",
56637,
"cus-eas",
"Latn",
}
m["dsi"] = {
"Disa",
3914455,
"csu-bgr",
"Latn",
}
m["dsl"] = {
"Danish Sign Language",
2605298,
"sgn",
"Latn", -- when documented
}
m["dsn"] = {
"Dusner",
5316948,
"poz-hce",
"Latn",
}
m["dso"] = {
"Desiya",
12629755,
"inc-eas",
"Orya",
ancestors = "or",
}
m["dsq"] = {
"Tadaksahak",
36568,
"son",
"Arab, Latn",
}
m["dta"] = {
"Daur",
32430,
"xgn",
"Latn, Hani, Cyrl, Mong",
ancestors = "xng",
-- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]]
sort_key = {Hani = "Hani-sortkey"},
}
m["dtb"] = {
"Labuk-Kinabatangan Kadazan",
5330240,
"poz-san",
"Latn",
}
m["dtd"] = {
"Ditidaht",
13728042,
"wak",
"Latn",
}
m["dth"] = { -- contrast 'rrt'
"Adithinngithigh",
4683034,
"aus-pmn",
"Latn",
}
m["dti"] = {
"Ana Tinga Dogon",
4750346,
"qfa-dgn",
"Latn",
}
m["dtk"] = {
"Tene Kan Dogon",
11018863,
"nic-pld",
"Latn",
}
m["dtm"] = {
"Tomo Kan Dogon",
11137719,
"nic-pld",
"Latn",
}
m["dto"] = {
"Tommo So",
47012992,
"nic-dge",
"Latn",
}
m["dtp"] = {
"ดูซุนตอนกลาง",
5317225,
"poz-san",
"Latn",
}
m["dtr"] = {
"Lotud",
6685078,
"poz-san",
"Latn",
}
m["dts"] = {
"Toro So Dogon",
11003311,
"nic-dge",
"Latn",
}
m["dtt"] = {
"Toro Tegu Dogon",
3913924,
"nic-pld",
"Latn",
}
m["dtu"] = {
"Tebul Ure Dogon",
7692089,
"qfa-dgn",
"Latn",
}
m["dty"] = {
"Doteli",
18415595,
"inc-pah",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["dua"] = {
"Duala",
33013,
"bnt-saw",
"Latn",
}
m["dub"] = {
"Dubli",
5310792,
"inc-bhi",
}
m["duc"] = {
"Duna",
5314039,
"qfa-dis", -- Papuan; isolate in Glottolog; tentatively grouped with Bogaya into a Duna-Pogaya [sic] family,
-- ultimately in TNG
"Latn",
}
m["due"] = {
"Umiray Dumaget Agta",
7881585,
"phi",
"Latn",
}
m["duf"] = {
"Dumbea",
6983819,
"poz-cln",
"Latn",
}
m["dug"] = {
"Chiduruma",
35614,
"bnt-mij",
"Latn",
}
m["duh"] = {
"Dungra Bhil",
12953513,
"inc-bhi",
"Deva, Gujr",
translit = {
Deva = "Deva-translit",
Gujr = "Gujr-translit",
},
}
m["dui"] = {
"Dumun",
5314004,
"ngf-yag",
"Latn",
}
m["duk"] = {
"Uyajitaya",
7904085,
"ngf-nur",
"Latn",
}
m["dul"] = {
"Alabat Island Agta",
3399709,
"phi",
"Latn",
}
m["dum"] = {
"ดัตช์กลาง",
178806,
"gmw-frk",
"Latn",
ancestors = "odt",
strip_diacritics = {remove_diacritics = c.circ .. c.macron .. c.diaer},
}
m["dun"] = {
"Dusun Deyah",
2784033,
"poz-bre",
"Latn",
}
m["duo"] = {
"Dupaningan Agta",
5315912,
"phi",
"Latn",
}
m["dup"] = {
"Duano",
3040468,
"poz-mly",
"Latn",
}
m["duq"] = {
"Dusun Malang",
3041711,
"poz-bre",
"Latn",
}
m["dur"] = {
"Dii",
35208,
"alv-dur",
"Latn",
}
m["dus"] = {
"Dumi",
56315,
"sit-kiw",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["duu"] = {
"Drung",
56406,
"sit-nng",
"Latn",
}
m["duv"] = {
"Duvle",
56364,
"paa-lkp",
"Latn",
}
m["duw"] = {
"Dusun Witu",
2381310,
"poz-bre",
"Latn",
}
m["dux"] = {
"Duun",
3914880,
"dmn-smg",
"Latn",
}
m["duy"] = {
"Dicamay Agta",
5272321,
"phi",
"Latn",
}
m["duz"] = {
"Duli",
5313405,
"alv-ada",
"Latn",
}
m["dva"] = {
"Duau",
5310448,
"poz-ocw",
"Latn",
}
m["dwa"] = {
"Diri",
56286,
"cdc-wst",
"Latn",
}
m["dwr"] = {
"เดาโร",
12629647,
"omv-nom",
"Ethi, Latn",
}
m["dwu"] = {
"Dhuwal",
3120791,
"aus-yol",
"Latn",
}
m["dww"] = {
"Dawawa",
5242286,
"poz-ocw",
"Latn",
}
m["dwy"] = {
"Dhuwaya",
63348560,
"aus-yol",
"Latn",
}
m["dwz"] = {
"Dewas Rai",
62663667,
"inc-bhi",
}
m["dya"] = {
"Dyan",
35340,
"nic-gur",
"Latn",
}
m["dyb"] = {
"Dyaberdyaber",
5285185,
"aus-nyu",
"Latn",
}
m["dyd"] = {
"Dyugun",
3913785,
"aus-nyu",
"Latn",
}
m["dyg"] = {
"Villa Viciosa Agta",
12626611,
"phi",
"Latn",
}
m["dyi"] = {
"Djimini",
35336,
"alv-tdj",
"Latn",
}
m["dym"] = {
"Yanda Dogon",
8048316,
"qfa-dgn",
"Latn",
}
m["dyn"] = {
"Dyangadi",
3913820,
"aus-cww",
"Latn",
}
m["dyo"] = {
"Jola-Fonyi",
3507832,
"alv-jol",
"Latn",
}
m["dyu"] = {
"Dyula",
32706,
"dmn-man",
"Latn",
}
m["dyy"] = {
"Dyaabugay",
2591320,
"aus-pmn",
"Latn",
}
m["dza"] = {
"Tunzu",
3915845,
"nic-jer",
"Latn",
}
m["dzg"] = {
"Dazaga",
35244,
"ssa-sah",
"Latn",
}
m["dzl"] = {
"Dzala",
56607,
"sit-ebo",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["dzn"] = {
"Dzando",
5319622,
"bnt-bun",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
ryuqo5rd4w08so4wvmqtbfz9z1tn892
มอดูล:languages/data/3/c
828
36384
5720754
5684151
2026-04-21T07:00:52Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars)
5720754
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
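--[==[
Illustrative note, not part of the data. The `u` function above replaces itself on first use, so the underlying module is only loaded if some entry actually needs it. A minimal, self-contained sketch of the same idiom follows. `load_char_function` and `load_count` are hypothetical, and `string.char` merely stands in for `require("Module:string utilities").char`, which works on code points rather than bytes.

```lua
local load_count = 0

-- Hypothetical stand-in for require("Module:string utilities").char.
local function load_char_function()
	load_count = load_count + 1
	return string.char
end

-- On the first call, `u` rebinds itself to the loaded function,
-- so the load happens at most once.
local function u(...)
	u = load_char_function()
	return u(...)
end

assert(load_count == 0)      -- nothing is loaded until `u` is first called
assert(u(72, 105) == "Hi")
assert(u(33) == "!")
assert(load_count == 1)      -- the second call skipped the loader
```
]==]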
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["caa"] = {
"Ch'orti'",
35177,
"myn",
"Latn",
}
m["cab"] = {
"Garifuna",
35490,
"awd-taa",
"Latn",
ancestors = "crb",
}
m["cac"] = {
"Chuj",
35233,
"myn",
"Latn",
}
m["cad"] = {
"Caddo",
56756,
"cdd",
"Latn",
}
m["cae"] = {
"Laalaa",
35564,
"alv-cng",
"Latn",
}
m["caf"] = {
"Southern Carrier",
12953426,
"ath-nor",
"Latn",
}
m["cag"] = {
"Nivaclé",
3182557,
"sai-mtc",
"Latn",
}
m["cah"] = {
"Cahuarano",
2933175,
"sai-zap",
"Latn",
}
m["caj"] = {
"Chané",
56721,
"awd",
"Latn",
}
m["cak"] = {
"Kaqchikel",
35115,
"myn",
"Latn",
}
m["cal"] = {
"Carolinian",
28427,
"poz-mic",
"Latn",
}
m["cam"] = {
"Cèmuhî",
3009690,
"poz-cln",
"Latn",
}
m["can"] = {
"Chambri",
5069707,
"paa-lsp",
"Latn",
}
m["cao"] = {
"Chácobo",
2591202,
"sai-pan",
"Latn",
}
m["cap"] = {
"Chipaya",
35235,
"sai-ucp",
"Latn",
}
m["caq"] = {
"คาร์นิโคบาร์",
35156,
"aav-nic",
"Latn, Deva",
}
m["car"] = {
"Kari'na",
56611,
"sai-gui",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. "`" .. "'%-%s"},
strip_diacritics = {
remove_diacritics = c.acute,
from = {"â", "ê", "î", "ô", "û", "ŷ"},
to = {"à", "è", "ì", "ò", "ù", "ỳ"}
},
}
m["cas"] = {
"Tsimané",
35950,
"qfa-iso",
"Latn",
}
m["cav"] = {
"Cavineña",
524102,
"sai-tac",
"Latn",
}
m["caw"] = {
"Kallawaya",
266417,
"qfa-mix",
"Latn",
}
m["cax"] = {
"Chiquitano",
1844993,
"qfa-iso",
"Latn",
}
m["cay"] = {
"Cayuga",
32967,
"iro-nor",
"Latn",
}
m["caz"] = {
"Canichana",
2936374,
"qfa-iso",
"Latn",
}
m["cbb"] = {
"Cabiyarí",
3450660,
"awd-nwk",
"Latn",
}
m["cbc"] = {
"Carapana",
924405,
"sai-tuc",
"Latn",
}
m["cbd"] = {
"Carijona",
3446655,
"sai-tar",
"Latn",
}
m["cbg"] = {
"Chimila",
2963680,
"cba",
"Latn",
}
m["cbi"] = {
"Chachi",
2591329,
"sai-bar",
"Latn",
}
m["cbj"] = {
"Ede Cabe",
33112829,
"alv-ede",
"Latn",
}
m["cbk"] = {
"ชาบากาโน",
33281,
"crp",
"Latn",
ancestors = "es",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer},
sort_key = {
from = {"ch", "ll", "ñ", "r"},
to = {"c" .. p[1], "l" .. p[1], "n" .. p[1], "r" .. p[1]}
},
standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnÑñOoPpQqRrSsTtUuVvWwXxYyZz" .. c.punc,
}
m["cbl"] = {
"Bualkhaw Chin",
9229830,
"tbq-kuk",
"Latn",
}
m["cbn"] = {
"ญัฮกุร",
116849,
"mkh-mnc",
"Thai",
ancestors = "omx",
--sort_key = "Thai-sortkey",
}
m["cbo"] = {
"Izora",
3915454,
"nic-jer",
"Latn",
}
m["cbq"] = {
"Tsucuba",
62603062,
"nic-knj",
"Latn",
}
m["cbr"] = {
"Cashibo-Cacataibo",
5359560,
"sai-pan",
"Latn",
}
m["cbs"] = {
"Cashinahua",
2591230,
"sai-pan",
"Latn",
}
m["cbt"] = {
"Chayahuita",
1526525,
"sai-cah",
"Latn",
}
m["cbu"] = {
"Candoshi-Shapra",
642843,
"qfa-iso",
"Latn",
}
m["cbv"] = {
"Cacua",
3192052,
"sai-nad",
"Latn",
ancestors = "mbr",
}
m["cbw"] = {
"Kinabalian",
6410324,
"phi",
"Latn",
}
m["cby"] = {
"Carabayo",
3441762,
"sai-tyu",
"Latn",
}
m["cca"] = {
"Cauca",
5054242,
"sai-chc",
"Latn",
}
m["ccc"] = {
"Chamicuro",
2155119,
"awd",
"Latn",
}
m["ccd"] = {
"Cafundó",
3331506,
"roa-gap",
"Latn",
ancestors = "pt",
}
m["cce"] = {
"Chopi",
3437616,
"bnt-bso",
"Latn",
}
m["ccg"] = {
"Chamba Daka",
33120805,
"nic-dak",
"Latn",
}
m["cch"] = {
"Atsam",
34794,
"nic-kne",
"Latn",
}
m["ccj"] = {
"Kasanga",
35542,
"alv-nyn",
"Latn",
}
m["ccl"] = {
"Cutchi-Swahili",
5196729,
"crp",
"Latn",
ancestors = "sw",
}
m["ccm"] = {
"Malaccan Creole Malay",
12636092,
"crp",
"Latn",
ancestors = "ms",
}
m["cco"] = {
"Comaltepec Chinantec",
2963735,
"omq-chi",
"Latn",
}
m["ccp"] = {
"จักมา",
32952,
"inc-bas",
"Cakm, Beng, Latn",
ancestors = "inc-obn",
translit = {
Cakm = "Cakm-translit",
Beng = "Beng-translit",
},
}
m["ccr"] = {
"Cacaopera",
3438338,
"nai-min",
"Latn",
}
m["cda"] = {
"Choni",
2964447,
"sit-tib",
}
m["cde"] = {
"Chenchu",
32981,
"dra-tel",
"Telu",
}
m["cdf"] = {
"Chiru",
5102016,
"tbq-kuk",
"Latn, Beng",
}
m["cdh"] = {
"Chambeali",
12953424,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
},
}
m["cdi"] = {
"Chodri",
5103788,
"inc-bhi",
"Gujr",
}
m["cdj"] = {
"Churahi",
12629039,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
},
}
m["cdm"] = {
"Chepang",
5091700,
"sit-gma",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["cdn"] = {
"Chaudangsi",
5088056,
"sit-alm",
}
m["cdo"] = {
"หมิ่นตะวันออก",
36455,
"zhx-com",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["cdr"] = {
"Cinda-Regi-Tiyal",
35596,
"nic-kmk",
"Latn",
}
m["cds"] = {
"Chadian Sign Language",
10322099,
"sgn",
"Latn", -- when documented
}
m["cdy"] = {
"Chadong",
926742,
"qfa-kms",
}
m["cdz"] = {
"Koda",
6425038,
"mun",
"Beng",
}
m["cea"] = {
"Lower Chehalis",
6693377,
"sal",
"Latn",
}
m["ceb"] = {
"เซบัวโน",
33239,
"phi",
"Latn, Tglg",
translit = {
Tglg = "ceb-translit"
},
override_translit = true,
strip_diacritics = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ
}
},
sort_key = {
Latn = "tl-sortkey",
},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
}
m["ceg"] = {
"Chamacoco",
3436637,
"sai-zam",
"Latn",
}
m["cen"] = {
"Cen",
12628777,
"nic-plc",
"Latn",
ancestors = "izr",
}
m["cet"] = {
"Centúúm",
33608,
"qfa-iso", -- northeastern Nigeria
"Latn",
}
m["cfa"] = {
"Dijim-Bwilim",
3438350,
"alv-wjk",
"Latn",
}
m["cfd"] = {
"Cara",
35048,
"nic-beo",
"Latn",
}
m["cfg"] = {
"Como Karim",
35304,
"nic-jkn",
"Latn",
}
m["cfm"] = {
"Falam Chin",
56815,
"tbq-kuk",
"Beng, Latn",
}
m["cga"] = {
"Changriwa",
5072105,
"paa-yua",
"Latn",
}
m["cgc"] = {
"Kagayanen",
6346422,
"mno",
"Latn",
}
m["cgg"] = {
"Rukiga",
3270727,
"bnt-nyg",
"Latn",
}
m["cgk"] = {
"Chocangaca",
56604,
"sit-tib",
"Tibt",
ancestors = "xct",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["chb"] = {
"Chibcha",
2356431,
"cba",
"Latn",
}
m["chc"] = {
"Catawba",
5051602,
"nai-cat",
"Latn",
}
m["chd"] = {
"Highland Oaxaca Chontal",
2964457,
"nai-tqn",
"Latn",
}
m["chf"] = {
"Chontal Maya",
35175,
"myn",
"Latn",
}
m["chg"] = {
"ชากาทาย",
36831,
"trk-kar",
"Arab, Ougr",
ancestors = "zkh",
strip_diacritics = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {u(0x0671)},
to = {u(0x0627)}
},
translit = {
Arab = "chg-translit",
Ougr = "Ougr-translit",
},
}
m["chh"] = {
"Chinook",
6693380,
"nai-ckn",
"Latn",
}
m["chj"] = {
"Ojitlán Chinantec",
5100110,
"omq-chi",
"Latn",
}
m["chk"] = {
"Chuukese",
33161,
"poz-mic",
"Latn",
}
m["chl"] = {
"Cahuilla",
56438,
"azc-cup",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.macron},
}
-- chm "Mari" is not recognized as a language, but it is a family code
m["chn"] = {
"Chinook Jargon",
35173,
"crp",
"Latn, Dupl",
ancestors = "chh, nuk",
}
m["cho"] = {
"Choctaw",
32979,
"nai-mus",
"Latn",
sort_key = {remove_diacritics = c.macronbelow .. "-"},
strip_diacritics = {remove_diacritics = c.acute .. c.dotbelow},
}
m["chp"] = {
"Chipewyan",
27692,
"ath-nor",
"Latn, Cans",
}
m["chq"] = {
"Quiotepec Chinantec",
5758709,
"omq-chi",
"Latn",
}
m["chr"] = {
"เชโรกี",
33388,
"iro",
"Cher",
translit = "Cher-translit",
}
m["cht"] = {
"Cholón",
2591243,
"qfa-unc", -- poorly attested; possibly in a Hibito-Cholon or Cholonan family
"Latn",
}
m["chw"] = {
"Chuabo",
5118412,
"bnt-mak",
"Latn",
}
m["chx"] = {
"Chantyal",
4926344,
"sit-tam",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["chy"] = {
"เชเยนน์",
33265,
"alg",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.dotabove .. "-"},
standard_chars = "AaÁáÀàĀāȦȧEeÉéÈèĒēĖėHhKkMmNnOoÓóÒòŌōȮȯPpSsŠšTtVvXx" .. c.punc, --umlaut and circumflex not allowed
}
m["chz"] = {
"Ozumacín Chinantec",
5100111,
"omq-chi",
"Latn",
}
m["cia"] = {
"Cia-Cia",
35284,
"poz-mun",
"Hang, Latn, Arab",
}
m["cib"] = {
"Ci Gbe",
12952445,
"alv-gbe",
"Latn",
}
m["cic"] = {
"Chickasaw",
33192,
"nai-mus",
"Latn",
}
m["cid"] = {
"Chimariko",
1294251,
"qfa-iso", -- possibly Hokan
"Latn",
}
m["cie"] = {
"Cineni",
56243,
"cdc-cbm",
"Latn",
}
m["cih"] = {
"Chinali",
11855245,
"inc",
"Deva",
ancestors = "sa",
translit = {
Deva = "Deva-translit",
},
}
m["cik"] = {
"Chitkuli Kinnauri",
15615982,
"sit-kin",
}
m["cim"] = {
"Cimbrian",
37053,
"gmw-hgm",
"Latn",
ancestors = "bar",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove .. c.caron},
}
m["cin"] = {
"Cinta Larga",
5121095,
"tup",
"Latn",
}
m["cip"] = {
"Chiapanec",
3364475,
"omq",
"Latn",
}
m["cir"] = {
"Tinrin",
7862281,
"poz-cln",
"Latn",
}
m["ciy"] = {
"Chaima",
12628867,
"sai-ven",
"Latn",
}
m["cja"] = {
"จามตะวันตก",
12645578,
"cmc",
"Latn, Arab, Khmr, Cham", -- Western Cham script is not yet available. Also, Arabic script is missing some glyphs.
}
m["cje"] = {
"Chru",
2967321,
"cmc",
"Latn",
}
m["cjh"] = {
"Upper Chehalis",
2962074,
"sal",
"Latn",
}
m["cji"] = {
"Chamalal",
56567,
"cau-and",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = s["cau-Cyrl-displaytext"],
strip_diacritics = s["cau-Cyrl-stripdiacritics"],
}
m["cjk"] = {
"Chokwe",
2422065,
"bnt-clu",
"Latn",
}
m["cjm"] = {
"จามตะวันออก",
2948019,
"cmc",
"Latn, Cham",
}
m["cjn"] = {
"Chenapian",
5091044,
"paa-spk",
"Latn",
}
m["cjo"] = {
"Ashéninka Pajonal",
3450481,
"awd",
"Latn",
}
m["cjp"] = {
"Cabécar",
27878,
"cba",
"Latn",
}
m["cjs"] = {
"โชร์",
34139,
"trk-ssb",
"Cyrl",
}
m["cjv"] = {
"Chuave",
5115226,
"ngf-chw",
"Latn",
}
m["cjy"] = {
"จิ้น",
56479,
"zhx",
"Hants",
ancestors = "ltc",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["ckb"] = {
"เคิร์ดตอนกลาง",
36811,
"ku",
"ku-Arab",
translit = "ckb-translit",
strip_diacritics = {remove_diacritics = c.kasra .. c.sukun},
}
m["ckh"] = {
"Chak",
12628870,
"sit-luu",
"Latn",
ancestors = "kdv",
}
m["ckl"] = {
"Cibak",
56279,
"cdc-cbm",
"Latn",
}
m["ckn"] = {
"Kaang Chin",
6343432,
"tbq-kuk",
"Latn",
}
m["cko"] = {
"Anufo",
34845,
"alv-ctn",
"Latn",
}
m["ckq"] = {
"Kajakse",
3440422,
"cdc-est",
"Latn",
}
m["ckr"] = {
"Kairak",
3503002,
"paa-bng",
"Latn",
}
m["cks"] = {
"Tayo",
1133089,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["ckt"] = {
"ชุกชี",
33170,
"qfa-ckn",
"Cyrl, Latn", -- Latn is obsolete
strip_diacritics = {
from = {"['’]"},
to = {"ʼ"}
},
sort_key = {
from = {"ё", "ӄ", "ԓ", "ӈ"},
to = {"е" .. p[1], "к" .. p[1], "л" .. p[1], "н" .. p[1]}
},
}
m["cku"] = {
"Koasati",
35162,
"nai-mus",
"Latn",
}
m["ckv"] = {
"กบาลัน",
716627,
"map",
"Latn",
}
m["ckx"] = {
"Caka",
5018037,
"nic-tvc",
"Latn",
}
m["cky"] = {
"Cakfem-Mushere",
3441199,
"cdc-wst",
"Latn",
}
m["ckz"] = {
"Kaqchikel-K'iche' Mixed Language",
5054550,
"qfa-mix",
"Latn",
ancestors = "cak, quc",
}
m["cla"] = {
"Ron",
3440432,
"cdc-wst",
"Latn",
}
m["clc"] = {
"Chilcotin",
28535,
"ath-nor",
"Latn",
}
m["cld"] = {
"Chaldean Neo-Aramaic",
33236,
"sem-are",
"Syrc",
strip_diacritics = "Syrc-stripdiacritics",
}
m["cle"] = {
"Lealao Chinantec",
6509365,
"omq-chi",
"Latn",
}
m["clh"] = {
"Chilisso",
3250629,
"inc-koh",
"ur-Arab",
}
m["cli"] = {
"Chakali",
35206,
"nic-gnw",
"Latn",
}
m["clj"] = {
"Laitu Chin",
6474196,
"tbq-kuk",
}
m["clk"] = {
"Idu",
56412,
"sit-gsi",
"Tibt, Deva",
translit = {
Deva = "Deva-translit",
},
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cll"] = {
"Chala",
35190,
"nic-gne",
"Latn",
}
m["clm"] = {
"Klallam",
33404,
"sal",
"Latn",
}
m["clo"] = {
"Lowland Oaxaca Chontal",
2964450,
"nai-tqn",
"Latn",
}
m["clt"] = {
"Lutuv",
6502107,
"tbq-kuk",
"Latn",
}
m["clu"] = {
"Caluyanun",
32964,
"phi",
"Latn",
}
m["clw"] = {
"Chulym",
33125,
"trk-ssb",
"Latn, Cyrl",
}
m["cly"] = {
"Eastern Highland Chatino",
12642078,
"omq-cha",
"Latn",
}
m["cma"] = {
"หมะ",
12953680,
"mkh-ban",
"Latn",
}
m["cme"] = {
"Cerma",
35074,
"nic-gur",
"Latn",
}
m["cmg"] = {
"มองโกเลียคลาสสิก",
5128303,
"xgn-cen",
"Mong, Soyo, Zanb",
-- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]]
}
m["cmi"] = {
"Emberá-Chamí",
3052042,
"sai-chc",
"Latn",
}
m["cml"] = {
"Campalagian",
5027893,
"poz-ssw",
"Latn",
}
m["cmm"] = {
"Michigamea",
12636809,
"sio-msv",
"Latn",
}
m["cmn"] = {
"จีนกลาง",
9192,
"zhx-man",
"Hants, Latn, Bopo, Brai",
wikimedia_codes = "zh",
generate_forms = "zh-generateforms",
translit = {
Hani = "zh-translit",
Bopo = "zh-translit",
},
sort_key = {
Hani = "Hani-sortkey",
Latn = {
from = {
-- Sort terms with tone numbers immediately after equivalent terms with diacritics.
"[aeiouv][" .. c.circ .. c.diaer .. "]?[nr]?g?[0-5]",
-- Add temporary breaks between syllables.
"([aeiouvmn][" .. c.circ .. c.diaer .. "]?[" .. c.macron .. c.acute .. c.caron .. c.grave .. "]?n?ŋ?g?r?)([bpmfdtnlgkhjqxzcsywrv']h?[aeiouvmn ])", p[1] .. "([ngr])$", p[1] .. "([ngr][%s%-'" .. p[1] .. "])",
-- Replace tone diacritics with syllable-final tone numbers, and add tone 0 where necessary.
c.macron, c.acute, c.caron, c.grave, "([1-4])([^%s%p" .. p[1] .. "]+)", "([^0-5])%f[%z%s%p" .. p[1] .. "]",
-- Replace the "v" shorthand for "ü" with a temporary placeholder, so that the (very rare) "v" initial is not affected by the later shorthand substitutions.
"([^ " .. p[1] .. "])v",
-- Remove temporary breaks.
p[1],
-- Replace shorthands with their full forms, and sort them immediately after equivalent terms.
"%S*[csz]" .. c.circ .. "%S*", "%S*[ŋ" .. p[2] .. "]%S*", "ĉ", "ŝ", "ŋ", p[2], "ẑ",
-- "ê" comes after "e", "ü" comes after "u" and apostrophes are removed (as their function is replaced by tone numbers).
"[" .. c.circ .. c.diaer .. "]", "'",
-- Sort numbered tone 5 after tone 0.
"5!"
},
to = {
"%0!",
"%1" .. p[1] .. "%2", "%1", "%1",
"1", "2", "3", "4", "%2%1", "%10",
"%1" .. p[2],
"",
"%0\"", "%0\"", "ch", "sh", "ng", "ü", "zh",
p[1], "",
"0!!"
}
},
},
}
m["cmo"] = {
"มนองตอนกลาง",
33369881,
"mkh-ban",
"Khmr, Latn",
}
m["cmr"] = {
"Mro Chin",
16889978,
"tbq-kuk",
}
m["cms"] = {
"Messapic",
36383,
"ine",
"Ital, Latn, Polyt",
-- Ital translit in [[Module:scripts/data]]
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cmt"] = {
"Camtho",
10441336,
"crp",
"Latn",
ancestors = "fly, zu",
}
m["cna"] = {
"Changthang",
12952322,
"sit-lab",
"Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cnb"] = {
"Chinbon Chin",
12952327,
"tbq-kuk",
"Latn",
}
m["cnc"] = {
"Cốông",
5202780,
"tbq-bis",
"Latn",
}
m["cng"] = {
"Northern Qiang",
56559,
"sit-qia",
"Latn",
}
m["cnh"] = {
"Lai",
3250286,
"tbq-kuk",
"Latn, Mymr",
}
m["cni"] = {
"Asháninka",
3437230,
"awd",
"Latn",
}
m["cnk"] = {
"Khumi Chin",
56308,
"tbq-kuk",
"Latn",
}
m["cnl"] = {
"Lalana Chinantec",
12953437,
"omq-chi",
"Latn",
}
m["cno"] = {
"Con",
3440883,
"mkh-pal",
}
m["cnp"] = {
"ผิงเหนือ",
84302463,
"zhx-pin",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["cns"] = {
"Central Asmat",
11732048,
"ngf-ask",
"Latn",
}
m["cnt"] = {
"Tepetotutla Chinantec",
5100113,
"omq-chi",
"Latn",
}
m["cnu"] = {
"Chenoua",
33276,
"ber",
"Latn",
}
m["cnw"] = {
"Ngawn Chin",
6583675,
"tbq-kuk",
}
m["cnx"] = {
"Middle Cornish",
12642603,
"cel-brs",
"Latn",
ancestors = "oco",
}
m["coa"] = {
"Cocos Islands Malay",
3441699,
"crp",
"Latn",
ancestors = "ms",
}
m["cob"] = {
"Chicomuceltec",
3307204,
"myn",
"Latn",
}
m["coc"] = {
"Cocopa",
33044,
"nai-yuc",
"Latn",
}
m["cod"] = {
"Cocama",
33317,
"tup",
"Latn",
}
m["coe"] = {
"Koreguaje",
3198924,
"sai-tuc",
"Latn",
}
m["cof"] = {
"Tsafiki",
2567055,
"sai-bar",
"Latn",
}
m["cog"] = {
"ชอง",
3914630,
"mkh-pea",
"Thai, Khmr",
translit = {
Khmr = "Khmr-translit",
},
--sort_key = {
-- Thai = "Thai-sortkey"
--},
}
m["coh"] = {
"Chichonyi-Chidzihana-Chikauma",
12629011,
"bnt-mij",
"Latn",
}
m["coj"] = {
"Cochimi",
3915551,
"nai-yuc",
"Latn",
}
m["cok"] = {
"Santa Teresa Cora",
12641754,
"azc",
"Latn",
}
m["col"] = {
"Columbia-Wenatchi",
3324744,
"sal",
"Latn",
}
m["com"] = {
"Comanche",
32972,
"azc-num",
"Latn",
}
m["con"] = {
"Cofán",
2669254,
"qfa-iso",
"Latn",
}
m["coo"] = {
"Comox",
13583746,
"sal",
"Latn",
}
m["cop"] = {
"คอปติก",
36155,
"egx",
"Copt",
translit = "Copt-translit",
ancestors = "egx-dem",
strip_diacritics = {remove_diacritics = c.grave .. c.macron .. c.overline .. c.diaer .. "ˋ"},
sort_key = "Copt-sortkey",
}
m["coq"] = {
"Coquille",
12953452,
"ath-pco",
"Latn",
}
m["cot"] = {
"Caquinte",
3915557,
"awd",
"Latn",
}
m["cou"] = {
"Wamey",
36935,
"alv-ten",
"Latn",
}
m["cov"] = {
"เฉ่าเหมียว", -- Cao Miao
2936935,
"qfa-tak",
}
m["cow"] = {
"Cowlitz",
3001877,
"sal",
"Latn",
}
m["cox"] = {
"Nanti",
15342275,
"awd",
"Latn",
}
m["coy"] = {
"Coyaima",
56450,
"sai-car",
"Latn",
}
m["coz"] = {
"Chochotec",
2964262,
"omq-pop",
"Latn",
}
m["cpa"] = {
"Palantla Chinantec",
5100112,
"omq-chi",
"Latn",
}
m["cpb"] = {
"Ucayali-Yurúa Ashéninka",
3501858,
"awd",
"Latn",
}
m["cpc"] = {
"Ajyíninka Apurucayali",
3327405,
"awd",
"Latn",
}
m["cpg"] = {
"Cappadocian Greek",
853414,
"grk",
"Grek, fa-Arab",
ancestors = "gkm",
translit = {
Grek = "el-translit",
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cpi"] = {
"Chinese Pidgin English",
3435078,
"crp",
"Latn, Hant",
ancestors = "en",
sort_key = {
Hant = "Hani-sortkey"
},
}
m["cpn"] = {
"Cherepon",
35181,
"alv-gng",
"Latn",
}
m["cpo"] = {
"Kpee",
6435722,
"dmn-jje",
}
m["cps"] = {
"Capiznon",
2937525,
"phi",
"Latn",
}
m["cpu"] = {
"Pichis Ashéninka",
7190661,
"awd",
"Latn",
}
m["cpx"] = {
"ผูเซียน", -- Puxian Min
56583,
"zhx-com",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["cpy"] = {
"South Ucayali Ashéninka",
3501868,
"awd",
"Latn",
}
m["cqd"] = {
"Chuanqiandian Cluster Miao",
121627627,
"hmn",
"Latn, Plrd",
}
m["cra"] = {
"Chara",
5073694,
"omv",
"Latn",
}
m["crb"] = {
"Kalinago",
3450735,
"awd-taa",
"Latn",
}
m["crc"] = {
"Lonwolwol",
3259216,
"poz-vnc",
"Latn",
}
m["crd"] = {
"Coeur d'Alene",
32915,
"sal",
"Latn",
}
m["crf"] = {
"Caramanta",
3504195,
"sai-chc",
"Latn",
}
m["crg"] = {
"Michif",
13315,
"qfa-mix",
"Latn",
ancestors = "cr, fr",
}
m["crh"] = {
"ตาตาร์แบบไครเมีย", -- Crimean Tatar
33357,
"trk-kcu",
"Latn, Cyrl",
dotted_dotless_i = true,
sort_key = {
Latn = {
from = {
"[ıi]" .. c.breve, -- Convert ĭ into PUA so that the decomposed form does not get caught by the next step. Also cover decomposed forms with both ı and i, as decomposed Ĭ is converted to ı + ̆ by the dotted/dotless I logic.
"i", -- Ensure "i" comes after "ı".
"â", "ç", "ğ", "ı", p[3], "ñ", "ö", "ş", "ü"
},
to = {
p[3],
"i" .. p[1],
"a", "c" .. p[1], "g" .. p[1], "i", "i" .. p[2], "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1],
}
},
Cyrl = {
from = {"гъ", "ё", "къ", "нъ", "дж"},
to = {"г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "ч" .. p[1]}
},
},
}
m["cri"] = {
"Sãotomense",
36536,
"crp",
"Latn",
ancestors = "pt",
}
m["crj"] = {
"Southern East Cree",
12953464,
"alg",
"Latn, Cans",
ancestors = "cr",
translit = {
Cans = "cr-translit"
},
}
m["crk"] = {
"Plains Cree",
56699,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["crl"] = {
"Northern East Cree",
12642195,
"alg",
"Latn, Cans",
ancestors = "cr",
translit = {
Cans = "cr-translit"
},
}
m["crm"] = {
"Moose Cree",
3446671,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["crn"] = {
"Cora",
12953454,
"azc",
"Latn",
}
m["cro"] = {
"Crow",
1207611,
"sio-mor",
"Latn",
}
m["crq"] = {
"Iyo'wujwa Chorote",
3540927,
"sai-mtc",
"Latn",
}
m["crr"] = {
"Carolina Algonquian",
16113723,
"alg-eas",
"Latn",
}
m["crs"] = {
"Seychellois Creole",
34015,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["crt"] = {
"Iyojwa'ja Chorote",
3504118,
"sai-mtc",
"Latn",
}
m["crv"] = {
"Chaura",
2605680,
"aav-nic",
"Latn",
}
m["crw"] = {
"Chrau",
5105629,
"mkh-ban",
"Latn",
}
m["crx"] = {
"Carrier",
12953431,
"ath-nor",
"Latn, Cans",
}
m["cry"] = {
"Cori",
35204,
"nic-plc",
"Latn",
}
m["crz"] = {
"Cruzeño",
2967636,
"nai-chu",
"Latn",
}
m["csa"] = {
"Chiltepec Chinantec",
12953435,
"omq-chi",
"Latn",
}
m["csb"] = {
"คาชุบ", -- Kashubian
33690,
"zlw-pom",
"Latn",
}
m["csc"] = {
"Catalan Sign Language",
35768,
"sgn",
"Latn", -- when documented
}
m["csd"] = {
"Chiangmai Sign Language",
5095211,
"sgn",
}
m["cse"] = {
"Czech Sign Language",
5201809,
"sgn",
"Latn", -- when documented
}
m["csf"] = {
"Cuban Sign Language",
5192046,
"sgn",
"Latn", -- when documented
}
m["csg"] = {
"Chilean Sign Language",
3322112,
"sgn",
"Latn", -- when documented
}
m["csh"] = {
"Asho Chin",
12627282,
"tbq-kuk",
"Latn, Mymr",
}
m["csi"] = {
"Coast Miwok",
2981109,
"nai-utn",
"Latn",
}
m["csj"] = {
"Songlai Chin",
7561280,
"tbq-kuk",
}
m["csk"] = {
"Jola-Kasa",
3446622,
"alv-jol",
"Latn",
}
m["csl"] = {
"Chinese Sign Language",
1094190,
"sgn",
}
m["csm"] = {
"Central Sierra Miwok",
2944443,
"nai-utn",
"Latn",
}
m["csn"] = {
"Colombian Sign Language",
2748229,
"sgn",
"Latn", -- when documented
}
m["cso"] = {
"Sochiapam Chinantec",
7550388,
"omq-chi",
"Latn",
}
m["csp"] = {
"ผิงใต้", -- Southern Pinghua
84302019,
"zhx-pin",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["csq"] = {
"Croatian Sign Language",
3507506,
"sgn",
}
m["csr"] = {
"Costa Rican Sign Language",
5174901,
"sgn",
"Latn", -- when documented
}
m["css"] = {
"Southern Ohlone",
25559664,
"nai-utn",
"Latn",
}
m["cst"] = {
"Northern Ohlone",
25559666,
"nai-utn",
"Latn",
}
m["csv"] = {
"Sumtu Chin",
7638087,
"tbq-kuk",
}
m["csw"] = {
"Swampy Cree",
56696,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["csx"] = {
"Cambodian Sign Language",
50934287,
"sgn",
}
m["csy"] = {
"Siyin Chin",
7533375,
"tbq-kuk",
}
m["csz"] = {
"Coos",
3126783,
"nai-coo",
"Latn",
}
m["cta"] = {
"Tataltepec Chatino",
7687853,
"omq-cha",
"Latn",
}
m["ctc"] = {
"Chetco-Tolowa",
12628946,
"ath-pco",
"Latn",
}
m["ctd"] = {
"Tedim Chin",
56357,
"tbq-kuk",
"Latn, Pauc",
}
m["cte"] = {
"Tepinapa Chinantec",
12953443,
"omq-chi",
"Latn",
}
m["ctg"] = {
"Chittagonian",
33173,
"inc-bas",
"Beng",
ancestors = "inc-obn",
}
m["cth"] = {
"Thaiphum Chin",
16912048,
"tbq-kuk",
}
m["ctl"] = {
"Tlacoatzintepec Chinantec",
12643657,
"omq-chi",
"Latn",
}
m["ctm"] = {
"Chitimacha",
1294227,
"qfa-iso", -- recently proposed to be in the Totozoquean family
"Latn",
}
m["ctn"] = {
"Chhintange",
32994,
"sit-kie",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["cto"] = {
"Emberá-Catío",
3052039,
"sai-chc",
"Latn",
}
m["ctp"] = {
"Western Highland Chatino",
32861734,
"omq-cha",
"Latn",
strip_diacritics = {remove_diacritics = "¹²³⁴⁵"},
sort_key = {remove_diacritics = c.acute},
}
m["cts"] = {
"Northern Catanduanes Bicolano",
7130477,
"phi",
"Latn",
}
m["ctt"] = {
"Wayanad Chetti",
7975850,
"dra-mal",
"Taml",
}
m["ctu"] = {
"Chol",
35179,
"myn",
"Latn",
}
m["ctz"] = {
"Zacatepec Chatino",
8063754,
"omq-cha",
"Latn",
}
m["cua"] = {
"Cua",
3441115,
"mkh-ban",
"Latn",
}
m["cub"] = {
"Cubeo",
3006705,
"sai-tuc",
"Latn",
}
m["cuc"] = {
"Usila Chinantec",
7901979,
"omq-chi",
"Latn",
}
m["cug"] = {
"Cung",
35194,
"nic-bbe",
"Latn",
}
m["cuh"] = {
"Chuka",
12952344,
"bnt-kka",
"Latn",
}
m["cui"] = {
"Cuiba",
2980421,
"sai-guh",
"Latn",
}
m["cuj"] = {
"Mashco Piro",
3446596,
"awd",
"Latn",
}
m["cuk"] = {
"Kuna",
12953659,
"cba",
"Latn",
}
m["cul"] = {
"Culina",
2475442,
"auf",
"Latn",
}
m["cuo"] = {
"Cumanagoto",
5193784,
"sai-cpc",
"Latn",
}
m["cup"] = {
"Cupeño",
143130,
"azc-cup",
"Latn",
}
m["cuq"] = {
"จุน", -- Cun
2475478,
"qfa-lic",
"Latn",
}
m["cur"] = {
"Chhulung",
5116126,
"sit-kie",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["cut"] = {
"Teutila Cuicatec",
12953453,
"omq-cui",
"Latn",
}
m["cuu"] = {
"Tai Ya",
3441122,
"qfa-tak",
"Latn",
}
m["cuv"] = {
"Cuvok",
3515056,
"cdc-cbm",
"Latn",
}
m["cuw"] = {
"Chukwa",
12629033,
"sit-kic",
}
m["cux"] = {
"Tepeuxila Cuicatec",
20527242,
"omq-cui",
"Latn",
}
m["cuy"] = {
"Cuitlatec",
2030998,
"qfa-iso",
"Latn",
}
m["cvg"] = {
"Chug",
47683644,
"sit-khc",
"Tibt, Latn",
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- (NOTE: formerly not present, probably an accidental omission)
}
m["cvn"] = {
"Valle Nacional Chinantec",
12953442,
"omq-chi",
"Latn",
}
m["cwa"] = {
"Kabwa",
6344537,
"bnt-lok",
"Latn",
}
m["cwb"] = {
"Maindo",
11002891,
"bnt-mak",
"Latn",
ancestors = "chw",
}
m["cwd"] = {
"Woods Cree",
56305,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["cwe"] = {
"Kwere",
779632,
"bnt-ruv",
"Latn",
}
m["cwg"] = {
"Chewong",
646718,
"mkh-asl",
"Latn",
}
m["cwt"] = {
"Kuwaataay",
35699,
"alv-jol",
"Latn",
}
m["cya"] = {
"Nopala Chatino",
15616302,
"omq-cha",
"Latn",
}
m["cyb"] = {
"Cayubaba",
3183382,
"qfa-iso",
"Latn",
}
m["cyo"] = {
"Cuyunon",
33153,
"phi",
"Latn",
}
m["czh"] = {
"Huizhou",
56546,
"zhx",
"Hants", -- ?
ancestors = "ltc",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["czk"] = {
"Knaanic",
56384,
"zlw",
"Hebr",
ancestors = "zlw-ocs",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["czn"] = {
"Zenzontepec Chatino",
603106,
"omq-cha",
"Latn",
}
m["czo"] = {
"หมิ่นตอนกลาง", -- Central Min
56435,
"zhx-inm",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["czt"] = {
"Zotung Chin",
8074599,
"tbq-kuk",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
p2zltpvocfy604hk72zjsoomdfi3qb3
มอดูล:languages/data/3/b
828
36385
5720753
5684150
2026-04-21T07:00:50Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720753
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["baa"] = {
"Babatana",
2877785,
"poz-ocw",
"Latn",
}
m["bab"] = {
"Bainouk-Gunyuño",
35508,
"alv-bny",
"Latn",
}
m["bac"] = {
"Baduy",
3449885,
"poz-msa",
"Latn, Sund",
ancestors = "osn",
translit = {
Sund = "Sund-translit"
},
}
m["bae"] = {
"Baré",
3504087,
"awd",
"Latn",
}
m["baf"] = {
"Nubaca",
36270,
"nic-ymb",
"Latn",
}
m["bag"] = {
"Tuki",
36621,
"nic-mba",
"Latn",
}
m["bah"] = {
"Bahamian Creole",
2669229,
"crp",
"Latn",
ancestors = "en",
}
m["baj"] = {
"Barakai",
3502030,
"poz-cet",
"Latn",
}
m["bal"] = {
"บาโลจ", -- Baluchi
33049,
"ira-nwi",
"fa-Arab",
}
m["ban"] = {
"บาหลี", -- Balinese
33070,
"poz-bss",
"Latn, Bali",
}
m["bao"] = {
"Waimaha",
2883738,
"sai-tuc",
"Latn",
}
m["bap"] = {
"Bantawa",
56500,
"sit-kic",
"Krai, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bar"] = {
"บาวาเรีย", -- Bavarian
29540,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["bas"] = {
"Basaa",
33093,
"bnt-bsa",
"Latn",
}
m["bau"] = {
"Badanchi",
11001650,
"nic-jrw",
"Latn",
}
m["bav"] = {
"Babungo",
34885,
"nic-rnn",
"Latn",
}
m["baw"] = {
"Bambili-Bambui",
34880,
"nic-nge",
"Latn",
}
m["bax"] = {
"Bamum",
35280,
"nic-nun",
"Latn, Bamu",
}
m["bay"] = {
"Batuley",
8828787,
"poz",
"Latn",
}
m["bba"] = {
"Baatonum",
34889,
"alv-sav",
"Latn",
}
m["bbb"] = {
"Barai",
4858206,
"ngf-koi",
"Latn",
}
m["bbc"] = {
"Toba Batak",
33017,
"btk",
"Latn, Batk",
}
m["bbd"] = {
"Bau",
4873415,
"ngf-gum",
"Latn",
}
m["bbe"] = {
"Bangba",
34895,
"nic-nke",
"Latn",
}
m["bbf"] = {
"Baibai",
56902,
"paa-fas",
"Latn",
}
m["bbg"] = {
"Barama",
34884,
"bnt-sir",
"Latn",
}
m["bbh"] = {
"Bugan",
3033554,
"mkh-pkn",
"Latn",
}
m["bbi"] = {
"Barombi",
34985,
"bnt-bsa",
"Latn",
}
m["bbj"] = {
"Ghomala'",
35271,
"bai",
"Latn",
}
m["bbk"] = {
"Babanki",
34790,
"nic-rnc",
"Latn",
}
m["bbl"] = {
"บัตส์", -- Bats
33259,
"cau-nkh",
"Geor",
-- Geor translit in [[Module:scripts/data]]
override_translit = true,
strip_diacritics = {
remove_diacritics = c.tilde .. c.macron .. c.breve,
from = {"<sup>ნ</sup>"},
to = {"ნ"}
},
}
m["bbm"] = { -- name includes prefix
"Babango",
34819,
"bnt-bta",
"Latn",
}
m["bbn"] = {
"Uneapa",
7884126,
"poz-ocw",
"Latn",
}
m["bbo"] = {
"Konabéré",
35371,
"dmn-snb",
"Latn",
}
m["bbp"] = {
"West Central Banda",
7984377,
"bad",
"Latn",
}
m["bbq"] = {
"Bamali",
34901,
"nic-nun",
"Latn",
}
m["bbr"] = {
"Girawa",
5564185,
"ngf-kok",
"Latn",
}
m["bbs"] = {
"Bakpinka",
3515061,
"nic-ucr",
"Latn",
}
m["bbt"] = {
"Mburku",
3441324,
"cdc-wst",
"Latn",
}
m["bbu"] = {
"Bakulung",
35580,
"nic-jrn",
"Latn",
}
m["bbv"] = {
"Karnai",
6372803,
"poz-ocw",
"Latn",
}
m["bbw"] = {
"Baba",
34822,
"nic-nun",
"Latn",
}
m["bbx"] = { -- cf bvb
"Bubia",
34953,
"nic-bds",
"Latn",
ancestors = "bvb",
}
m["bby"] = {
"Befang",
34960,
"nic-bds",
"Latn",
}
m["bca"] = {
"Central Bai",
12628803,
"sit-bai",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["bcb"] = {
"Bainouk-Samik",
36390,
"alv-bny",
"Latn",
}
m["bcd"] = {
"North Babar",
7054041,
"poz-tim",
"Latn",
}
m["bce"] = {
"Bamenyam",
34968,
"nic-nun",
"Latn",
}
m["bcf"] = {
"Bamu",
3503788,
"paa-kiw",
"Latn",
}
m["bcg"] = {
"Baga Pokur",
31172660,
"alv-nal",
"Latn",
}
m["bch"] = {
"Bariai",
2884502,
"poz-ocw",
"Latn",
}
m["bci"] = {
"Baoule",
35107,
"alv-ctn",
"Latn",
}
m["bcj"] = {
"Bardi",
3913852,
"aus-nyu",
"Latn",
}
m["bck"] = {
"Bunaba",
580923,
"aus-bub",
"Latn",
}
m["bcl"] = {
"บีโคลตอนกลาง", -- Bikol Central
33284,
"phi",
"Latn, Tglg",
translit = {
Tglg = "bcl-translit",
},
override_translit = true,
strip_diacritics = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ,
}
},
sort_key = {
Latn = "tl-sortkey",
},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. c.punc,
},
}
m["bcm"] = {
"Banoni",
2882857,
"poz-ocw",
"Latn",
}
m["bcn"] = {
"Bibaali",
34892,
"alv-mye",
"Latn",
}
m["bco"] = {
"Kaluli",
6354586,
"ngf-bos",
"Latn",
}
m["bcp"] = {
"Bali",
3515074,
"bnt-kbi",
"Latn",
}
m["bcq"] = {
"Bench",
35108,
"omv",
"Latn",
}
m["bcr"] = {
"Babine-Witsuwit'en",
27864,
"ath-nor",
"Latn",
}
m["bcs"] = {
"Kohumono",
35590,
"nic-ucn",
"Latn",
}
m["bct"] = {
"Bendi",
8836662,
"csu-mle",
"Latn",
}
m["bcu"] = {
"Biliau",
2874658,
"poz-ocw",
"Latn",
}
m["bcv"] = {
"Shoo-Minda-Nye",
36548,
"nic-jkn",
"Latn",
}
m["bcw"] = {
"Bana",
56272,
"cdc-cbm",
"Latn",
}
m["bcy"] = {
"Bacama",
56274,
"cdc-cbm",
"Latn",
}
m["bcz"] = {
"Bainouk-Gunyaamolo",
35506,
"alv-bny",
"Latn",
}
m["bda"] = {
"Bayot",
35019,
"alv-jol",
"Latn",
}
m["bdb"] = {
"Basap",
3504208,
"poz-bnn",
"Latn",
}
m["bdc"] = {
"Emberá-Baudó",
11173166,
"sai-chc",
"Latn",
}
m["bdd"] = {
"Bunama",
4997416,
"poz-ocw",
"Latn",
}
m["bde"] = {
"Bade",
56239,
"cdc-wst",
"Latn",
}
m["bdf"] = {
"Biage",
48037487,
"ngf-koi",
"Latn",
}
m["bdg"] = {
"Bonggi",
2910053,
"poz-bnn",
"Latn",
}
m["bdh"] = {
"Tara Baka",
2880165,
"csu-bbk",
"Latn",
}
m["bdi"] = {
"Burun",
35040,
"sdv-niw",
"Latn",
}
m["bdj"] = {
"Bai (South Sudan)",
34894,
"nic-ser",
"Latn",
}
m["bdk"] = {
"Budukh",
35397,
"cau-ssm",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["bdl"] = {
"บาเจาแบบอินโดนีเซีย", -- Indonesian Bajau
2880038,
"poz",
"Latn",
}
m["bdm"] = {
"Buduma",
56287,
"cdc-cbm",
"Latn",
}
m["bdn"] = {
"Baldemu",
56280,
"cdc-cbm",
"Latn",
}
m["bdo"] = {
"Morom",
759770,
"csu-bgr",
"Latn",
}
m["bdp"] = {
"Bende",
8836490,
"bnt",
"Latn",
}
m["bdq"] = {
"บะห์นัร", -- Bahnar
32924,
"mkh-ban",
"Latn",
}
m["bdr"] = {
"บาเจาแบบเวสต์โคสต์", -- West Coast Bajau
2880037,
"poz-sbj",
"Latn",
}
m["bds"] = {
"Burunge",
56617,
"cus-sou",
"Latn",
}
m["bdt"] = {
"Bokoto",
4938812,
"gba-wes",
"Latn",
}
m["bdu"] = {
"Oroko",
36278,
"bnt-saw",
"Latn",
}
m["bdv"] = {
"Bodo Parja",
8845881,
"inc-eas",
"Orya",
}
m["bdw"] = {
"Baham",
3513309,
"paa-mbi",
"Latn",
}
m["bdx"] = {
"Budong-Budong",
4985158,
"poz-ssw",
"Latn",
}
m["bdy"] = {
"Bandjalang",
2980386,
"aus-pam",
"Latn",
}
m["bdz"] = {
"Badeshi",
33028,
"iir",
"Arab, Latn",
}
m["bea"] = {
"Beaver",
20826,
"ath-nor",
"Latn",
}
m["beb"] = {
"Bebele",
34976,
"bnt-btb",
"Latn",
}
m["bec"] = {
"Iceve-Maci",
35449,
"nic-tvc",
"Latn",
}
m["bed"] = {
"Bedoanas",
4879330,
"poz-hce",
"Latn",
}
m["bee"] = {
"Byangsi",
56904,
"sit-alm",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bef"] = {
"Benabena",
2895638,
"ngf-kag",
"Latn",
}
m["beg"] = {
"Belait",
2894198,
"poz-swa",
"Latn",
}
m["beh"] = {
"Biali",
34961,
"nic-eov",
"Latn",
}
m["bei"] = {
"Bekati'",
3441683,
"day",
"Latn",
}
m["bej"] = {
"Beja",
33025,
"cus",
"Arab, Latn",
strip_diacritics = {
Latn = {
remove_diacritics = c.acute,
}
},
}
m["bek"] = {
"Bebeli",
4878430,
"poz-ocw",
"Latn",
}
m["bem"] = {
"Bemba",
33052,
"bnt-sbi",
"Latn",
}
m["beo"] = {
"Beami",
3504079,
"ngf-bos",
"Latn",
}
m["bep"] = {
"Besoa",
8840465,
"poz-kal",
"Latn",
}
m["beq"] = {
"Beembe",
3196320,
"bnt-kng",
"Latn",
}
m["bes"] = {
"Besme",
289832,
"alv-kim",
"Latn",
}
m["bet"] = {
"Guiberoua Bété",
11019185,
"kro-bet",
"Latn",
}
m["beu"] = {
"บลาการ์", -- Blagar
4923846,
"paa-tap",
"Latn",
}
m["bev"] = {
"Daloa Bété",
11155819,
"kro-bet",
"Latn",
}
m["bew"] = {
"เบอตาวี", -- Betawi
33014,
"crp",
"Latn",
ancestors = "ms",
}
m["bex"] = {
"Jur Modo",
56682,
"csu-bbk",
"Latn",
}
m["bey"] = {
"Akuwagel",
3504170,
"paa-tor",
"Latn",
}
m["bez"] = {
"Kibena",
2502949,
"bnt-bki",
"Latn",
}
m["bfa"] = {
"Bari",
35042,
"sdv-bri",
"Latn",
}
m["bfb"] = {
"Pauri Bareli",
7155462,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bfc"] = {
"Panyi Bai",
12642165,
"sit-nba",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["bfd"] = {
"Bafut",
34888,
"nic-nge",
"Latn",
}
m["bfe"] = {
"Betaf",
4897329,
"paa-tkw",
"Latn",
}
m["bff"] = {
"Bofi",
34914,
"gba-eas",
"Latn",
}
m["bfg"] = {
"Busang Kayan",
9231909,
"poz",
"Latn",
}
m["bfh"] = {
"Blafe",
12628007,
"paa-yam",
"Latn",
}
m["bfi"] = {
"British Sign Language",
33000,
"sgn",
"Latn", -- when documented
}
m["bfj"] = {
"Bafanji",
34890,
"nic-nun",
"Latn",
}
m["bfk"] = {
"Ban Khor Sign Language",
3441103,
"sgn",
}
m["bfl"] = {
"Banda-Ndélé",
34850,
"bad-cnt",
"Latn",
}
m["bfm"] = {
"Mmen",
36132,
"nic-rnc",
"Latn",
}
m["bfn"] = {
"Bunak",
35101,
"paa-tap",
"Latn",
}
m["bfo"] = {
"Malba Birifor",
11150710,
"nic-mre",
"Latn",
}
m["bfp"] = {
"Beba",
35050,
"nic-nge",
"Latn",
}
m["bfq"] = {
"พทคะ", -- Badaga
33205,
"dra-kan",
"Taml, Knda, Mlym",
translit = {
Taml = "Taml-translit",
},
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
}
m["bfr"] = {
"Bazigar",
8829558,
"inc",
}
m["bfs"] = {
"Southern Bai",
12952250,
"sit-bai",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["bft"] = {
"บัลติ", -- Balti
33086,
"sit-lab",
"fa-Arab, Deva, Tibt",
translit = {
Deva = "Deva-translit",
},
override_translit = "Tibt",
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
strip_diacritics = {
["fa-Arab"] = {
from = {"هٔ", "ٱ"},
to = {"ه", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.kashida .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
},
},
}
m["bfu"] = {
"Gahri",
5516952,
"sit-whm",
"Takr, Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["bfw"] = {
"Bondo",
2567942,
"mun",
"Orya",
}
m["bfx"] = {
"Bantayanon",
16837866,
"phi",
"Latn",
}
m["bfy"] = {
"Bagheli",
2356364,
"inc-hie",
"Deva",
ancestors = "inc-oaw",
translit = {
Deva = "Deva-translit",
},
}
m["bfz"] = {
"Mahasu Pahari",
6733460,
"him",
"Deva, Takr",
translit = {
Deva = "Deva-translit",
},
}
m["bga"] = {
"Gwamhi-Wuri",
6707102,
"nic-knn",
"Latn",
}
m["bgb"] = {
"Bobongko",
4935896,
"poz-slb",
"Latn",
}
m["bgc"] = {
"Haryanvi",
33410,
"inc-hiw",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bgd"] = {
"Rathwi Bareli",
7295692,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bge"] = {
"Bauria",
4873579,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bgf"] = {
"Bangandu",
34938,
"gba-sou",
"Latn",
}
m["bgg"] = {
"Bugun",
3514220,
"sit-khb",
"Latn",
}
m["bgi"] = {
"Giangan",
4842057,
"phi",
"Latn",
}
m["bgj"] = {
"Bangolan",
34862,
"nic-nun",
"Latn",
}
m["bgk"] = {
"Bit",
2904868,
"mkh-pal",
"Latn", -- also Hani?
}
m["bgl"] = {
"Bo",
8845514,
"mkh-vie",
}
m["bgo"] = {
"Baga Koga",
35695,
"alv-bag",
"Latn",
}
m["bgq"] = {
"Bagri",
2426319,
"raj",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bgr"] = {
"Bawm Chin",
56765,
"tbq-kuk",
"Latn",
}
m["bgs"] = {
"Tagabawa",
7675121,
"mno",
"Latn",
}
m["bgt"] = {
"Bughotu",
2927723,
"poz-sls",
"Latn",
}
m["bgu"] = {
"Mbongno",
36141,
"nic-mmb",
"Latn",
}
m["bgv"] = {
"Warkay-Bipim",
4915439,
"paa-ani",
"Latn",
}
m["bgw"] = {
"Bhatri",
8841054,
"inc-eas",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bgx"] = {
"Balkan Gagauz Turkish",
2360396,
"trk-ogz",
"Latn",
ancestors = "trk-oat",
}
m["bgy"] = {
"Benggoi",
4887742,
"poz-cma",
"Latn",
}
m["bgz"] = {
"Banggai",
3441692,
"poz-slb",
"Latn",
}
m["bha"] = {
"Bharia",
4901287,
"inc",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bhb"] = {
"Bhili",
33229,
"inc-bhi",
"Deva, Gujr",
translit = {
Deva = "Deva-translit",
Gujr = "Gujr-translit",
},
}
m["bhc"] = {
"Biga",
2902375,
"poz-hce",
"Latn",
}
m["bhd"] = {
"Bhadrawahi",
4900565,
"him",
"Arab, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bhe"] = {
"Bhaya",
8841168,
"raj",
}
m["bhf"] = {
"Odiai",
56690,
"qfa-dis", -- Papuan; no consensus; may be in the Kwomtari family, an isolate and/or distantly related to the
-- Torricelli family.
"Latn",
}
m["bhg"] = {
"Binandere",
3503802,
"paa-bin",
"Latn",
}
m["bhh"] = {
"Bukhari",
56469,
"ira-swi",
"Cyrl, Hebr, Latn, fa-Arab",
ancestors = "tg",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["bhi"] = {
"Bhilali",
4901729,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bhj"] = {
"Bahing",
56442,
"sit-kiw",
"Deva, Latn",
translit = {
Deva = "Deva-translit",
},
}
m["bhl"] = {
"Bimin",
4913743,
"ngf-okk",
"Latn",
}
m["bhm"] = {
"Bathari",
2586893,
"sem-sar",
"Arab, Latn",
}
m["bhn"] = {
"Bohtan Neo-Aramaic",
33230,
"sem-nna",
"Syrc",
}
m["bho"] = {
"โภชปุระ", -- Bhojpuri
33268,
"inc-bih",
"Deva, Kthi",
wikimedia_codes = "bh",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
},
}
m["bhp"] = {
"Bima",
2796873,
"poz-cet",
"Latn",
}
m["bhq"] = {
"Tukang Besi South",
12643975,
"poz-mun",
"Latn",
}
m["bhs"] = {
"Buwal",
3515065,
"cdc-cbm",
"Latn",
}
m["bht"] = {
"Bhattiyali",
4901452,
"him",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bhu"] = {
"Bhunjia",
8841766,
"inc-hal",
"Deva, Orya",
translit = {
Deva = "Deva-translit",
Orya = "Orya-translit",
},
}
m["bhv"] = {
"Bahau",
3502039,
"poz",
"Latn",
}
m["bhw"] = {
"Biak",
1961488,
"poz-hce",
"Latn",
}
m["bhx"] = { -- spurious?
"Bhalay",
8840773,
"inc",
}
m["bhy"] = {
"Bhele",
4901671,
"bnt-kbi",
"Latn",
}
m["bhz"] = {
"Bada",
4840520,
"poz-kal",
"Latn",
}
m["bia"] = {
"Badimaya",
3442745,
"aus-psw",
"Latn",
}
m["bib"] = {
"Bissa",
32934,
"dmn-bbu",
"Latn",
}
m["bic"] = {
"Bikaru",
56342,
"ngf-eng",
"Latn",
}
m["bid"] = {
"Bidiyo",
56258,
"cdc-est",
"Latn",
}
m["bie"] = {
"Bepour",
4890914,
"ngf-kum",
"Latn",
}
m["bif"] = {
"Biafada",
35099,
"alv-ten",
"Latn",
}
m["big"] = {
"Biangai",
8842027,
"paa-kun",
"Latn",
}
m["bij"] = {
"Kwanka",
35598,
"nic-tar",
"Latn",
}
m["bil"] = {
"Bile",
34987,
"nic-jrn",
"Latn",
}
m["bim"] = {
"Bimoba",
34971,
"nic-grm",
"Latn",
}
m["bin"] = {
"Edo",
35375,
"alv-eeo",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.grave .. c.macron .. c.dgrave},
sort_key = {
from = {"ẹ", "gb", "gh", "kh", "kp", "mw", "nw", "ny", "ọ", "rh", "rr", "vb"},
to = {"e" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "k" .. p[2], "m" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "r" .. p[1], "r" .. p[2], "v" .. p[1]} -- "rh" and "rr" are distinct letters, so they get distinct keys
},
}
m["bio"] = {
"Nai",
3508074,
"paa-kwm",
"Latn",
}
m["bip"] = {
"Bila",
2902626,
"bnt-kbi",
"Latn",
}
m["biq"] = {
"Bipi",
2904312,
"poz-aay",
"Latn",
}
m["bir"] = {
"Bisorio",
8844749,
"ngf-eng",
"Latn",
}
m["bit"] = {
"Berinomo",
56447,
"paa-spk",
"Latn",
}
m["biu"] = {
"Biete",
4904687,
"tbq-kuk",
"Latn",
}
m["biv"] = {
"Southern Birifor",
32859745,
"nic-mre",
"Latn",
}
m["biw"] = {
"Kol (Cameroon)",
35582,
"bnt-mka",
"Latn",
}
m["bix"] = {
"Bijori",
3450686,
"mun",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["biy"] = {
"Birhor",
3450469,
"mun",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["biz"] = {
"Baloi",
3450590,
"bnt-ngn",
"Latn",
}
m["bja"] = {
"Budza",
3046889,
"bnt-bun",
"Latn",
}
m["bjb"] = {
"Barngarla",
3439071,
"aus-pam",
"Latn",
}
m["bjc"] = {
"Bariji",
4690919,
"ngf-yar",
"Latn",
}
m["bje"] = {
"Biao-Jiao Mien",
3503800,
"hmx-mie",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["bjf"] = {
"Barzani Jewish Neo-Aramaic",
33234,
"sem-nna",
"Hebr", -- maybe others
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["bjg"] = {
"Bidyogo",
35365,
"alv-bak",
"Latn",
}
m["bjh"] = {
"Bahinemo",
56361,
"paa-spk",
"Latn",
}
m["bji"] = {
"Burji",
34999,
"cus-hec",
"Latn, Ethi",
}
m["bjj"] = {
"Kannauji",
2726867,
"inc-hiw",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bjk"] = {
"Barok",
2884743,
"poz-ocw",
"Latn",
}
m["bjl"] = {
"Bulu (New Guinea)",
4997162,
"poz-ocw",
"Latn",
}
m["bjm"] = {
"Bajelani",
4848866,
"ira-zgr",
"Latn, Arab",
ancestors = "hac",
}
m["bjn"] = {
"บันจาร์", -- Banjarese
33151,
"poz-mly",
"Latn, Arab",
}
m["bjo"] = {
"Mid-Southern Banda",
42303990,
"bad-cnt",
"Latn",
}
m["bjp"] = {
"Fanamaket",
56704263,
"poz-oce",
"Latn",
}
m["bjr"] = {
"Binumarien",
538364,
"ngf-kag",
"Latn",
}
m["bjs"] = {
"Bajan",
2524014,
"crp",
"Latn",
ancestors = "en",
}
m["bjt"] = {
"Balanta-Ganja",
19359034,
"alv-bak",
"Arab, Latn",
}
m["bju"] = {
"Busuu",
35046,
"nic-fru",
"Latn",
}
m["bjv"] = {
"Bedjond",
8829831,
"csu-sar",
"Latn",
}
m["bjw"] = {
"Bakwé",
34899,
"kro-ekr",
"Latn",
}
m["bjx"] = {
"Banao Itneg",
12627559,
"phi",
"Latn",
}
m["bjy"] = {
"Bayali",
4874263,
"aus-pam",
"Latn",
}
m["bjz"] = {
"Baruga",
2886189,
"paa-bin",
"Latn",
}
m["bka"] = {
"Kyak",
35653,
"alv-bwj",
"Latn",
}
m["bkc"] = {
"Baka",
34905,
"nic-nkb",
"Latn",
}
m["bkd"] = {
"บีนูกิด", -- Binukid
4914553,
"mno",
"Latn",
}
m["bkf"] = {
"Beeke",
3441375,
"bnt-kbi",
"Latn",
}
m["bkg"] = {
"Buraka",
35066,
"nic-nkg",
"Latn",
}
m["bkh"] = {
"Bakoko",
34866,
"bnt-bsa",
"Latn",
}
m["bki"] = {
"Baki",
11024697,
"poz-vnc",
"Latn",
}
m["bkj"] = {
"Pande",
36263,
"bnt-ngn",
"Latn",
}
m["bkk"] = { -- written in Balti script
"Brokskat",
2925988,
"inc-shn",
"Tibt, Arab",
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- (NOTE: formerly not present, probably an accidental omission)
}
m["bkl"] = {
"Berik",
378743,
"paa-tkw",
"Latn",
}
m["bkm"] = {
"Kom (Cameroon)",
1656595,
"nic-rnc",
"Latn",
}
m["bkn"] = {
"Bukitan",
3446774,
"poz-bnn",
"Latn",
}
m["bko"] = {
"Kwa'",
35567,
"bai",
"Latn",
}
m["bkp"] = {
"Iboko",
35089,
"bnt-ngn",
"Latn",
}
m["bkq"] = {
"Bakairí",
56846,
"sai-pek",
"Latn",
}
m["bkr"] = {
"Bakumpai",
3436626,
"poz-brw",
"Latn",
}
m["bks"] = {
"Masbate Sorsogon",
16113356,
"phi",
"Latn",
}
m["bkt"] = {
"Boloki",
4144560,
"bnt-zbi",
"Latn",
ancestors = "lse",
}
m["bku"] = {
"Buhid",
1002956,
"phi",
"Latn, Buhd",
translit = {
Buhd = "bku-translit",
},
override_translit = true,
strip_diacritics = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ,
}
},
sort_key = {
Latn = "tl-sortkey",
},
standard_chars = {
Latn = "AaBbKkDdEeFfGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. c.punc,
},
}
m["bkv"] = {
"Bekwarra",
34954,
"nic-ben",
"Latn",
}
m["bkw"] = {
"Bekwel",
34950,
"bnt-bek",
"Latn",
}
m["bkx"] = {
"Baikeno",
11200640,
"poz-tim",
"Latn",
}
m["bky"] = {
"Bokyi",
35087,
"nic-ben",
"Latn",
}
m["bkz"] = {
"Bungku",
2928207,
"poz-btk",
"Latn",
}
m["bla"] = {
"Blackfoot",
33060,
"alg",
"Latn, Cans",
}
m["blb"] = {
"Bilua",
35003,
"qfa-dis", -- Papuan; isolate per Glottolog, Central Solomon per Ross (2005) and Pedrós (2015)
"Latn",
}
m["blc"] = {
"Bella Coola",
977808,
"sal",
"Latn",
}
m["bld"] = {
"Bolango",
3450578,
"phi",
"Latn",
}
m["ble"] = {
"Balanta-Kentohe",
56789,
"alv-bak",
"Latn",
}
m["blf"] = {
"Buol",
2928278,
"phi",
"Latn",
}
m["blg"] = {
"Balau",
4850134,
"poz-mly",
"Latn",
}
m["blh"] = {
"Kuwaa",
35579,
"kro",
"Latn",
}
m["bli"] = {
"Bolia",
34910,
"bnt-mon",
"Latn",
}
m["blj"] = {
"Bulungan",
9229310,
"poz",
"Latn",
}
m["blk"] = {
"กะเหรี่ยงปะโอ", -- Pa'o Karen
7121294,
"kar",
"Mymr",
}
m["bll"] = {
"Biloxi",
2903780,
"sio-ohv",
"Latn",
}
m["blm"] = {
"Beli",
56821,
"csu-bbk",
"Latn",
}
m["bln"] = {
"Southern Catanduanes Bicolano",
7569754,
"phi",
"Latn",
}
m["blo"] = {
"Anii",
34838,
"alv-ntg",
"Latn",
}
m["blp"] = {
"Blablanga",
2905245,
"poz-ocw",
"Latn",
}
m["blq"] = {
"Baluan-Pam",
2881675,
"poz-aay",
"Latn",
}
m["blr"] = {
"Blang",
4925096,
"mkh-pal",
"Latn, Tale, Lana, Thai",
sort_key = { -- FIXME: This needs to be converted into the current standardized format.
from = {"[%pᪧๆ]", "[᩠ᩳ-᩿]", "ᩔ", "ᩕ", "ᩖ", "ᩘ", "([ᨭ-ᨱ])ᩛ", "([ᨷ-ᨾ])ᩛ", "ᩤ", "[็-๎]", "([เแโใไ])([ก-ฮ])"},
to = {"", "", "ᩈᩈ", "ᩁ", "ᩃ", "ᨦ", "%1ᨮ", "%1ᨻ", "ᩣ", "", "%2%1"}
},
}
m["bls"] = {
"Balaesang",
4849796,
"poz",
"Latn",
}
m["blt"] = {
"ไทดำ", -- Tai Dam
56407,
"tai-swe",
"Tavt, Latn",
translit = "Tavt-translit",
sort_key = {
Tavt = {
from = {"[꪿ꫀ꫁ꫂ]", "([ꪵꪶꪹꪻꪼ])([ꪀ-ꪯ])"},
to = {"", "%2%1"}
},
},
}
m["blv"] = {
"Kibala",
4939959,
"bnt-kmb",
"Latn",
}
m["blw"] = {
"Balangao",
4850033,
"phi",
"Latn",
}
m["blx"] = {
"Mag-Indi Ayta",
1931221,
"phi",
"Latn",
}
m["bly"] = {
"Notre",
11009194,
"nic-wov",
"Latn",
}
m["blz"] = {
"Balantak",
4850053,
"poz-slb",
"Latn",
}
m["bma"] = {
"Lame",
3913997,
"nic-jrn",
"Latn",
}
m["bmb"] = {
"Bembe",
4885023,
"bnt-lgb",
"Latn",
}
m["bmc"] = {
"Biem",
4904523,
"poz-ocw",
"Latn",
}
m["bmd"] = {
"Baga Manduri",
35815,
"alv-bag",
"Latn",
}
m["bme"] = {
"Limassa",
11004666,
"nic-nkb",
"Latn",
}
m["bmf"] = {
"Bom",
35088,
"alv-mel",
"Latn",
}
m["bmg"] = {
"Bamwe",
34867,
"bnt-bun",
"Latn",
}
m["bmh"] = {
"Kein",
6383764,
"ngf-kok",
"Latn",
}
m["bmi"] = {
"Bagirmi",
34903,
"csu-bgr",
"Latn",
}
m["bmj"] = {
"Bote-Majhi",
9229570,
"inc-eas",
"Deva",
ancestors = "bh",
translit = {
Deva = "Deva-translit",
},
}
m["bmk"] = {
"Ghayavi",
5555976,
"poz-ocw",
"Latn",
}
m["bml"] = {
"Bomboli",
35055,
"bnt-ngn",
"Latn",
}
m["bmn"] = {
"Bina",
8843664,
"poz-ocw",
"Latn",
}
m["bmo"] = {
"Bambalang",
34868,
"nic-nun",
"Latn",
}
m["bmp"] = {
"Bulgebi",
4996380,
"ngf-fin",
"Latn",
}
m["bmq"] = {
"Bomu",
35065,
"nic-bwa",
"Latn",
}
m["bmr"] = {
"Muinane",
3027894,
"sai-bor",
"Latn",
}
m["bmt"] = {
"Biao Mon",
8842159,
"hmx-mie",
}
m["bmu"] = {
"Somba-Siawari",
5000983,
"ngf-huo",
"Latn",
}
m["bmv"] = {
"Bum",
35058,
"nic-rnc",
"Latn",
}
m["bmw"] = {
"Bomwali",
34984,
"bnt-ndb",
"Latn",
}
m["bmx"] = {
"Baimak",
3450546,
"ngf-han",
"Latn",
}
m["bmz"] = {
"Baramu",
4858315,
"paa-ani",
"Latn",
}
m["bna"] = {
"Bonerate",
4941729,
"poz-mun",
"Latn",
}
m["bnb"] = {
"Bookan",
4943150,
"poz-san",
"Latn",
}
m["bnd"] = {
"Banda",
3504147,
"poz-cma",
"Latn",
}
m["bne"] = {
"Bintauna",
4914533,
"phi",
"Latn",
}
m["bnf"] = {
"Masiwang",
6783305,
"poz-cma",
"Latn",
}
m["bng"] = {
"Benga",
34952,
"bnt-saw",
"Latn",
}
m["bni"] = {
"Bangi",
34936,
"bnt-bmo",
"Latn",
}
m["bnj"] = {
"Eastern Tawbuid",
18757427,
"phi",
"Latn",
}
m["bnk"] = {
"Bierebo",
2902029,
"poz-vnc",
"Latn",
}
m["bnl"] = {
"Boon",
56616,
"cus-eas",
"Latn",
}
m["bnm"] = {
"Batanga",
34979,
"bnt-saw",
"Latn",
}
m["bnn"] = {
"Bunun",
56505,
"map",
"Latn",
}
m["bno"] = {
"อาซี", -- Asi
29490,
"phi",
"Latn",
}
m["bnp"] = {
"Bola",
4938876,
"poz-ocw",
"Latn",
}
m["bnq"] = {
"Bantik",
2883521,
"poz",
"Latn",
}
m["bnr"] = {
"Butmas-Tur",
2928942,
"poz-vnn",
"Latn",
}
m["bns"] = {
"Bundeli",
56399,
"inc-hiw",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bnu"] = {
"Bentong",
4890644,
"poz-ssw",
"Latn",
}
m["bnv"] = {
"Beneraf",
4941733,
"paa-tkw",
"Latn",
}
m["bnw"] = {
"Bisis",
56356,
"paa-spk",
"Latn",
}
m["bnx"] = {
"Bangubangu",
3438330,
"bnt-lbn",
"Latn",
}
m["bny"] = {
"Bintulu",
3450775,
"poz-swa",
"Latn",
}
m["bnz"] = {
"Beezen",
35083,
"nic-ykb",
"Latn",
}
m["boa"] = {
"Bora",
2375468,
"sai-bor",
"Latn",
}
m["bob"] = {
"Aweer",
56526,
"cus-som",
"Latn",
}
m["boe"] = {
"Mundabli",
36127,
"nic-beb",
"Latn",
}
m["bof"] = {
"Bolon",
3913301,
"dmn-emn",
"Latn",
}
m["bog"] = {
"Bamako Sign Language",
4853284,
"sgn",
}
m["boh"] = {
"North Boma",
35080,
"bnt-bdz",
"Latn",
}
m["boi"] = {
"Barbareño",
56391,
"nai-chu",
"Latn",
}
m["boj"] = {
"Anjam",
3504136,
"ngf-min",
"Latn",
}
m["bok"] = {
"Bonjo",
34942,
"alv",
"Latn",
}
m["bol"] = {
"โบล", -- Bole
3436680,
"cdc-wst",
"Latn",
}
m["bom"] = {
"Berom",
35013,
"nic-beo",
"Latn",
}
m["bon"] = {
"Bine",
4914077,
"paa-etf",
"Latn",
}
m["boo"] = {
"Tiemacèwè Bozo",
12643582,
"dmn-snb",
"Latn", -- and others?
}
m["bop"] = {
"Bonkiman",
4942134,
"ngf-fin",
"Latn",
}
m["boq"] = {
"Bogaya",
7207578,
"qfa-dis", -- Papuan; isolate per Glottolog, grouped in Duna-Pogaya family by Voorhoeve (1975), Ross (2005) and Usher (2018)
"Latn",
}
m["bor"] = {
"Borôro",
32986,
"sai-mje",
"Latn",
}
m["bot"] = {
"Bongo",
2910067,
"csu-bbk",
"Latn",
}
m["bou"] = {
"Bondei",
4941378,
"bnt-seu",
"Latn",
}
m["bov"] = {
"Tuwuli",
36974,
"alv-ktg",
"Latn",
}
m["bow"] = {
"Rema",
7311502,
"paa-yam",
"Latn",
}
m["box"] = {
"Buamu",
35157,
"nic-bwa",
"Latn",
}
m["boy"] = {
"Bodo (Central Africa)",
4936715,
"bnt-leb",
"Latn",
}
m["boz"] = {
"Tiéyaxo Bozo",
32860401,
"dmn-snb",
"Latn",
}
m["bpa"] = {
"Daakaka",
1157729,
"poz-vnc",
"Latn",
}
m["bpd"] = {
"Banda-Banda",
3450674,
"bad-cnt",
"Latn",
}
-- bpe (Bauni, Papua New Guinea): not yet accepted; in the Sko/Skou family
m["bpg"] = {
"Bonggo",
4941860,
"poz-ocw",
"Latn",
}
m["bph"] = {
"Botlikh",
56560,
"cau-and",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]},
}
m["bpi"] = {
"Bagupi",
3450697,
"ngf-han",
"Latn",
}
m["bpj"] = {
"Binji",
4914403,
"bnt-lbn",
"Latn",
}
m["bpk"] = {
"Orowe",
7103905,
"poz-cln",
"Latn",
}
m["bpl"] = {
"Broome Pearling Lugger Pidgin",
4975277,
"crp",
"Latn",
ancestors = "ms",
}
m["bpm"] = {
"Biyom",
4919327,
"ngf-rai",
"Latn",
}
m["bpn"] = {
"Dzao Min",
3042189,
"hmx-mie",
}
m["bpo"] = {
"Anasi",
11207813,
"paa-egb",
"Latn",
}
m["bpp"] = {
"Kaure",
20526532,
"paa-kko",
"Latn",
}
m["bpq"] = {
"Banda Malay",
12473442,
"crp",
"Latn",
ancestors = "ms",
}
m["bpr"] = {
"Koronadal Blaan",
16115430,
"phi",
"Latn",
}
m["bps"] = {
"Sarangani Blaan",
16117272,
"phi",
"Latn",
}
m["bpt"] = {
"Barrow Point",
2567916,
"aus-pmn",
"Latn",
}
m["bpu"] = {
"Bongu",
4941930,
"ngf-min",
"Latn",
}
m["bpv"] = {
"Bian Marind",
8841889,
"paa-ani",
"Latn",
}
-- bpw: Bo (Papua New Guinea): pending acceptance; per Wikipedia: "It is essentially undocumented, and its status as a separate language is unconfirmed."
m["bpx"] = {
"Palya Bareli",
7128872,
"inc-bhi",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["bpy"] = {
"Bishnupriya Manipuri",
37059,
"inc-bas",
"Beng",
ancestors = "inc-obn",
}
m["bpz"] = {
"Bilba",
8843362,
"poz-tim",
"Latn",
}
m["bqa"] = {
"Tchumbuli",
11008162,
"alv-ctn",
"Latn",
ancestors = "ak",
}
m["bqb"] = {
"Bagusa",
4842178,
"paa-tkw",
"Latn",
}
m["bqc"] = {
"Boko",
34983,
"dmn-bbu",
"Latn",
}
m["bqd"] = {
"Bung",
3436612,
"nic-bdn",
"Latn",
}
m["bqf"] = {
"Baga Kaloum",
3502293,
"alv-bag",
"Latn",
}
m["bqg"] = {
"Bago-Kusuntu",
34878,
"nic-gne",
}
m["bqh"] = {
"Baima",
674990,
"sit-qia",
}
m["bqi"] = {
"Bakhtiari",
257829,
"ira-swi",
"fa-Arab",
ancestors = "pal",
}
m["bqj"] = {
"Bandial",
34872,
"alv-jol",
"Latn",
}
m["bqk"] = {
"Banda-Mbrès",
3450724,
"bad-cnt",
"Latn",
}
m["bql"] = {
"Bilakura",
4907504,
"ngf-num",
"Latn",
}
m["bqm"] = {
"Wumboko",
37051,
"bnt-kpw",
"Latn",
}
m["bqn"] = {
"Bulgarian Sign Language",
3438325,
"sgn",
}
m["bqo"] = {
"Balo",
34865,
"nic-grs",
"Latn",
}
m["bqp"] = {
"Busa",
35185,
"dmn-bbu",
"Latn",
}
m["bqq"] = {
"Biritai",
56382,
"paa-lkp",
"Latn",
}
m["bqr"] = {
"Burusu",
5001028,
"poz-san",
"Latn",
}
m["bqs"] = {
"Bosngun",
56838,
"paa-ram",
"Latn",
}
m["bqt"] = {
"Bamukumbit",
35078,
"nic-nge",
"Latn",
}
m["bqu"] = {
"Boguru",
3438444,
"bnt-boa",
"Latn",
}
m["bqv"] = {
"Begbere-Ejar",
7194098,
"nic-plc",
"Latn",
}
m["bqw"] = {
"Buru (Nigeria)",
1017152,
"nic-bds",
"Latn",
}
m["bqx"] = {
"Baangi",
3450648,
"nic-kam",
"Latn",
}
m["bqy"] = {
"Bengkala Sign Language",
3322119,
"sgn",
}
m["bqz"] = {
"Bakaka",
34855,
"bnt-mne",
"Latn",
}
m["bra"] = {
"พรัช",
35243,
"inc-hiw",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["brb"] = {
"Lave",
4957737,
"mkh-ban",
}
m["brc"] = {
"Berbice Creole Dutch",
35215,
"crp",
"Latn",
ancestors = "nl",
}
m["brd"] = {
"Baraamu",
56804,
"sit-new",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["brf"] = {
"Bera",
2896850,
"bnt-kbi",
"Latn",
}
m["brg"] = {
"Baure",
2839722,
"awd",
"Latn",
}
m["brh"] = {
"บราฮุอี",
33202,
"dra-nor",
"ur-Arab, Latn",
translit = {["ur-Arab"] = "ur-translit"},
strip_diacritics = {
-- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif
from = {"هٔ", "ۂ", "ٱ"},
to = {"ہ", "ہ", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
}
m["bri"] = {
"Mokpwe",
36428,
"bnt-kpw",
"Latn",
}
m["brj"] = {
"Bieria",
4904607,
"poz-vnc",
"Latn",
}
m["brk"] = {
"Birgid",
56823,
"nub",
"Latn",
}
m["brl"] = {
"Birwa",
3501019,
"bnt-sts",
"Latn",
}
m["brm"] = {
"Barambu",
34893,
"znd",
"Latn",
}
m["brn"] = {
"Boruca",
4946773,
"cba",
"Latn",
}
m["bro"] = {
"Brokkat",
56605,
"sit-tib",
"Tibt, Latn",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["brp"] = {
"Barapasi",
56995,
"paa-egb",
"Latn",
}
m["brq"] = {
"Breri",
4961835,
"paa-ram",
"Latn",
}
m["brr"] = {
"Birao",
2904383,
"poz-sls",
"Latn",
}
m["brs"] = {
"Baras",
8827053,
"poz",
"Latn",
}
m["brt"] = {
"Bitare",
34946,
"nic-tvn",
"Latn",
}
m["bru"] = {
"บรูตะวันออก",
16115463,
"mkh-kat",
"Latn, Laoo, Thai",
--sort_key = {
-- Laoo = "Laoo-sortkey",
-- Thai = "Thai-sortkey",
--},
}
m["brv"] = {
"บรูตะวันตก",
13018531,
"mkh-kat",
"Latn, Laoo, Thai",
--sort_key = {
-- Laoo = "Laoo-sortkey",
-- Thai = "Thai-sortkey",
--},
}
m["brw"] = {
"Bellari",
4883496,
"dra-tlk",
"Knda, Mlym",
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
}
m["brx"] = {
"โบโด",
33223,
"tbq-bdg",
"Deva, Latn",
translit = {
Deva = "Deva-translit",
},
}
m["bry"] = {
"Burui",
5000976,
"paa-ndu",
"Latn",
}
m["brz"] = {
"Bilbil",
4907473,
"poz-ocw",
"Latn",
}
m["bsa"] = {
"Abinomn",
56648,
"qfa-iso", -- Papuan
"Latn",
}
m["bsb"] = {
"Brunei Bisaya",
3450611,
"poz-san",
"Latn",
}
m["bsc"] = {
"Bassari",
35098,
"alv-ten",
"Latn",
}
m["bse"] = {
"Wushi",
36973,
"nic-rnn",
"Latn",
}
m["bsf"] = {
"Bauchi",
34974,
"nic-shi",
"Latn",
}
m["bsg"] = {
"Bashkardi",
33030,
"ira-swi",
"fa-Arab, Latn",
}
m["bsh"] = {
"Kamkata-viri",
2605045,
"nur-nor",
"Latn, Arab",
}
m["bsi"] = {
"Bassossi",
34940,
"bnt-mne",
"Latn",
}
m["bsj"] = {
"Bangwinji",
3446631,
"alv-wjk",
"Latn",
}
m["bsk"] = {
"Burushaski",
216286,
"qfa-iso",
"Arab",
strip_diacritics = {
-- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif
from = {"هٔ", "ۂ", "ٱ"},
to = {"ہ", "ہ", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
}
m["bsl"] = {
"Basa-Gumna",
4866150,
"nic-bas",
"Latn",
}
m["bsm"] = {
"Busami",
5001255,
"poz-hce",
"Latn",
}
m["bsn"] = {
"Barasana",
2883843,
"sai-tuc",
"Latn",
}
m["bso"] = {
"Buso",
3441370,
"cdc-est",
"Latn",
}
m["bsp"] = {
"Baga Sitemu",
36466,
"alv-bag",
"Latn",
}
m["bsq"] = {
"Bassa",
34949,
"kro-wkr",
"Latn, Bass",
}
m["bsr"] = {
"Bassa-Kontagora",
4866152,
"nic-bas",
"Latn",
}
m["bss"] = {
"Akoose",
34806,
"bnt-mne",
"Latn",
}
m["bst"] = {
"Basketo",
56531,
"omv-ome",
"Ethi",
}
m["bsu"] = {
"Bahonsuai",
2879298,
"poz-btk",
"Latn",
}
m["bsv"] = {
"Baga Sobané",
3450433,
"alv-bag",
"Latn",
}
m["bsw"] = {
"Baiso",
56615,
"cus-som",
"Latn",
}
m["bsx"] = {
"Yangkam",
36922,
"nic-tar",
"Latn",
}
m["bsy"] = {
"Sabah Bisaya",
12641557,
"poz-san",
"Latn",
}
m["bta"] = {
"Bata",
56254,
"cdc-cbm",
"Latn",
}
m["btc"] = {
"Bati (Cameroon)",
34944,
"nic-mbw",
"Latn",
}
m["btd"] = {
"Dairi Batak",
2891045,
"btk",
"Latn, Batk",
}
m["bte"] = {
"Gamo-Ningi",
5520366,
"nic-jer",
"Latn",
}
m["btf"] = {
"Birgit",
56302,
"cdc-est",
"Latn",
}
m["btg"] = {
"Gagnoa Bété",
5005069,
"kro-bet",
"Latn",
}
m["bth"] = {
"Biatah Bidayuh",
2900881,
"day",
"Latn",
}
m["bti"] = {
"Burate",
56900,
"paa-egb",
"Latn",
}
m["btj"] = {
"Bacanese Malay",
8828608,
"poz-mly",
"Latn",
}
m["btm"] = {
"Mandailing Batak",
2891049,
"btk",
"Latn, Batk",
}
m["btn"] = {
"Ratagnon",
13197,
"phi",
"Latn",
}
m["bto"] = {
"Iriga Bicolano",
12633026,
"phi",
"Latn",
}
m["btp"] = {
"Budibud",
4985086,
"poz-ocw",
"Latn",
}
m["btq"] = {
"Batek",
860315,
"mkh-asl",
"Latn",
}
m["btr"] = {
"Baetora",
2878874,
"poz-vnn",
"Latn",
}
m["bts"] = {
"Simalungun Batak",
2891054,
"btk",
"Latn, Batk",
}
m["btt"] = {
"Bete-Bendi",
4887064,
"nic-ben",
"Latn",
}
m["btu"] = {
"Batu",
34964,
"nic-tvn",
"Latn",
}
m["btv"] = {
"Bateri",
3812564,
"inc-koh",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["btw"] = {
"Butuanon",
5003156,
"phi",
"Latn",
}
m["btx"] = {
"Karo Batak",
33012,
"btk",
"Latn, Batk",
}
m["bty"] = {
"Bobot",
3446788,
"poz-cma",
"Latn",
}
m["btz"] = {
"Alas-Kluet Batak",
2891042,
"btk",
"Latn, Batk",
}
m["bua"] = {
"บูร์ยัต",
33120,
"xgn-cen",
"Cyrl, Mong, Latn",
wikimedia_codes = "bxr",
ancestors = "cmg",
translit = {
--Cyrl = "bua-translit",
-- Mong translit in [[Module:scripts/data]]
},
override_translit = true,
-- Mong display_text and strip_diacritics in [[Module:scripts/data]]
strip_diacritics = {
Cyrl = {remove_diacritics = c.grave .. c.acute},
},
sort_key = {
Cyrl = {
from = {"ё", "ө", "ү", "һ"},
to = {"е" .. p[1], "о" .. p[1], "у" .. p[1], "х" .. p[1]}
},
},
}
m["bub"] = {
"Bua",
32928,
"alv-bua",
"Latn",
}
m["bud"] = {
"Ntcham",
36266,
"nic-grm",
"Latn",
}
m["bue"] = {
"Beothuk",
56234,
"qfa-unc", -- extinct since 1829, poorly attested; possibly a divergent Algonquian language
"Latn",
}
m["buf"] = {
"Bushoong",
3449964,
"bnt-bsh",
"Latn",
}
m["bug"] = {
"บูกิส",
33190,
"poz-ssw",
"Bugi, Latn",
}
m["buh"] = {
"Younuo Bunu",
56299,
"hmn",
"Latn",
}
m["bui"] = {
"Bongili",
35084,
"bnt-ngn",
"Latn",
}
m["buj"] = {
"Basa-Gurmana",
6432515,
"nic-bas",
"Latn",
}
m["buk"] = {
"Bukawa",
35043,
"poz-ocw",
"Latn",
}
m["bum"] = {
"Bulu (Cameroon)",
35028,
"bnt-btb",
"Latn",
}
m["bun"] = {
"Sherbro",
36339,
"alv-mel",
"Latn",
}
m["buo"] = {
"Terei",
56831,
"paa-sbo",
"Latn",
}
m["bup"] = {
"Busoa",
5002001,
"poz",
"Latn",
}
m["buq"] = {
"Brem",
4960502,
"ngf-nad",
"Latn",
}
m["bus"] = {
"Bokobaru",
9228931,
"dmn-bbu",
"Latn",
}
m["but"] = {
"Bungain",
3450623,
"paa-tor",
"Latn",
}
m["buu"] = {
"Budu",
3450207,
"bnt-nya",
"Latn",
}
m["buv"] = {
"Bun",
56351,
"paa-yua",
"Latn",
}
m["buw"] = {
"Bubi",
35017,
"bnt-tso",
"Latn",
}
m["bux"] = {
"Boghom",
3440412,
"cdc-wst",
"Latn",
}
m["buy"] = {
"Mmani",
35061,
"alv-mel",
"Latn",
}
m["bva"] = {
"Barein",
56285,
"cdc-est",
"Latn",
}
m["bvb"] = {
"Bube",
35110,
"nic-bds",
"Latn",
}
m["bvc"] = {
"Baelelea",
2878833,
"poz-sls",
"Latn",
}
m["bvd"] = {
"Baeggu",
2878850,
"poz-sls",
"Latn",
}
m["bve"] = {
"Berau Malay",
3915770,
"poz-mly",
"Latn",
}
m["bvf"] = {
"Boor",
56250,
"cdc-est",
"Latn",
}
m["bvg"] = {
"Bonkeng",
34958,
"bnt-bbo",
"Latn",
}
m["bvh"] = {
"Bure",
56294,
"cdc-wst",
"Latn",
}
m["bvi"] = {
"Belanda Viri",
35247,
"nic-ser",
"Latn",
}
m["bvj"] = {
"Baan",
3515067,
"nic-ogo",
"Latn",
}
m["bvk"] = {
"Bukat",
4986814,
"poz-bnn",
"Latn",
}
m["bvl"] = {
"Bolivian Sign Language",
1783590,
"sgn",
"Latn", -- when documented
}
m["bvm"] = {
"Bamunka",
34882,
"nic-rnn",
"Latn",
}
m["bvn"] = {
"Buna",
3450516,
"paa-tor",
"Latn",
}
m["bvo"] = {
"Bolgo",
35038,
"alv-bua",
"Latn",
}
m["bvp"] = {
"Bumang",
4997235,
"mkh-pal",
}
m["bvq"] = {
"Birri",
56514,
"csu-bkr",
"Latn",
}
m["bvr"] = {
"Burarra",
4998124,
"aus-arn",
"Latn",
}
m["bvt"] = {
"Bati (Indonesia)",
4869253,
"poz-cma",
"Latn",
}
m["bvu"] = {
"Bukit Malay",
9230148,
"poz-mly",
"Latn",
}
m["bvv"] = {
"Baniva",
3515198,
"awd",
"Latn",
}
m["bvw"] = {
"Boga",
56262,
"cdc-cbm",
"Latn",
}
m["bvx"] = {
"Babole",
35180,
"bnt-ngn",
"Latn",
}
m["bvy"] = {
"Baybayanon",
16839275,
"phi",
"Latn",
}
m["bvz"] = {
"Bauzi",
56360,
"paa-egb",
"Latn",
}
m["bwa"] = {
"Bwatoo",
9232446,
"poz-cln",
"Latn",
}
m["bwb"] = {
"Namosi-Naitasiri-Serua",
3130290,
"poz-pcc",
"Latn",
}
m["bwc"] = {
"Bwile",
3447440,
"bnt-sbi",
"Latn",
}
m["bwd"] = {
"Bwaidoka",
2929111,
"poz-ocw",
"Latn",
}
m["bwe"] = {
"Bwe Karen",
56994,
"kar",
"Mymr, Latn",
}
m["bwf"] = {
"Boselewa",
4947229,
"poz-ocw",
"Latn",
}
m["bwg"] = {
"Barwe",
8826802,
"bnt-sna",
"Latn",
}
m["bwh"] = {
"Bishuo",
34973,
"nic-fru",
"Latn",
}
m["bwi"] = {
"Baniwa",
3501735,
"awd-nwk",
"Latn",
}
m["bwj"] = {
"Láá Láá Bwamu",
11017275,
"nic-bwa",
"Latn",
}
m["bwk"] = {
"Bauwaki",
4873607,
"paa-mal",
"Latn",
}
m["bwl"] = {
"Bwela",
5003678,
"bnt-bun",
"Latn",
}
m["bwm"] = {
"Biwat",
56352,
"paa-yua",
"Latn",
}
m["bwn"] = {
"Wunai Bunu",
56452,
"hmn",
}
m["bwo"] = {
"Shinasha",
56260,
"omv-gon",
"Latn",
}
m["bwp"] = {
"Mandobo Bawah",
12636155,
"ngf-gaw",
"Latn",
}
m["bwq"] = {
"Southern Bobo",
11001714,
"dmn-snb",
"Latn",
}
m["bwr"] = {
"Bura",
56552,
"cdc-cbm",
"Latn",
}
m["bws"] = {
"Bomboma",
9229429,
"bnt-bun",
"Latn",
}
m["bwt"] = {
"Bafaw",
34853,
"bnt-bbo",
"Latn",
}
m["bwu"] = {
"Buli (Ghana)",
35085,
"nic-buk",
"Latn",
}
m["bww"] = {
"Bwa",
3515058,
"bnt-bta",
"Latn",
}
m["bwx"] = {
"Bu-Nao Bunu",
56411,
"hmn",
"Latn",
}
m["bwy"] = {
"Cwi Bwamu",
11150714,
"nic-bwa",
"Latn",
}
m["bwz"] = {
"Bwisi",
35067,
"bnt-sir",
"Latn",
}
m["bxa"] = {
"Bauro",
2892068,
"poz-sls",
"Latn",
}
m["bxb"] = {
"Belanda Bor",
56678,
"sdv-lon",
"Latn",
}
m["bxc"] = {
"Molengue",
13345,
"bnt-kel",
"Latn",
}
m["bxd"] = {
"Pela",
57000,
"tbq-brm",
}
m["bxe"] = {
"Ongota",
36344,
"qfa-unc", -- moribund, no academic consensus on classification; might be an isolate
"Latn",
}
m["bxf"] = {
"Bilur",
2903788,
"poz-ocw",
"Latn",
}
m["bxg"] = {
"Bangala",
34989,
"bnt-bmo",
"Latn",
}
m["bxh"] = {
"Buhutu",
4986329,
"poz-ocw",
"Latn",
}
m["bxi"] = {
"Pirlatapa",
10632195,
"aus-kar",
"Latn",
}
m["bxj"] = {
"Bayungu",
10427485,
"aus-psw",
"Latn",
}
m["bxk"] = {
"Bukusu",
32930,
"bnt-msl",
"Latn",
}
m["bxl"] = {
"Jalkunan",
11009787,
"dmn-jje",
"Latn",
}
m["bxn"] = {
"Burduna",
4998313,
"aus-psw",
"Latn",
}
m["bxo"] = {
"Barikanchi",
3450802,
"crp",
"Latn",
ancestors = "ha",
}
m["bxp"] = {
"Bebil",
34941,
"bnt-btb",
"Latn",
}
m["bxq"] = {
"Beele",
56238,
"cdc-wst",
"Latn",
}
m["bxs"] = {
"Busam",
35189,
"nic-grs",
"Latn",
}
m["bxv"] = {
"Berakou",
56796,
"csu-bgr",
"Latn",
}
m["bxw"] = {
"Banka",
3438402,
"dmn-smg",
"Latn",
}
m["bxz"] = {
"Binahari",
4913840,
"paa-mal",
"Latn",
}
m["bya"] = {
"Palawan Batak",
3450443,
"phi",
"Tagb",
}
m["byb"] = {
"Bikya",
33257,
"nic-fru",
"Latn",
}
m["byc"] = {
"Ubaghara",
36625,
"nic-ucn",
"Latn",
}
m["byd"] = {
"Benyadu'",
11173588,
"day",
"Latn",
}
m["bye"] = {
"Pouye",
7235814,
"paa-spk",
"Latn",
}
m["byf"] = {
"Bete",
32932,
"nic-ykb",
"Latn",
}
m["byg"] = {
"Baygo",
56836,
"sdv-daj",
"Latn",
}
m["byh"] = {
"Bujhyal",
56317,
"sit-gma",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["byi"] = {
"Buyu",
5003401,
"bnt-nyb",
"Latn",
}
m["byj"] = {
"Binawa",
4913807,
"nic-kau",
"Latn",
}
m["byk"] = {
"Biao",
4902547,
"qfa-tak",
"Latn", -- also Hani?
}
m["byl"] = {
"Bayono",
3503856,
"paa-baw",
"Latn",
}
m["bym"] = {
"Bidyara",
8842355,
"aus-pam",
"Latn",
}
m["byn"] = {
"Blin",
56491,
"cus-cen",
"Ethi, Latn",
translit = {Ethi = "Ethi-translit"},
}
m["byo"] = {
"Biyo",
56848,
"tbq-bka",
"Latn, Hani",
sort_key = {Hani = "Hani-sortkey"},
}
m["byp"] = {
"Bumaji",
4997234,
"nic-ben",
"Latn",
}
m["byq"] = {
"Basay",
716647,
"map",
"Latn",
}
m["byr"] = {
"Baruya",
3450812,
"ngf-ang",
"Latn",
}
m["bys"] = {
"Burak",
4998097,
"alv-bwj",
"Latn",
}
m["byt"] = {
"Berti",
35008,
"ssa-sah",
"Latn",
}
m["byv"] = {
"Medumba",
36019,
"bai",
"Latn",
}
m["byw"] = {
"Belhariya",
32961,
"sit-kie",
"Deva",
translit = {
Deva = "Deva-translit",
},
}
m["byx"] = {
"Qaqet",
3503009,
"paa-bng",
"Latn",
}
m["byz"] = {
"Banaro",
56858,
"paa-ram",
"Latn",
}
m["bza"] = {
"Bandi",
34912,
"dmn-msw",
"Latn",
}
m["bzb"] = {
"Andio",
4754487,
"poz-slb",
"Latn",
}
m["bzd"] = {
"Bribri",
28400,
"cba",
"Latn",
}
m["bze"] = {
"Jenaama Bozo",
10950633,
"dmn-snb",
"Latn",
}
m["bzf"] = {
"Boikin",
56829,
"paa-ndu",
"Latn",
}
m["bzg"] = {
"Babuza",
716615,
"map",
"Latn",
}
m["bzh"] = {
"Mapos Buang",
2927370,
"poz-ocw",
"Latn",
}
m["bzi"] = {
"บีซู",
56852,
"tbq-bis",
"Latn, Thai",
--sort_key = {Thai = "Thai-sortkey"},
}
m["bzj"] = {
"Belizean Creole",
1363055,
"crp",
"Latn",
ancestors = "en",
}
m["bzk"] = {
"Nicaraguan Creole",
3504097,
"crp",
"Latn",
ancestors = "en",
}
m["bzl"] = { -- supposedly also called "Bolano", but I can find no evidence of that
"Boano (Sulawesi)",
4931258,
"poz",
"Latn",
}
m["bzm"] = {
"Bolondo",
35071,
"bnt-bun",
"Latn",
}
m["bzn"] = {
"Boano (Maluku)",
4931255,
"poz-cma",
"Latn",
}
m["bzo"] = {
"Bozaba",
4952785,
"bnt-ngn",
"Latn",
}
m["bzp"] = {
"Kemberano",
12634399,
"ngf-sbh",
"Latn",
}
m["bzq"] = {
"Buli (Indonesia)",
2927952,
"poz-hce",
"Latn",
}
m["bzr"] = {
"Biri",
4087011,
"aus-pam",
"Latn",
}
m["bzs"] = {
"Brazilian Sign Language",
3436689,
"sgn",
"Latn",
}
m["bzu"] = {
"Burmeso",
56746,
"qfa-dis", -- isolate in Glottolog, Wurm and Foley; in East Bird's Head-Sentani family by Ross
"Latn",
}
m["bzv"] = {
"Bebe",
34977,
"nic-bbe",
"Latn",
}
m["bzw"] = {
"Basa",
34898,
"nic-bas",
"Latn",
}
m["bzx"] = {
"Hainyaxo Bozo",
11159536,
"dmn-snb",
"Latn",
}
m["bzy"] = {
"Obanliku",
36276,
"nic-ben",
"Latn",
}
m["bzz"] = {
"Evant",
35259,
"nic-tvc",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
ti53ede14bo2w134dbnx4bjhvzo90qy
มอดูล:languages/data/3/a
828
36386
5720752
5684149
2026-04-21T07:00:47Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720752
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data): on the first call, `u` replaces itself with `char` from [[Module:string utilities]].
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["aaa"] = {
"Ghotuo",
35463,
"alv-yek",
"Latn",
}
m["aab"] = {
"Alumu-Tesu",
35034,
"nic-alu",
"Latn",
}
m["aac"] = {
"Ari",
1811224,
"ngf-gsu",
"Latn",
}
m["aad"] = {
"Amal",
56708,
"paa-spk",
"Latn",
}
-- "aae" is treated as "sq", see [[WT:LT]]
m["aaf"] = {
"Aranadan",
3507928,
"dra-mal",
"Mlym",
-- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["aag"] = {
"Ambrak",
4741706,
"paa-tor",
"Latn",
}
m["aah"] = {
"Abu'",
4670715,
"paa-tor",
"Latn",
}
m["aai"] = {
"Arifama-Miniafia",
4790560,
"poz-ocw",
"Latn",
}
m["aak"] = {
"Ankave",
3446690,
"ngf-ang",
"Latn",
}
m["aal"] = {
"Afade",
56434,
"cdc-cbm",
"Latn",
}
m["aan"] = {
"Anambé",
3507873,
"tup-gua",
"Latn",
}
m["aap"] = {
"Arára (Pará)",
56807,
"sai-pek",
"Latn",
}
m["aaq"] = {
"Penobscot",
3515185,
"alg-abp",
"Latn",
}
m["aas"] = {
"Aasax",
56620,
"cus-sou",
"Latn",
}
-- "aat" is treated as "sq", see [[WT:LT]]
m["aau"] = {
"Abau",
3073568,
"paa-spk",
"Latn",
}
m["aaw"] = {
"Solong",
7558834,
"poz-ocw",
"Latn",
}
m["aax"] = {
"Mandobo Atas",
12636156,
"ngf-gaw",
"Latn",
}
m["aaz"] = {
"Amarasi",
4740192,
"poz-tim",
"Latn",
}
m["aba"] = {
"อาเบ",
34833,
"alv-lag",
"Latn",
}
m["abb"] = {
"Bankon",
34860,
"bnt-bsa",
"Latn",
}
m["abc"] = {
"Ambala Ayta",
3448896,
"phi",
"Latn",
}
m["abd"] = {
"Camarines Norte Agta",
3399682,
"phi",
"Latn",
}
m["abe"] = {
"Abenaki",
17502788,
"alg-abp",
"Latn",
}
m["abf"] = {
"Abai Sungai",
4663287,
"poz-san",
"Latn",
}
m["abg"] = {
"Abaga",
3507954,
"ngf-kag",
"Latn",
}
m["abh"] = {
"อาหรับแบบทาจิกิสถาน",
56833,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["abi"] = {
"Abidji",
34781,
"alv-lag",
"Latn",
}
m["abj"] = {
"Aka-Bea",
2356391,
"qfa-ads",
"Latn",
}
m["abl"] = {
"Abung",
49215,
"poz-lgx",
"Latn",
}
m["abm"] = {
"Abanyom",
7502,
"nic-eko",
"Latn",
}
m["abn"] = {
"Abua",
34835,
"nic-cde",
"Latn",
}
m["abo"] = {
"Abon",
35121,
"nic-tvn",
"Latn",
}
m["abp"] = {
"Abenlen Ayta",
3436621,
"phi",
"Latn",
}
m["abq"] = {
"อาบาซา",
27567,
"cau-abz",
"Cyrl, Latn",
translit = {
Cyrl = "abq-translit"
},
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {
"гъв", "гъь", "гӏв", "джв", "джь", "къв", "къь", "кӏв", "кӏь", "хъв", "хӏв", "чӏв", -- 3 chars
"гв", "гъ", "гь", "гӏ", "дж", "дз", "ё", "жв", "жь", "кв", "къ", "кь", "кӏ", "ль", "лӏ", "пӏ", "тл", "тш", "тӏ", "фӏ", "хв", "хъ", "хь", "хӏ", "цӏ", "чв", "чӏ", "шв", "шӏ" -- 2 chars
},
to = {
"г" .. p[3], "г" .. p[4], "г" .. p[7], "д" .. p[2], "д" .. p[3], "к" .. p[3], "к" .. p[4], "к" .. p[7], "к" .. p[8], "х" .. p[3], "х" .. p[6], "ч" .. p[3],
"г" .. p[1], "г" .. p[2], "г" .. p[5], "г" .. p[6], "д" .. p[1], "д" .. p[4], "е" .. p[1], "ж" .. p[1], "ж" .. p[2], "к" .. p[1], "к" .. p[2], "к" .. p[5], "к" .. p[6], "л" .. p[1], "л" .. p[2], "п" .. p[1], "т" .. p[1], "т" .. p[2], "т" .. p[3], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[4], "х" .. p[5], "ц" .. p[1], "ч" .. p[1], "ч" .. p[2], "ш" .. p[1], "ш" .. p[2]
}
},
},
}
-- "abr" Abron is treated as "ak" Akan, see [[WT:LT]]
m["abs"] = {
"Ambonese Malay",
3124354,
"crp",
"Latn",
ancestors = "ms",
}
m["abt"] = {
"Ambulas",
3508015,
"paa-ndu",
"Latn",
}
m["abu"] = {
"Abure",
34767,
"alv-ptn",
"Latn",
}
m["abv"] = {
"อาหรับแบบบาห์เรน",
56576,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["abw"] = {
"Pal",
7126121,
"ngf-mad",
"Latn",
}
m["abx"] = {
"Inabaknon",
2820163,
"poz-sbj",
"Latn",
}
m["aby"] = {
"Aneme Wake",
3508107,
"ngf-yar",
"Latn",
}
m["abz"] = {
"Abui",
2822110,
"paa-tap",
"Latn",
}
m["aca"] = {
"Achagua",
2822982,
"awd",
"Latn",
}
m["acb"] = {
"Áncá",
11130787,
"nic-mom",
"Latn",
}
m["acd"] = {
"Gikyode",
35256,
"alv-gng",
"Latn",
}
m["ace"] = {
"อาเจะฮ์",
27683,
"cmc",
"Latn, ms-Arab",
standard_chars = {
Latn = "AaBbCcDdEeÉéÈèËëFfGgHhIiJjKkLlMmNnOoÔôÖöPpQqRrSsTtUuVvWwXxYyZz", -- current orthography (Arab script not yet added)
c.punc
},
}
m["ach"] = {
"Acholi",
34926,
"sdv-los",
"Latn",
}
m["aci"] = {
"Aka-Cari",
2670418,
"qfa-adn",
"Latn",
}
m["ack"] = {
"Aka-Kora",
3433680,
"qfa-adn",
"Latn",
}
m["acl"] = {
"Akar-Bale",
3436825,
"qfa-ads",
"Latn",
}
m["acm"] = {
"อาหรับแบบอิรัก",
56232,
"sem-arb",
"Arab, Hebr",
strip_diacritics = {
Arab = "ar-stripdiacritics",
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["acn"] = {
"Achang",
56582,
"tbq-brm",
"Latn",
}
m["acp"] = {
"Eastern Acipa",
5329945,
"nic-kmk",
"Latn",
}
m["acr"] = {
"Achi",
34774,
"myn",
"Latn",
}
m["acs"] = {
"Acroá",
2829146,
"sai-cje",
"Latn",
}
m["acu"] = {
"Achuar",
2823170,
"sai-jiv",
"Latn",
}
m["acv"] = {
"Achumawi",
56661,
"nai-pal",
"Latn",
}
m["acw"] = {
"อาหรับแบบฮิญาซ",
56608,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["acx"] = {
"อาหรับแบบโอมาน",
56630,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["acy"] = {
"อาหรับแบบไซปรัส",
56416,
"sem-arb",
"Latn, Grek",
ancestors = "acm",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.breve},
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
standard_chars = {
Latn = "AaBbCcDdΔδEeFfGgĠġĊċIiJjKkLlMmNnOoPpΘθRrSsTtUuVvWwXxYyZzŞş",
c.punc
},
}
m["acz"] = {
"Acheron",
34769,
"alv-tal",
"Latn",
}
m["ada"] = {
"Adangme",
35141,
"alv-gda",
"Latn",
}
m["adb"] = {
"Atauran",
125421255,
"poz-cet",
"Latn",
}
m["add"] = {
"Dzodinka",
35266,
"nic-nka",
"Latn",
}
m["ade"] = {
"Adele",
27740,
"alv-ntg",
"Latn",
}
m["adf"] = {
"Dhofari Arabic",
56565,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["adg"] = {
"Andegerebinha",
3508123,
"aus-pam",
"Latn",
}
m["adh"] = {
"Adhola",
1971400,
"sdv-los",
"Latn",
}
m["adi"] = {
"Adi",
56440,
"sit-tan",
"Latn",
}
m["adj"] = {
"Adioukrou",
34738,
"alv-lag",
"Latn",
}
m["adl"] = {
"Galo",
2857892,
"sit-tan",
"Latn",
}
m["adn"] = {
"Adang",
3398276,
"paa-tap",
"Latn",
}
m["ado"] = {
"Abu",
56659,
"paa-ram",
"Latn",
}
m["adp"] = {
"Adap",
3512402,
"sit-tib",
"Tibt",
ancestors = "dz",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["adq"] = {
"Adangbe",
34730,
"alv-gda",
"Latn",
ancestors = "ada",
}
m["adr"] = {
"Adonara",
4684505,
"poz-cet",
"Latn",
}
m["ads"] = {
"Adamorobe Sign Language",
27709,
"sgn",
"Latn", -- when documented
}
m["adt"] = {
"Adnyamathanha",
2225391,
"aus-psw",
"Latn",
}
m["adu"] = {
"Aduge",
34734,
"alv-nwd",
"Latn",
ancestors = "opa",
}
m["adw"] = {
"Amondawa",
12626847,
"tup-gua",
"Latn",
}
m["ady"] = {
"อะดีเกยา",
27776,
"cau-cir",
"Cyrl, Latn, Arab",
translit = {
Cyrl = "cau-cir-translit",
Arab = "ar-translit",
},
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {
"кхъу", "къӏу", -- 4 chars
"гъу", "джу", "дзу", "жъу", "къу", "кхъ", "къӏ", "кӏу", "кӏь", "лъу", "лӏу", "пӏу", "сӏу", "тӏу", "фӏу", "хъу", "цӏу", "чъу", "чӏу", "шъу", "шӏу", "щӏу", -- 3 chars
"гу", "гъ", "гь", "дж", "дз", "ё", "жъ", "жь", "ку", "къ", "кь", "кӏ", "лъ", "ль", "лӏ", "пӏ", "сӏ", "тӏ", "фӏ", "ху", "хъ", "хь", "цу", "цӏ", "чу", "чъ", "чӏ", "шъ", "шӏ", "щӏ", "ӏу", "ӏь" -- 2 chars
},
to = {
"к" .. p[5], "к" .. p[7],
"г" .. p[3], "д" .. p[2], "д" .. p[4], "ж" .. p[2], "к" .. p[3], "к" .. p[4], "к" .. p[6], "к" .. p[10], "к" .. p[11], "л" .. p[2], "л" .. p[5], "п" .. p[2], "с" .. p[2], "т" .. p[2], "ф" .. p[2], "х" .. p[3], "ц" .. p[3], "ч" .. p[3], "ч" .. p[5], "ш" .. p[2], "ш" .. p[4], "щ" .. p[2],
"г" .. p[1], "г" .. p[2], "г" .. p[4], "д" .. p[1], "д" .. p[3], "е" .. p[1], "ж" .. p[1], "ж" .. p[3], "к" .. p[1], "к" .. p[2], "к" .. p[8], "к" .. p[9], "л" .. p[1], "л" .. p[3], "л" .. p[4], "п" .. p[1], "с" .. p[1], "т" .. p[1], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[4], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2], "ч" .. p[4], "ш" .. p[1], "ш" .. p[3], "щ" .. p[1], "ӏ" .. p[1], "ӏ" .. p[2]
}
},
},
}
m["adz"] = {
"Adzera",
3327445,
"poz-ocw",
"Latn",
}
m["aea"] = {
"Areba",
3509129,
"aus-pam",
"Latn",
}
m["aeb"] = {
"อาหรับแบบตูนิเซีย",
56240,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["aed"] = {
"Argentine Sign Language",
3322073,
"sgn",
"Latn", -- when documented
}
m["aee"] = {
"Northeast Pashayi",
12642198,
"inc-pas",
"fa-Arab, Latn",
}
m["aek"] = {
"Haeke",
5638166,
"poz-cln",
"Latn",
}
m["ael"] = {
"Ambele",
34818,
"nic-grf",
"Latn",
}
m["aem"] = {
"Arem",
3507920,
"mkh-vie",
"Latn",
}
m["aen"] = {
"Armenian Sign Language",
3446604,
"sgn",
}
m["aeq"] = {
"Aer",
3246741,
"inc-wes",
"Arab",
}
m["aer"] = {
"Eastern Arrernte",
10728232,
"aus-pam",
"Latn",
}
m["aes"] = {
"Alsea",
2395641,
nil,
"Latn",
}
m["aeu"] = {
"Akeu",
4700657,
"tbq-sil",
"Latn",
}
m["aew"] = {
"Ambakich",
56642,
"paa-eke",
"Latn",
}
m["aey"] = {
"Amele",
3508025,
"ngf-gum",
"Latn",
}
m["aez"] = {
"Aeka",
16110528,
"paa-bin",
"Latn",
}
m["afb"] = {
"Gulf Arabic",
56385,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["afd"] = {
"Andai",
4753480,
"paa-arf",
"Latn",
}
m["afe"] = {
"Putukwam",
3914930,
"nic-ben",
"Latn",
}
m["afg"] = {
"Afghan Sign Language",
4689093,
"sgn",
}
m["afh"] = {
"Afrihili",
384707,
"art",
"Latn",
type = "appendix-constructed",
}
m["afi"] = {
"Akrukay",
57003,
"paa-ram",
"Latn",
}
m["afk"] = {
"Nanubae",
6964416,
"paa-arf",
"Latn",
}
m["afn"] = {
"Defaka",
35174,
"nic",
"Latn",
}
m["afo"] = {
"Eloyi",
3914066,
"nic-plt",
"Latn",
}
m["afp"] = {
"Tapei",
16887371,
"paa-arf",
"Latn",
}
m["afs"] = {
"Afro-Seminole Creole",
27867,
"crp",
"Latn",
ancestors = "en",
}
m["aft"] = {
"Afitti",
3400829,
"sdv-nyi",
"Latn",
}
m["afu"] = {
"Awutu",
34847,
"alv-gng",
"Latn",
}
m["afz"] = {
"Obokuitai",
7075258,
"paa-lkp",
"Latn",
}
m["aga"] = {
"Aguano",
3331203,
nil,
"Latn",
}
m["agb"] = {
"Legbo",
35584,
"nic-uce",
"Latn",
}
m["agc"] = {
"Agatu",
34732,
"alv-ido",
"Latn",
}
m["agd"] = {
"Agarabi",
3399642,
"ngf-kag",
"Latn",
}
m["age"] = {
"Angal",
10951553,
"ngf-eng",
"Latn",
}
m["agf"] = {
"Arguni",
12473346,
"poz-cet",
"Latn",
}
m["agg"] = {
"Angor",
3508100,
"paa-sng",
"Latn",
}
m["agh"] = {
"Ngelima",
7022266,
"bnt-bta",
"Latn",
}
m["agi"] = {
"Agariya",
663586,
"mun",
"Deva",
translit = "Deva-translit",
}
m["agj"] = {
"Argobba",
29292,
"sem-eth",
"Ethi",
}
m["agk"] = {
"Isarog Agta",
6078982,
"phi",
"Latn",
}
m["agl"] = {
"Fembe",
372927,
"ngf-est",
"Latn",
}
m["agm"] = {
"Angaataha",
3508001,
"ngf-ang",
"Latn",
}
m["agn"] = {
"อากูตายา",
3399717,
"phi-kal",
"Latn",
}
m["ago"] = {
"Tainae",
7676186,
"ngf-ang",
"Latn",
}
m["agq"] = {
"Aghem",
34737,
"nic-rnw",
"Latn",
}
m["agr"] = {
"Aguaruna",
1526530,
"sai-jiv",
"Latn",
}
m["ags"] = {
"Esimbi",
35260,
"nic-bds",
"Latn",
}
m["agt"] = {
"Central Cagayan Agta",
5017296,
"phi",
"Latn",
}
m["agu"] = {
"Aguacateca",
35091,
"myn",
"Latn",
}
m["agv"] = {
"Remontado Agta",
3508085,
"phi",
"Latn",
}
m["agw"] = {
"Kahua",
3191906,
"poz-sls",
"Latn",
}
m["agx"] = {
"Aghul",
36498,
"cau-esm",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = s["cau-Cyrl-displaytext"],
strip_diacritics = s["cau-Cyrl-stripdiacritics"],
sort_key = {
from = {"аь", "гъ", "гь", "гӏ", "дж", "ё", "къ", "кь", "кӏ", "оь", "пӏ", "тӏ", "уь", "хъ", "хь", "хӏ", "цӏ", "чӏ"},
to = {"а" .. p[1], "г" .. p[1], "г" .. p[2], "г" .. p[3], "д" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "о" .. p[1], "п" .. p[1], "т" .. p[1], "у" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[3], "ц" .. p[1], "ч" .. p[1]}
},
}
m["agy"] = {
"Southern Alta",
7569611,
"phi",
"Latn",
}
m["agz"] = {
"Mount Iriga Agta",
6921432,
"phi",
"Latn",
}
m["aha"] = {
"Ahanta",
34729,
"alv-ctn",
"Latn",
}
m["ahb"] = {
"Axamb",
2874710,
"poz-vnc",
"Latn",
}
m["ahg"] = {
"Qimant",
35663,
"cus-cen",
"Latn",
}
m["ahh"] = {
"Aghu",
3436645,
"ngf-gaw",
"Latn",
}
m["ahi"] = {
"Tiagba",
3400073,
"kro-aiz",
"Latn",
}
m["ahk"] = {
"อาข่า",
56643,
"tbq-han",
"Latn, Mymr, Thai",
sort_key = {
Thai = {
from = {"[%pๆ]", "[็-๎]", "([เแโใไ])([ก-ฮ])"},
to = {"", "", "%2%1"}
},
},
}
m["ahl"] = {
"Igo",
35412,
"alv-ktg",
"Latn",
}
m["ahm"] = {
"Mobu",
35967,
"kro-aiz",
"Latn",
}
m["ahn"] = {
"Àhàn",
34723,
"alv-aah",
"Latn",
}
m["aho"] = {
"อาหม",
34778,
"tai-swe",
"Ahom",
translit = "Ahom-translit",
}
m["ahp"] = {
"Apro",
34810,
"alv-kwa",
"Latn",
}
m["ahr"] = {
"Ahirani",
15549890,
"raj",
"Deva",
translit = "Deva-translit",
}
m["ahs"] = {
"Ashe",
34823,
"nic-plc",
"Latn",
}
m["aht"] = {
"Ahtna",
21058,
"ath-nor",
"Latn",
}
m["aia"] = {
"Arosi",
2863483,
"poz-sls",
"Latn",
}
m["aib"] = {
"Äynu",
27927,
"qfa-mix",
"Arab, Latn",
ancestors = "ug, fa"
}
m["aic"] = {
"Ainbai",
3332149,
"paa-brd",
"Latn",
}
m["aid"] = {
"Alngith",
3279409,
"aus-pmn",
"Latn",
}
m["aie"] = {
"Amara",
2841180,
"poz-ocw",
"Latn",
}
m["aif"] = {
"Agi",
3331491,
"paa-tor",
"Latn",
}
m["aig"] = {
"Antigua and Barbuda Creole English",
3244184,
"crp",
"Latn",
ancestors = "en",
}
m["aih"] = {
"Ai-Cham",
2827749,
"qfa-kms",
"Latn, Hani",
sort_key = {
Hani = "Hani-sortkey"
},
}
m["aii"] = {
"Assyrian Neo-Aramaic",
29440,
"sem-nna",
"Syrc",
translit = "aii-translit",
strip_diacritics = "Syrc-stripdiacritics",
}
m["aij"] = {
"Lishanid Noshan",
3436467,
"sem-nna",
"Hebr",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["aik"] = {
"Ake",
34808,
"nic-pls",
"Latn",
}
m["ail"] = {
"Aimele",
3327418,
"ngf-bos",
"Latn",
}
m["aim"] = {
"Aimol",
4697175,
"tbq-kuk",
"Latn, Beng",
}
m["ain"] = {
"ไอนุ",
27969,
"qfa-ain",
"Kana, Latn, Cyrl",
sort_key = {
Kana = "Kana-sortkey"
},
}
m["aio"] = {
"อ่ายตน",
3399725,
"tai-swe",
"Mymr",
translit = "aio-phk-translit",
display_text = s["aio-displaytext"],
strip_diacritics = s["aio-stripdiacritics"],
}
m["aip"] = {
"Burumakok",
5000984,
"ngf-okk",
"Latn",
}
m["air"] = {
"Airoran",
3321131,
"paa-tkw",
"Latn",
}
m["ait"] = {
"Arikem",
3446679,
"tup",
"Latn",
}
m["aiw"] = {
"Aari",
7495,
"omv-aro",
"Latn",
}
m["aix"] = {
"Aighon",
3504287,
"poz-ocw",
"Latn",
}
m["aiy"] = {
"Ali",
34814,
"gba-eas",
"Latn",
}
m["aja"] = {
"Aja",
3237491,
"csu-bkr",
"Latn",
}
m["ajg"] = {
"Adja",
35035,
"alv-gbe",
"Latn",
}
m["aji"] = {
"Ajië",
2828867,
"poz-cln",
"Latn",
}
m["ajn"] = {
"Andajin",
16111302,
"aus-wor",
"Latn",
}
m["ajp"] = {
"อาหรับแบบลิแวนต์ใต้",
55633582,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["ajw"] = {
"Ajawa",
56645,
"cdc-wst",
"Latn",
}
m["ajz"] = {
"Amri Karbi",
3508092,
"tbq-kuk",
"Latn",
ancestors = "mjw",
}
m["akb"] = {
"Angkola Batak",
2640686,
"btk",
"Latn, Batk",
}
m["akc"] = {
"Mpur",
3327139,
"qfa-iso", -- Papuan; based on Palmer (2018), Ethnologue and Glottolog
"Latn",
}
m["akd"] = {
"Ukpet-Ehom",
36618,
"nic-ucr",
"Latn",
}
m["ake"] = {
"Akawaio",
28059,
"sai-pem",
"Latn",
}
m["akf"] = {
"Akpa",
34801,
"alv-ido",
"Latn",
}
m["akg"] = {
"Anakalangu",
4750964,
"poz-cet",
"Latn",
}
m["akh"] = {
"Angal Heneng",
10950354,
"ngf-eng",
"Latn",
}
m["aki"] = {
"Aiome",
56735,
"paa-ram",
"Latn",
}
m["akj"] = {
"Jeru",
2919121,
"qfa-adn",
"Latn, Deva",
translit = {
Deva = "Deva-translit",
},
}
m["akk"] = {
"แอกแคด",
35518,
"sem-eas",
"Xsux, Latn",
}
m["akl"] = {
"อักลัน",
8773,
"phi",
"Latn",
}
m["akm"] = {
"Aka-Bo",
35361,
"qfa-adn",
"Latn",
}
m["ako"] = {
"Akurio",
56650,
"sai-tar",
"Latn",
}
m["akp"] = {
"Siwu",
36470,
"alv-ntg",
"Latn",
}
m["akq"] = {
"Ak",
56654,
"paa-spk",
"Latn",
}
m["akr"] = {
"Araki",
2699882,
"poz-vnn",
"Latn",
}
m["aks"] = {
"Akaselem",
34817,
"nic-grm",
"Latn",
}
m["akt"] = {
"Akolet",
3330162,
"poz-ocw",
"Latn",
}
m["aku"] = {
"Akum",
34799,
"nic-ykb",
"Latn",
}
m["akv"] = {
"Akhvakh",
56423,
"cau-and",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = s["cau-Cyrl-displaytext"],
strip_diacritics = s["cau-Cyrl-stripdiacritics"],
}
m["akw"] = {
"Akwa",
34802,
"bnt-mbo",
"Latn",
}
m["akx"] = {
"Aka-Kede",
3436816,
"qfa-adc",
"Latn",
}
m["aky"] = {
"Aka-Kol",
3436784,
"qfa-adc",
"Latn",
}
m["akz"] = {
"แอละแบมา",
1815020,
"nai-mus",
"Latn",
}
m["ala"] = {
"Alago",
34813,
"alv-ido",
"Latn",
}
m["alc"] = {
"Kawésqar",
56544,
"aqa",
"Latn",
}
m["ald"] = {
"Alladian",
34837,
"alv-lag",
"Latn",
}
m["ale"] = {
"Aleut",
27210,
"esx",
"Latn, Cyrl",
}
m["alf"] = {
"Alege",
34815,
"nic-ben",
"Latn",
}
m["alh"] = {
"Alawa",
2147917,
"aus-gun",
"Latn",
}
m["ali"] = {
"Amaimon",
3327427,
"ngf-mad",
"Latn",
}
m["alj"] = {
"Alangan",
3327423,
"phi",
"Latn",
}
m["alk"] = {
"Alak",
2714690,
"mkh",
"Latn",
}
m["all"] = {
"Allar",
3393634,
"dra-mal",
"Mlym",
-- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
-- "aln" is treated as "sq", see [[WT:LT]]
m["alm"] = {
"Amblong",
11022615,
"poz-vnn",
"Latn",
}
m["alo"] = {
"Larike-Wakasihu",
3217929,
"poz-cma",
"Latn",
}
m["alp"] = {
"Alune",
3327367,
"poz-cet",
"Latn",
}
m["alq"] = {
"Algonquin",
28092,
"alg",
"Latn, Cans",
ancestors = "oj",
}
m["alr"] = {
"Alutor",
28213,
"qfa-ckn",
"Cyrl",
strip_diacritics = {
from = {"['’]"},
to = {"ʼ"}
},
sort_key = {
from = {"вʼ", "гʼ", "ғ", "ә", "ё", "ӄ", "ӈ"},
to = {"в" .. p[1], "г" .. p[1], "г" .. p[2], "е" .. p[1], "е" .. p[2], "к" .. p[1], "н" .. p[1]}
},
}
m["alt"] = {
"อัลไตใต้",
1991779,
"trk-kkp",
"Cyrl",
translit = "Altai-translit",
sort_key = {
from = {"ј", "ё", "ҥ", "ӧ", "ӱ"},
to = {"д" .. p[1], "е" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]}
},
}
m["alu"] = {
"'Are'are",
5160,
"poz-sls",
"Latn",
}
m["alw"] = {
"Alaba",
56652,
"cus-hec",
"Latn",
}
m["alx"] = {
"Amol",
3504260,
"paa-tor",
"Latn",
}
m["aly"] = {
"Alyawarr",
3327389,
"aus-pam",
"Latn",
}
m["alz"] = {
"Alur",
56507,
"sdv-los",
"Latn",
}
m["ama"] = {
"Amanayé",
3508053,
"tup-gua",
"Latn",
}
m["amb"] = {
"Ambo",
3450142,
"nic-tvn",
"Latn",
}
m["amc"] = {
"Amahuaca",
2669150,
"sai-pan",
"Latn",
}
m["ame"] = {
"Yanesha'",
3088540,
"awd",
"Latn",
}
m["amf"] = {
"Hamer-Banna",
35764,
"omv-aro",
"Latn, Ethi",
sort_key = "amf-utilities"
}
m["amg"] = {
"Amurdag",
3360016,
"aus-wdj",
"Latn",
}
m["ami"] = {
"Amis",
35132,
"map",
"Latn",
}
m["amj"] = {
"Amdang",
28335,
"ssa-fur",
"Latn",
}
m["amk"] = {
"Ambai",
1875885,
"poz-hce",
"Latn",
}
m["aml"] = {
"War-Jaintia",
56321,
"aav-khs",
"Latn",
}
m["amm"] = {
"Ama",
3446626,
"paa-lem",
"Latn",
}
m["amn"] = {
"Amanab",
3327399,
"paa-brd",
"Latn",
}
m["amo"] = {
"Amo",
34826,
"nic-kne",
"Latn",
}
m["amp"] = {
"Alamblak",
56688,
"paa-spk",
"Latn",
}
m["amq"] = {
"Amahai",
3327384,
"poz-cma",
"Latn",
}
m["amr"] = {
"Amarakaeri",
35128,
"sai-har",
"Latn",
}
m["ams"] = {
"อามามิโอชิมะใต้",
2840986,
"jpx-nry",
"Jpan",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["amt"] = {
"Amto",
56517,
"paa-amu",
"Latn",
}
m["amu"] = {
"Guerrero Amuzgo",
3501942,
"omq",
"Latn",
}
m["amv"] = {
"Ambelau",
2669214,
"poz-cma",
"Latn",
}
m["amw"] = {
"Western Neo-Aramaic",
34226,
"sem-arw",
"Armi, Syrc, Latn",
strip_diacritics = {
Syrc = "Syrc-stripdiacritics"
},
}
m["amx"] = {
"Anmatyerre",
10412317,
"aus-pam",
"Latn",
}
m["amy"] = {
"Ami",
10408315,
"aus-dal",
"Latn",
}
m["amz"] = {
"Atampaya",
3446651,
"aus-pam",
"Latn",
}
m["ana"] = {
"Andaqui",
2846078,
nil,
"Latn",
}
m["anb"] = {
"Andoa",
2846171,
"sai-zap",
"Latn",
}
m["anc"] = {
"Ngas",
35999,
"cdc-wst",
"Latn",
}
m["and"] = {
"Ansus",
3513300,
"poz-hce",
"Latn",
}
m["ane"] = {
"คังรังชือ",
3571097,
"poz-cln",
"Latn",
}
m["anf"] = {
"Animere",
34783,
"alv-ktg",
"Latn",
}
m["ang"] = {
"อังกฤษเก่า",
42365,
"gmw-ang",
"Latn, Runr",
translit = {
Runr = "Runr-translit"
},
strip_diacritics = {
Latn = {
remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dotbelow,
from = {"[Ƿƿ]"},
to = {{
["Ƿ"] = "W", ["ƿ"] = "w",
}},
},
},
sort_key = {
Latn = {
remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dotbelow,
from = {"[æƀꝺðꝼᵹȝłœꞃꞅꞇþꝥꝧƿ]"},
to = {{
["æ"] = "ae", ["ƀ"] = "b", ["ꝺ"] = "d", ["ð"] = "d" .. p[1], ["ꝼ"] = "f",
["ᵹ"] = "g", ["ȝ"] = "g" .. p[1], ["ł"] = "l", ["œ"] = "oe", ["ꞃ"] = "r",
["ꞅ"] = "s", ["ꞇ"] = "t", ["þ"] = "t" .. p[1], ["ꝥ"] = "t" .. p[1],
["ꝧ"] = "t" .. p[1], ["ƿ"] = "w",
}},
},
},
standard_chars = {
Latn = "AaÆæBbCcDdÐðEeFfGgHhIiLlMmNnOoŒœPpRrSsTtÞþUuWwXxYy",
c.punc,
},
}
m["anh"] = {
"Nend",
6991554,
"ngf-wso",
"Latn",
}
m["ani"] = {
"Andi",
34849,
"cau-and",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = s["cau-Cyrl-displaytext"],
strip_diacritics = s["cau-Cyrl-stripdiacritics"],
}
m["anj"] = {
"Anor",
56458,
"paa-ram",
"Latn",
}
m["ank"] = {
"Goemai",
35272,
"cdc-wst",
"Latn",
}
m["anl"] = {
"Anu",
4777679,
"sit-mru",
"Latn",
}
m["anm"] = {
"Anāl",
56235,
"tbq-kuk",
"Latn",
}
m["ann"] = {
"Obolo",
36614,
"nic-lcr",
"Latn",
}
m["ano"] = {
"Andoque",
2669225,
"qfa-iso",
"Latn",
}
m["anp"] = {
"Angika",
28378,
"inc-bih",
"Deva, Kthi",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
},
}
m["anq"] = {
"Jarawa",
2475526,
"qfa-ong",
"Latn",
}
m["anr"] = {
"Andh",
4754314,
"inc-sou",
"Deva",
translit = "Deva-translit",
}
m["ans"] = {
"Anserma",
3446613,
"sai-chc",
"Latn",
}
m["ant"] = {
"Antakarinya",
921304,
"aus-psw",
"Latn",
}
m["anu"] = {
"Anuak",
56677,
"sdv-lon",
"Latn",
}
m["anv"] = {
"Denya",
35187,
"nic-mam",
"Latn",
}
m["anw"] = {
"Anaang",
2845320,
"nic-ief",
"Latn",
}
m["anx"] = {
"Andra-Hus",
2846195,
"poz-aay",
"Latn",
}
m["any"] = {
"Anyi",
28395,
"alv-ctn",
"Latn",
}
m["anz"] = {
"Anem",
56512,
"qfa-dis", -- Papuan; might be an isolate or in a putative West New Britain family
"Latn",
}
m["aoa"] = {
"Angolar",
34994,
"crp",
"Latn",
ancestors = "pt",
}
m["aob"] = {
"Abom",
3446647,
"paa-ani",
"Latn",
}
m["aoc"] = {
"Pemon",
10729616,
"sai-pem",
"Latn",
}
m["aod"] = {
"Andarum",
3507888,
"paa-ram",
"Latn",
}
m["aoe"] = {
"Angal Enen",
10951638,
"ngf-eng",
"Latn",
}
m["aof"] = {
"Bragat",
3507977,
"paa-tor",
"Latn",
}
m["aog"] = {
"Angoram",
56366, -- cf 6754745 for merged dialect
"paa-lsp",
"Latn",
}
m["aoi"] = {
"Anindilyakwa",
2714654,
"aus-arn",
"Latn",
}
m["aoj"] = {
"Mufian",
3507881,
"paa-tor",
"Latn",
}
m["aok"] = {
"Arhö",
4790086,
"poz-cln",
"Latn",
}
m["aol"] = {
"Alorese",
3332062,
"poz",
"Latn",
}
m["aom"] = {
"Ömie",
8078975,
"ngf-koi",
"Latn",
}
m["aon"] = {
"Bumbita Arapesh",
3508044,
"paa-tor",
"Latn",
}
m["aor"] = {
"Aore",
12627129,
"poz-vnn",
"Latn",
}
m["aos"] = {
"Taikat",
7676018,
"paa-brd",
"Latn",
}
m["aot"] = {
"อะตง", -- actual pronunciation
5646,
"tbq-bdg",
"Latn, Beng",
}
m["aou"] = {
"A'ou",
16109994,
"gio",
"Latn", -- also Hani?
}
m["aox"] = {
"Atorada",
3507932,
"awd",
"Latn",
}
m["aoz"] = {
"Uab Meto",
3441962,
"poz-tim",
"Latn",
}
m["apb"] = {
"Sa'a",
36294,
"poz-sls",
"Latn",
}
m["apc"] = {
"North Levantine Arabic",
22809485,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["apd"] = {
"อาหรับแบบซูดาน",
56573,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["ape"] = {
"Bukiyip",
3507895,
"paa-tor",
"Latn",
}
m["apf"] = {
"Pahanan Agta",
7135432,
"phi",
"Latn",
}
m["apg"] = {
"Ampanang",
4748035,
"poz",
"Latn",
}
m["aph"] = {
"Athpare",
3449126,
"sit-kie",
"Deva, Latn",
translit = {
Deva = "Deva-translit",
},
}
m["api"] = {
"Apiaká",
3507941,
"tup-gua",
"Latn",
}
m["apj"] = {
"Jicarilla",
28277,
"apa",
"Latn",
}
m["apk"] = {
"Plains Apache",
27861,
"apa",
"Latn",
}
m["apl"] = {
"Lipan",
28269,
"apa",
"Latn",
}
m["apm"] = {
"Chiricahua",
13368,
"apa",
"Latn",
}
m["apn"] = {
"Apinayé",
2858311,
"sai-nje",
"Latn",
}
m["apo"] = {
"Ambul",
12627135,
"poz-ocw",
"Latn",
}
m["app"] = {
"Apma",
2669188,
"poz-vnn",
"Latn",
}
m["apq"] = {
"A-Pucikwar",
28466,
"qfa-adc",
"Latn",
}
m["apr"] = {
"Arop-Lokep",
2863482,
"poz-ocw",
"Latn",
}
m["aps"] = {
"Arop-Sissano",
12627242,
"poz-ocw",
"Latn",
}
m["apt"] = {
"Apatani",
56306,
"sit-tan",
"Latn",
}
m["apu"] = {
"Apurinã",
2859081,
"awd",
"Latn",
}
m["apv"] = {
"Alapmunte",
16110782,
"sai-nmk",
"Latn",
}
m["apw"] = {
"อะแพชีตะวันตก",
28060,
"apa",
"Latn",
}
m["apx"] = {
"Aputai",
12473343,
"poz-tim",
"Latn",
}
m["apy"] = {
"Apalaí",
2736980,
"sai-gui",
"Latn",
}
m["apz"] = {
"Safeyoka",
7398693,
"ngf-ang",
"Latn",
}
m["aqc"] = {
"Archi",
34915,
"cau-lzg",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = s["cau-Cyrl-displaytext"],
strip_diacritics = s["cau-Cyrl-stripdiacritics"],
sort_key = {
from = {
"ккъӏв", "ххьӏв", -- 5 chars
"гъӏв", "ёоӏ", "ккъӏ", "ккъв", "къӏв", "ллъв", "ххьӏ", "хъӏв", "хьӏв", "ццӏв", "ччӏв", -- 4 chars
"ааӏ", "гӏв", "гъӏ", "гъв", "гьв", "ееӏ", "ёӏ", "ёо", "ииӏ", "кӏв", "ккв", "ккъ", "къӏ", "къв", "кьв", "лӏв", "ллъ", "лъв", "льв", "ооӏ", "пӏв", "ппв", "ссв", "тӏв", "ттв", "ууӏ", "хӏв", "ххв", "хъӏ", "хъв", "хьӏ", "цӏв", "ццӏ", "ццв", "чӏв", "ччӏ", "ээӏ", "юуӏ", "яаӏ", -- 3 chars
"аӏ", "аа", "гӏ", "гв", "гъ", "гь", "дв", "еӏ", "ее", "ё", "жв", "зв", "иӏ", "ии", "кӏ", "кв", "кк", "къ", "кь", "лӏ", "лв", "лъ", "ль", "оӏ", "оо", "пӏ", "пв", "пп", "св", "сс", "тӏ", "тв", "тт", "уӏ", "уу", "фв", "хӏ", "хв", "хх", "хъ", "цӏ", "цв", "цц", "чӏ", "чв", "шв", "щв", "эӏ", "ээ", "юӏ", "юу", "яӏ", "яа" -- 2 chars
},
to = {
"к" .. p[8], "х" .. p[7],
"г" .. p[6], "е" .. p[7], "к" .. p[7], "к" .. p[9], "к" .. p[12], "л" .. p[5], "х" .. p[6], "х" .. p[10], "х" .. p[13], "ц" .. p[6], "ч" .. p[5],
"а" .. p[3], "г" .. p[2], "г" .. p[5], "г" .. p[7], "г" .. p[9], "е" .. p[3], "е" .. p[5], "е" .. p[6], "и" .. p[3], "к" .. p[2], "к" .. p[5], "к" .. p[6], "к" .. p[11], "к" .. p[13], "к" .. p[15], "л" .. p[2], "л" .. p[4], "л" .. p[7], "л" .. p[9], "о" .. p[3], "п" .. p[2], "п" .. p[5], "с" .. p[3], "т" .. p[2], "т" .. p[5], "у" .. p[3], "х" .. p[2], "х" .. p[5], "х" .. p[9], "х" .. p[11], "х" .. p[12], "ц" .. p[2], "ц" .. p[5], "ц" .. p[7], "ч" .. p[2], "ч" .. p[4], "э" .. p[3], "ю" .. p[3], "я" .. p[3],
"а" .. p[1], "а" .. p[2], "г" .. p[1], "г" .. p[3], "г" .. p[4], "г" .. p[8], "д" .. p[1], "е" .. p[1], "е" .. p[2], "е" .. p[4], "ж" .. p[1], "з" .. p[1], "и" .. p[1], "и" .. p[2], "к" .. p[1], "к" .. p[3], "к" .. p[4], "к" .. p[10], "к" .. p[14], "л" .. p[1], "л" .. p[3], "л" .. p[6], "л" .. p[8], "о" .. p[1], "о" .. p[2], "п" .. p[1], "п" .. p[3], "п" .. p[4], "с" .. p[1], "с" .. p[2], "т" .. p[1], "т" .. p[3], "т" .. p[4], "у" .. p[1], "у" .. p[2], "ф" .. p[1], "х" .. p[1], "х" .. p[3], "х" .. p[4], "х" .. p[8], "ц" .. p[1], "ц" .. p[3], "ц" .. p[4], "ч" .. p[1], "ч" .. p[3], "ш" .. p[1], "щ" .. p[1], "э" .. p[1], "э" .. p[2], "ю" .. p[1], "ю" .. p[2], "я" .. p[1], "я" .. p[2]
}
},
}
m["aqd"] = {
"Ampari Dogon",
4748057,
"nic-dgw",
"Latn",
}
m["aqg"] = {
"Arigidi",
34829,
"alv-von",
"Latn",
}
m["aqm"] = {
"Atohwaim",
11732297,
"paa-kay",
"Latn",
}
m["aqn"] = {
"Northern Alta",
7058116,
"phi",
"Latn",
}
m["aqp"] = {
"Atakapa",
10975683,
"qfa-iso",
"Latn",
}
m["aqr"] = {
"Arhâ",
4790085,
"poz-cln",
"Latn",
}
m["aqt"] = {
"Angaité",
15736037,
"sai-mas",
"Latn",
}
m["aqz"] = {
"Akuntsu",
4701960,
"tup",
"Latn",
}
m["arc"] = {
"อารามายา", -- used instead of แอราเมอิก, which clashes with the name of the language family
28602,
"sem-ara",
"Hebr, Armi, Syrc, Palm, Nbat, Phnx, Mand, Samr, Hatr, Elym",
translit = {
Armi = "Armi-translit",
Palm = "Palm-translit",
},
strip_diacritics = {
-- The first three were added by [[User:Wikitiki89]] in 2015 for use with Syriac, which has diacritics that look
-- like a diaeresis (syāmē) and macrons above and below (mṭalqānā); see Wikipedia [[w:Syriac alphabet]]. But
-- I don't know if they are actually represented using these diacritics.
Syrc = {remove_diacritics = c.macron .. c.diaer .. c.macronbelow .. u(0x0730) .. "-" .. u(0x0748)},
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- Samr strip_diacritics, sort_key in [[Module:scripts/data]]; previously no sort_key for Samr, presumably a mistake
-- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["ard"] = {
"Arabana",
3507959,
"aus-kar",
"Latn",
}
m["are"] = {
"Western Arrernte",
12645549,
"aus-pam",
"Latn",
}
m["arh"] = {
"Arhuaco",
2640621,
"cba",
"Latn",
}
m["ari"] = {
"Arikara",
56539,
"cdd",
"Latn",
strip_diacritics = {remove_diacritics = c.acute},
}
m["arj"] = {
"Arapaso",
9627356,
"sai-tuc",
"Latn",
}
m["ark"] = {
"Arikapú",
3446640,
"sai-mje",
"Latn",
}
m["arl"] = {
"Arabela",
2591221,
"sai-zap",
"Latn",
}
m["arn"] = {
"Mapudungun",
33730,
"sai-ara",
"Latn",
}
m["aro"] = {
"Araona",
958414,
"sai-tac",
"Latn",
}
m["arp"] = {
"Arapaho",
56417,
"alg-ara",
"Latn",
}
m["arq"] = {
"อาหรับแบบแอลจีเรีย",
56499,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["arr"] = {
"Arara-Karo",
35539,
"tup",
"Latn",
}
m["ars"] = {
"Najdi Arabic",
56574,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["aru"] = {
"Arua",
2746221,
"auf",
"Latn",
}
m["arv"] = {
"Arbore",
56883,
"cus-eas",
"Latn",
}
m["arw"] = {
"โลโกโน",
2655664,
"awd-taa",
"Latn",
}
m["arx"] = {
"Aruá",
3507907,
"tup",
"Latn",
}
m["ary"] = {
"อาหรับแบบโมร็อกโก",
56426,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["arz"] = {
"อาหรับแบบอียิปต์",
29919,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["asa"] = {
"Pare",
36403,
"bnt-par",
"Latn",
}
m["asb"] = {
"Assiniboine",
2591288,
"sio-dkt",
"Latn",
}
m["asc"] = {
"Casuarina Coast Asmat",
11732046,
"ngf-ask",
"Latn",
}
m["ase"] = {
"มืออเมริกัน",
14759,
"sgn",
"Sgnw",
}
m["asf"] = {
"Auslan",
29525,
"sgn",
"Latn", -- when documented
}
m["asg"] = {
"Cishingini",
35199,
"nic-kam",
"Latn",
}
m["ash"] = {
"Abishira",
2871740,
"qfa-dis", -- extinct, poorly documented; isolate or in a proposed Tequiraca-Canichana family by Kaufman (1994)
"Latn",
}
m["asi"] = {
"Buruwai",
5001031,
"ngf-ask",
"Latn",
}
m["asj"] = {
"Nsari",
36418,
"nic-bbe",
"Latn",
}
m["ask"] = {
"Ashkun",
29379,
"nur-sou",
"Arab, Latn",
}
m["asl"] = {
"Asilulu",
12473347,
"poz-cma",
"Latn",
}
m["asn"] = {
"Xingú Asuriní",
8044571,
"tup-gua",
"Latn",
}
m["aso"] = {
"Dano",
5220979,
"ngf-kag",
"Latn",
}
m["asp"] = {
"Algerian Sign Language",
3135421,
"sgn",
}
m["asq"] = {
"Austrian Sign Language",
36668,
"sgn",
"Latn", -- when documented
}
m["asr"] = {
"Asuri",
3504321,
"mun",
"Latn", -- when documented
}
m["ass"] = {
"Ipulo",
35408,
"nic-tvc",
"Latn",
}
m["ast"] = {
"อัสตูเรียส",
29507,
"roa-asl",
"Latn",
}
m["asu"] = {
"Tocantins Asurini",
32041490,
"tup-gua",
"Latn",
}
m["asv"] = {
"Asoa",
56296,
"csu-maa",
"Latn",
}
m["asw"] = {
"Australian Aboriginal Sign Language",
955216,
"sgn",
"Latn", -- when documented
}
m["asx"] = {
"Muratayak",
11732766,
"ngf-fin",
"Latn",
}
m["asy"] = {
"Yaosakor Asmat",
16113158,
"ngf-ask",
"Latn",
}
m["asz"] = {
"As",
2866218,
"poz-hce",
"Latn",
}
m["ata"] = {
"Pele-Ata",
56511,
"qfa-dis", -- Papuan; possibly in a putative West New Britain family, or an isolate
"Latn",
}
m["atb"] = {
"Zaiwa",
56594,
"tbq-brm",
"Latn, Lisu", -- also Hani?
-- Lisu translit, sort_key in [[Module:scripts/data]]
}
m["atc"] = {
"Atsahuaca",
4817730,
"sai-pan",
"Latn",
}
m["atd"] = {
"Ata Manobo",
12627315,
"mno",
"Latn",
}
m["ate"] = {
"Atemble",
4813055,
"ngf-wso",
"Latn",
}
m["atg"] = {
"Okpela",
7082551,
"alv-yek",
"Latn",
}
m["ati"] = {
"Attié",
34844,
"alv-lag",
"Latn",
}
m["atj"] = {
"Atikamekw",
56590,
"alg",
"Latn",
ancestors = "cr",
}
m["atk"] = {
"Ati",
3217458,
"phi",
"Latn",
}
m["atl"] = {
"Mount Iraya Agta",
6921430,
"phi",
"Latn",
}
m["atm"] = {
"Ata",
4812603,
"phi",
"Latn",
}
m["ato"] = {
"Atong (Cameroon)",
34824,
"nic-grs",
"Latn",
}
m["atp"] = {
"Pudtol Atta",
12640726,
"phi",
"Latn",
}
m["atq"] = {
"Aralle-Tabulahan",
4783889,
"poz-ssw",
"Latn",
}
m["atr"] = {
"Waimiri-Atroari",
56865,
"sai-car",
"Latn",
}
m["ats"] = {
"Gros Ventre",
56628,
"alg-ara",
"Latn",
}
m["att"] = {
"Pamplona Atta",
12639245,
"phi",
"Latn",
}
m["atu"] = {
"Reel",
7306882,
"sdv-dnu",
"Latn",
}
m["atv"] = {
"อัลไตเหนือ",
2640863,
"trk-ssb",
"Cyrl",
translit = "Altai-translit",
}
m["atw"] = {
"Atsugewi",
56718,
"nai-pal",
"Latn",
}
m["atx"] = {
"Arutani",
56609,
nil,
"Latn",
}
m["aty"] = {
"อาเนตยูม",
2379113,
"poz-vns",
"Latn",
}
m["atz"] = {
"Arta",
3508067,
"phi",
"Latn",
}
m["aua"] = {
"Asumboa",
4811870,
"poz-tem",
"Latn",
}
m["aub"] = {
"Alugu",
12626798,
"tbq-urp",
"Latn", -- also Hani?
}
m["auc"] = {
"Huaorani",
758570,
"qfa-iso",
"Latn",
}
m["aud"] = {
"Anuta",
35326,
"poz-pnp",
"Latn",
}
m["aug"] = {
"Aguna",
34733,
"alv-gbe",
"Latn",
}
m["auh"] = {
"Aushi",
2872082,
"bnt-sbi",
"Latn",
}
m["aui"] = {
"Anuki",
3508132,
"poz-ocw",
"Latn",
}
m["auj"] = {
"Awjila",
56398,
"ber",
"Latn, Arab, Tfng",
}
m["auk"] = {
"Heyo",
3504295,
"paa-tor",
"Latn",
}
m["aul"] = {
"Aulua",
427300,
"poz-vnc",
"Latn",
}
m["aum"] = {
"Asu",
34798,
"alv-ngb",
"Latn",
}
m["aun"] = {
"Molmo One",
12637224,
"paa-tor",
"Latn",
}
m["auo"] = {
"Auyokawa",
56247,
"cdc-wst",
"Latn",
}
m["aup"] = {
"Makayam",
6738863,
"paa-ani",
"Latn",
}
m["auq"] = {
"Anus",
23855,
"poz-ocw",
"Latn",
}
m["aur"] = {
"Aruek",
3504279,
"paa-tor",
"Latn",
}
m["aut"] = {
"Austral",
2669261,
"poz-pep",
"Latn",
}
m["auu"] = {
"Auye",
4827334,
"ngf-pan",
"Latn",
}
m["auw"] = {
"Awyi",
3513326,
"paa-brd",
"Latn",
}
m["aux"] = {
"Aurá",
3507995,
"tup-gua",
"Latn",
}
m["auy"] = {
"Auyana",
2873211,
"ngf-kag",
"Latn",
}
m["auz"] = {
"อาหรับแบบอุซเบกิสถาน",
3399507,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["avb"] = {
"Avau",
12627412,
"poz-ocw",
"Latn",
}
m["avd"] = {
"Alviri-Vidari",
3327357,
"xme",
"fa-Arab",
ancestors = "xme-mid",
}
m["avi"] = {
"Avikam",
34840,
"alv-lag",
"Latn",
}
m["avk"] = {
"โคทาวา",
1377116,
"art",
"Latn",
type = "appendix-constructed",
}
m["avm"] = {
"Angkamuthi",
62603022,
"aus-pmn",
"Latn",
}
m["avn"] = {
"Avatime",
34796,
"alv-ktg",
"Latn",
}
m["avo"] = {
"Agavotaguerra",
3508007,
"awd",
"Latn",
}
m["avs"] = {
"Aushiri",
3409318,
"sai-zap",
"Latn",
}
m["avt"] = {
"Au",
3446608,
"paa-tor",
"Latn",
}
m["avu"] = {
"Avokaya",
56685,
"csu-mma",
"Latn",
}
m["avv"] = {
"Avá-Canoeiro",
4829584,
"tup-gua",
"Latn",
}
m["awa"] = {
"อวัธ",
29579,
"inc-hie",
"Deva, Kthi, fa-Arab",
ancestors = "inc-oaw",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
},
}
m["awb"] = {
"Awa (New Guinea)",
2874650,
"ngf-kag",
"Latn",
}
m["awc"] = {
"Cicipu",
35193,
"nic-kam",
"Latn",
}
m["awe"] = {
"Awetí",
4830038,
"tup",
"Latn",
}
m["awg"] = {
"Anguthimri",
4764288,
"aus-pam",
"Latn",
}
m["awh"] = {
"Awbono",
3446684,
"paa-baw",
"Latn",
}
m["awi"] = {
"Aekyom",
3399691,
"paa-kae",
"Latn",
}
m["awk"] = {
"Awabakal",
3449138,
"aus-pam",
"Latn",
}
m["awm"] = {
"Arawum",
4784537,
"ngf-rai",
"Latn",
}
m["awn"] = {
"Awngi",
34934,
"cus-cen",
"Ethi",
}
m["awo"] = {
"Awak",
3446643,
"alv-wjk",
"Latn",
}
m["awr"] = {
"Awera",
56379,
"paa-lkp",
"Latn",
}
m["aws"] = {
"South Awyu",
12633986,
"ngf-gaw",
"Latn",
}
m["awt"] = {
"Araweté",
4784535,
"tup-gua",
"Latn",
}
m["awu"] = {
"Central Awyu",
12628801,
"ngf-gaw",
"Latn",
}
m["awv"] = {
"Jair Awyu",
16110177,
"ngf-gaw",
"Latn",
}
m["aww"] = {
"Awun",
56369,
"paa-spk",
"Latn",
}
m["awx"] = {
"Awara",
2874670,
"ngf-fin",
"Latn",
}
m["awy"] = {
"Edera Awyu",
12630425,
"ngf-gaw",
"Latn",
}
m["axb"] = {
"Abipón",
11252539,
"sai-guc",
"Latn",
}
m["axe"] = {
"Ayerrerenge",
16112737,
"aus-pam",
"Latn",
}
m["axg"] = {
"Arára (Mato Grosso)",
3446660,
nil,
"Latn",
}
m["axk"] = {
"Aka (Central Africa)",
11010149,
"bnt-ngn",
"Latn",
}
m["axl"] = {
"Lower Southern Aranda",
6693295,
"aus-pam",
"Latn",
}
m["axm"] = {
"อาร์มีเนียกลาง",
4438498,
"hyx",
"Armn",
ancestors = "xcl",
-- Armn translit in [[Module:scripts/data]]
override_translit = true,
strip_diacritics = {
remove_diacritics = "՞՜՛՟",
from = {"եւ", "ՙ", "՚"},
to = {"և", "ʻ", "’"}
}
}
m["axx"] = {
"Xârâgurè",
8045635,
"poz-cln",
"Latn",
}
m["aya"] = {
"Awar",
56876,
"paa-ram",
"Latn",
}
m["ayb"] = {
"Ayizo",
34841,
"alv-pph",
"Latn",
}
m["ayd"] = {
"Ayabadhu",
3509164,
"aus-pmn",
"Latn",
}
m["aye"] = {
"Ayere",
34788,
"alv-aah",
"Latn",
}
m["ayg"] = {
"Nyanga (Togo)",
35446,
"alv-gng",
"Latn",
}
m["ayi"] = {
"Leyigha",
3914492,
"nic-uce",
"Latn",
}
m["ayk"] = {
"Akuku",
3450179,
"alv-nwd",
"Latn",
}
m["ayl"] = {
"อาหรับแบบลิเบีย",
56503,
"sem-arb",
"Arab",
strip_diacritics = "ar-stripdiacritics",
}
m["ayn"] = {
"อาหรับแบบเยเมน",
1686766,
"sem-arb",
"Arab, Hebr",
strip_diacritics = {
Arab = "ar-stripdiacritics",
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["ayo"] = {
"Ayoreo",
56634,
"sai-zam",
"Latn",
}
m["ayp"] = {
"อาหรับแบบเมโสโปเตเมียเหนือ",
56577,
"sem-arb",
"Arab",
ancestors = "acm",
strip_diacritics = "ar-stripdiacritics",
}
m["ayq"] = {
"Ayi",
56449,
"paa-spk",
"Latn",
}
m["ays"] = {
"Sorsogon Ayta",
7563752,
"phi",
"Latn",
}
m["ayt"] = {
"Bataan Ayta",
4921648,
"phi",
"Latn",
}
m["ayu"] = {
"Ayu",
34786,
"alv",
"Latn",
}
m["ayy"] = {
"Tayabas Ayta",
7689745,
"phi",
"Latn",
}
m["ayz"] = {
"Maybrat",
4830892,
"paa-mbr",
-- either an isolate; grouped with Abun and the West Bird's Head family; or in the putative West Papuan family
"Latn",
}
m["aza"] = {
"Azha",
4832486,
"tbq-axi",
"Latn",
}
m["azd"] = {
"Eastern Durango Nahuatl",
16115449,
"azc-dur",
"Latn",
}
m["azg"] = {
"San Pedro Amuzgos Amuzgo",
35092,
"omq",
"Latn",
}
m["azm"] = {
"Ipalapa Amuzgo",
12633013,
"omq",
"Latn",
}
m["azn"] = {
"Western Durango Nahuatl",
12645553,
"azc-dur",
"Latn",
}
m["azo"] = {
"Awing",
34856,
"nic-nge",
"Latn",
}
m["azt"] = {
"Faire Atta",
12630884,
"phi",
"Latn",
}
m["azz"] = {
"Highland Puebla Nahuatl",
12953754,
"azc-nah",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
7h8k7ctcb8t8byj5fd8ajlapo2lzllf
มอดูล:languages/data/2
828
36387
5720751
5719180
2026-04-21T07:00:45Z
OctraBot
3198
Bot: automatic text replacement (-standardChars +standard_chars)
5720751
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
-- Ideally, we want to move these into [[Module:languages/data]], but because (a) it's necessary to use require on that module, and (b) they're only used in this data module, it's less memory-efficient to do that at the moment. If it becomes possible to use mw.loadData, then these should be moved there.
s["de-Latn-sortkey"] = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove,
from = {"æ", "œ", "ß"},
to = {"ae", "oe", "ss"}
}
s["de-Latn-standardchars"] = "AaÄäBbCcDdEeFfGgHhIiJjKkLlMmNnOoÖöPpQqRrSsẞßTtUuÜüVvWwXxYyZz"
s["ka-stripdiacritics"] = {remove_diacritics = c.circ}
s["no-sortkey"] = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dacute .. c.caron .. c.cedilla,
remove_exceptions = {"å"},
from = {"æ", "ø", "å"},
to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]}
}
s["no-standardchars"] = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvYyÆæØøÅå" .. c.punc
s["tg-stripdiacritics"] = {remove_diacritics = c.grave .. c.acute}
s["tk-stripdiacritics"] = {remove_diacritics = c.macron}
local m = {}
m["aa"] = {
"อาฟาร์",
27811,
"cus-eas",
"Latn, Ethi",
strip_diacritics = {
Latn = {remove_diacritics = c.acute},
},
}
m["ab"] = {
"อับคาเซีย",
5111,
"cau-abz",
"Cyrl, Geor, Latn",
translit = {
Cyrl = "ab-translit",
-- Geor translit in [[Module:scripts/data]]
},
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = {
remove_diacritics = c.acute,
from = {"^а%-"},
to = {"а"},
},
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {
"х'ә", -- 3 chars
"гь", "гә", "ӷь", "ҕь", "ӷә", "ҕә", "дә", "ё", "жь", "жә", "ҙә", "ӡә", "ӡ'", "кь", "кә", "қь", "қә", "ҟь", "ҟә", "ҫә", "тә", "ҭә", "ф'", "хь", "хә", "х'", "ҳә", "ць", "цә", "ц'", "ҵә", "ҵ'", "шь", "шә", "џь", -- 2 chars
"ӷ", "ҕ", "ҙ", "ӡ", "қ", "ҟ", "ԥ", "ҧ", "ҫ", "ҭ", "ҳ", "ҵ", "ҷ", "ҽ", "ҿ", "ҩ", "џ", "ә", -- 1 char
"^а",
},
to = {
"х" .. p[4],
"г" .. p[1], "г" .. p[2], "г" .. p[5], "г" .. p[6], "г" .. p[7], "г" .. p[8], "д" .. p[1], "е" .. p[1], "ж" .. p[1], "ж" .. p[2], "з" .. p[2], "з" .. p[4], "з" .. p[5], "к" .. p[1], "к" .. p[2], "к" .. p[4], "к" .. p[5], "к" .. p[7], "к" .. p[8], "с" .. p[2], "т" .. p[1], "т" .. p[3], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[6], "ц" .. p[1], "ц" .. p[2], "ц" .. p[3], "ц" .. p[5], "ц" .. p[6], "ш" .. p[1], "ш" .. p[2], "ы" .. p[3],
"г" .. p[3], "г" .. p[4], "з" .. p[1], "з" .. p[3], "к" .. p[3], "к" .. p[6], "п" .. p[1], "п" .. p[2], "с" .. p[1], "т" .. p[2], "х" .. p[5], "ц" .. p[4], "ч" .. p[1], "ч" .. p[2], "ч" .. p[3], "ы" .. p[1], "ы" .. p[2], "ь" .. p[1],
"",
}
},
},
}
m["ae"] = {
"อเวสตะ",
29572,
"ira-cen",
"Avst, Gujr",
translit = {
Avst = "Avst-translit"
},
}
m["af"] = {
"อาฟรีกานส์",
14196,
"gmw-frk",
"Latn, Arab",
ancestors = "nl",
sort_key = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'",
from = {"['ʼ]n"},
to = {"n" .. p[1]}
}
},
}
m["ak"] = {
"อาคัน",
28026,
"alv-ctn",
"Latn",
}
m["am"] = {
"อัมฮารา",
28244,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["an"] = {
"อารากอน",
8765,
"roa-nar",
"Latn",
}
m["ar"] = {
"อาหรับ",
13955,
"sem-arb",
"Arab, Hebr, Syrc, Brai, Nbat",
translit = {
Arab = "ar-translit"
},
strip_diacritics = {
Arab = "ar-stripdiacritics",
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["as"] = {
"อัสสัม",
29401,
"inc-bas",
"as-Beng",
ancestors = "inc-mas",
translit = "Beng-translit",
}
m["av"] = {
"อะวาร์",
29561,
"cau-ava",
"Cyrl, Latn, Arab",
ancestors = "oav",
translit = {
Cyrl = "cau-nec-translit",
Arab = "ar-translit",
},
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"],
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {"гъ", "гь", "гӏ", "ё", "кк", "къ", "кь", "кӏ", "лъ", "лӏ", "тӏ", "хх", "хъ", "хь", "хӏ", "цӏ", "чӏ"},
to = {"г" .. p[1], "г" .. p[2], "г" .. p[3], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "к" .. p[4], "л" .. p[1], "л" .. p[2], "т" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[4], "ц" .. p[1], "ч" .. p[1]}
},
},
}
m["ay"] = {
"ไอย์มารา",
4627,
"sai-aym",
"Latn",
}
m["az"] = {
"อาเซอร์ไบจาน",
9292,
"trk-ogz",
"Latn, Cyrl, fa-Arab",
ancestors = "trk-oat",
dotted_dotless_i = true,
strip_diacritics = {
Latn = {
from = {"ʼ"},
to = {"'"},
},
["fa-Arab"] = {
module = "ar-stripdiacritics",
["from"] = {
"ۆ",
"ۇ",
"وْ",
"ڲ",
"ؽ",
},
["to"] = {
"و",
"و",
"و",
"گ",
"ی",
},
},
},
display_text = {
Latn = {
from = {"'"},
to = {"ʼ"}
}
},
sort_key = {
Latn = {
from = {
"i", -- Ensure "i" comes after "ı".
"ç", "ə", "ğ", "x", "ı", "q", "ö", "ş", "ü", "w"
},
to = {
"i" .. p[1],
"c" .. p[1], "e" .. p[1], "g" .. p[1], "h" .. p[1], "i", "k" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1], "z" .. p[1]
}
},
Cyrl = {
from = {"ғ", "ә", "ы", "ј", "ҝ", "ө", "ү", "һ", "ҹ"},
to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "и" .. p[2], "к" .. p[1], "о" .. p[1], "у" .. p[1], "х" .. p[1], "ч" .. p[1]}
},
},
}
m["ba"] = {
"แบชเคียร์",
13389,
"trk-kbu",
"Cyrl",
translit = "ba-translit",
override_translit = true,
sort_key = {
from = {"ғ", "ҙ", "ё", "ҡ", "ң", "ө", "ҫ", "ү", "һ", "ә"},
to = {"г" .. p[1], "д" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "с" .. p[1], "у" .. p[1], "х" .. p[1], "э" .. p[1]}
},
}
m["be"] = {
"เบลารุส",
9091,
"zle",
"Cyrl, Latn",
ancestors = "zle-mbe",
translit = {
Cyrl = "be-translit-Thai",
},
strip_diacritics = {
Cyrl = {
remove_diacritics = c.grave .. c.acute,
},
Latn = {
remove_diacritics = c.grave .. c.acute,
remove_exceptions = {"Ć", "ć", "Ń", "ń", "Ś", "ś", "Ź", "ź"},
},
},
sort_key = {
Cyrl = {
remove_diacritics = c.grave .. c.acute,
from = {"ґ", "ё", "і", "ў"},
to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "у" .. p[1]}
},
Latn = {
remove_diacritics = c.grave .. c.acute,
remove_exceptions = {"Ć", "ć", "Ń", "ń", "Ś", "ś", "Ź", "ź"},
from = {"ć", "č", "dz", "dź", "dž", "ch", "ł", "ń", "ś", "š", "ŭ", "ź", "ž"},
to = {"c" .. p[1], "c" .. p[2], "d" .. p[1], "d" .. p[2], "d" .. p[3], "h" .. p[1], "l" .. p[1], "n" .. p[1], "s" .. p[1], "s" .. p[2], "u" .. p[1], "z" .. p[1], "z" .. p[2]}
},
},
standard_chars = {
Cyrl = "АаБбВвГгДдЕеЁёЖжЗзІіЙйКкЛлМмНнОоПпРрСсТтУуЎўФфХхЦцЧчШшЫыЬьЭэЮюЯя",
Latn = "AaBbCcĆćČčDdEeFfGgHhIiJjKkLlŁłMmNnŃńOoPpRrSsŚśŠšTtUuŬŭVvYyZzŹźŽž",
(c.punc:gsub("'", "")) -- Exclude apostrophe.
},
}
m["bg"] = {
"บัลแกเรีย",
7918,
"zls",
"Cyrl",
ancestors = "cu-bgm",
translit = "bg-translit",
strip_diacritics = {
remove_diacritics = c.grave .. c.acute,
remove_exceptions = {"%f[^%z%s]ѝ%f[%z%s]"},
},
sort_key = {
remove_diacritics = c.grave .. c.acute,
remove_exceptions = {"%f[^%z%s]ѝ%f[%z%s]"},
},
standard_chars = "АаБбВвГгДдЕеЖжЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЪъЬьЮюЯя" .. c.punc,
}
m["bh"] = {
"พิหาร",
135305,
"inc-eas",
"Deva",
translit = "Deva-translit",
}
m["bi"] = {
"บิสลามา",
35452,
"crp",
"Latn",
ancestors = "en",
}
m["bm"] = {
"บัมบารา",
33243,
"dmn-emn",
"Latn, Nkoo",
sort_key = {
Latn = {
from = {"ɛ", "ɲ", "ŋ", "ɔ"},
to = {"e" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1]}
},
},
}
m["bn"] = {
"เบงกอล",
9610,
"inc-bas",
"Beng, Newa",
ancestors = "inc-mbn",
translit = {
Beng = "Beng-translit"
},
}
m["bo"] = {
"ทิเบต",
34271,
"sit-tib",
"Tibt", -- sometimes Deva?
ancestors = "xct",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["br"] = {
"เบรอตง",
12107,
"cel-brs",
"Latn",
ancestors = "xbm",
sort_key = {
from = {"ch", "c['ʼ’]h"},
to = {"c" .. p[1], "c" .. p[2]}
},
}
m["ca"] = {
"กาตาลา",
7026,
"roa-ocr",
"Latn",
ancestors = "roa-oca",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla .. "·"},
standard_chars = "AaÀàBbCcÇçDdEeÉéÈèFfGgHhIiÍíÏïJjLlMmNnOoÓóÒòPpQqRrSsTtUuÚúÜüVvXxYyZz·" .. c.punc,
}
m["ce"] = {
"เชเชน",
33350,
"cau-vay",
"Cyrl, Latn, Arab",
translit = {
Cyrl = "cau-nec-translit",
Arab = "ar-translit",
},
override_translit = true,
display_text = {
Cyrl = s["cau-Cyrl-displaytext"]
},
strip_diacritics = {
Cyrl = s["cau-Cyrl-stripdiacritics"],
Latn = s["cau-Latn-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {"аь", "гӏ", "ё", "кх", "къ", "кӏ", "оь", "пӏ", "тӏ", "уь", "хь", "хӏ", "цӏ", "чӏ", "юь", "яь"},
to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "о" .. p[1], "п" .. p[1], "т" .. p[1], "у" .. p[1], "х" .. p[1], "х" .. p[2], "ц" .. p[1], "ч" .. p[1], "ю" .. p[1], "я" .. p[1]}
},
},
}
m["ch"] = {
"ชามอร์โร",
33262,
"poz",
"Latn",
sort_key = {
remove_diacritics = "'",
from = {"å", "ch", "ñ", "ng"},
to = {"a" .. p[1], "c" .. p[1], "n" .. p[1], "n" .. p[2]}
},
}
m["co"] = {
"คอร์ซิกา",
33111,
"roa-itr",
"Latn",
sort_key = {
from = {"chj", "ghj", "sc", "sg"},
to = {"c" .. p[1], "g" .. p[1], "s" .. p[1], "s" .. p[2]}
},
standard_chars = "AaÀàBbCcDdEeÈèFfGgHhIiÌìÏïJjLlMmNnOoÒòPpQqRrSsTtUuÙùÜüVvZz" .. c.punc,
}
m["cr"] = {
"ครี",
33390,
"alg",
"Latn, Cans",
translit = {
Cans = "cr-translit"
},
}
m["cs"] = {
"เช็ก",
9056,
"zlw",
"Latn",
ancestors = "cs-ear",
sort_key = {
from = {"á", "č", "ď", "é", "ě", "ch", "í", "ň", "ó", "ř", "š", "ť", "ú", "ů", "ý", "ž"},
to = {"a" .. p[1], "c" .. p[1], "d" .. p[1], "e" .. p[1], "e" .. p[2], "h" .. p[1], "i" .. p[1], "n" .. p[1], "o" .. p[1], "r" .. p[1], "s" .. p[1], "t" .. p[1], "u" .. p[1], "u" .. p[2], "y" .. p[1], "z" .. p[1]}
},
standard_chars = "AaÁáBbCcČčDdĎďEeÉéĚěFfGgHhIiÍíJjKkLlMmNnŇňOoÓóPpRrŘřSsŠšTtŤťUuÚúŮůVvYyÝýZzŽž" .. c.punc,
}
m["cu"] = {
"สลาวอนิกคริสตจักรเก่า",
35499,
"zls",
"Cyrs, Glag, Zname",
translit = {
Cyrs = "Cyrs-translit",
Glag = "Glag-translit"
},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cv"] = {
"ชูวัช",
33348,
"trk-ogr",
"Cyrl",
ancestors = "cv-mid",
translit = "cv-translit",
override_translit = true,
sort_key = {
from = {"ӑ", "ё", "ӗ", "ҫ", "ӳ"},
to = {"а" .. p[1], "е" .. p[1], "е" .. p[2], "с" .. p[1], "у" .. p[1]}
},
}
m["cy"] = {
"เวลส์",
9309,
"cel-brw",
"Latn",
ancestors = "wlm",
sort_key = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. "'",
from = {"ch", "dd", "ff", "ng", "ll", "ph", "rh", "th"},
to = {"c" .. p[1], "d" .. p[1], "f" .. p[1], "g" .. p[1], "l" .. p[1], "p" .. p[1], "r" .. p[1], "t" .. p[1]}
},
standard_chars = "ÂâAaBbCcDdEeÊêFfGgHhIiÎîLlMmNnOoÔôPpRrSsTtUuÛûWwŴŵYyŶŷ" .. c.punc,
}
m["da"] = {
"เดนมาร์ก",
9035,
"gmq-eas",
"Latn",
ancestors = "gmq-oda",
sort_key = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dacute .. c.caron .. c.cedilla,
remove_exceptions = {"å"},
from = {"æ", "ø", "å"},
to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]}
},
standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvYyÆæØøÅå" .. c.punc,
}
m["de"] = {
"เยอรมัน",
188,
"gmw-hgm",
"Latn, Latf, Brai",
ancestors = "de-ear",
sort_key = {
Latn = s["de-Latn-sortkey"],
Latf = s["de-Latn-sortkey"],
},
standard_chars = {
Latn = s["de-Latn-standardchars"],
Latf = s["de-Latn-standardchars"],
Brai = c.braille,
c.punc
}
}
m["dv"] = {
"มัลดีฟส์",
32656,
"inc-ins",
"Thaa, Diak",
translit = {
Thaa = "Thaa-translit",
Diak = "Diak-translit",
},
override_translit = true,
}
m["dz"] = {
"ซองคา",
33081,
"sit-tib",
"Tibt",
ancestors = "xct",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["ee"] = {
"เอเว",
30005,
"alv-gbe",
"Latn",
sort_key = {
remove_diacritics = c.tilde,
from = {"ɖ", "dz", "ɛ", "ƒ", "gb", "ɣ", "kp", "ny", "ŋ", "ɔ", "ts", "ʋ"},
to = {"d" .. p[1], "d" .. p[2], "e" .. p[1], "f" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "t" .. p[1], "v" .. p[1]}
},
}
m["el"] = {
"กรีก",
9129,
"grk",
"Grek, Polyt, Brai",
ancestors = "el-kth",
translit = "el-translit",
override_translit = true,
-- Grek and Polyt display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
standard_chars = {
Grek = "΅·ͺ΄ΑαΆάΒβΓγΔδΕεέΈΖζΗηΉήΘθΙιΊίΪϊΐΚκΛλΜμΝνΞξΟοΌόΠπΡρΣσςΤτΥυΎύΫϋΰΦφΧχΨψΩωΏώ",
Brai = c.braille,
c.punc
},
}
m["en"] = {
"อังกฤษ",
1860,
"gmw-ang",
"Latn, Brai, Shaw, Dsrt", -- entries in Shaw or Dsrt might require prior discussion
wikimedia_codes = "en, simple",
ancestors = "en-ear",
sort_key = {
Latn = {
-- Many of these are needed for sorting language names.
remove_diacritics = "'\"%-%.,%s·ʻʼ" .. c.diacritics,
-- These are found in pagenames.
from = {"[ɒæ🅱¢©ᴄðđəǝɜɡħʜıɨłŋɲøɔœꝑꝓꝕßʋ]"},
to = {{
["ɒ"] = "a", ["æ"] = "ae", ["🅱"] = "b", ["¢"] = "c", ["©"] = "c",
["ᴄ"] = "c", ["ð"] = "d", ["đ"] = "d", ["ə"] = "e", ["ǝ"] = "e",
["ɜ"] = "e", ["ɡ"] = "g", ["ħ"] = "h", ["ʜ"] = "h", ["ı"] = "i",
["ɨ"] = "i", ["ł"] = "l", ["ŋ"] = "n", ["ɲ"] = "n", ["ø"] = "o",
["ɔ"] = "o", ["œ"] = "oe", ["ꝑ"] = "p", ["ꝓ"] = "p", ["ꝕ"] = "p",
["ß"] = "ss", ["ʋ"] = "v",
}},
},
},
standard_chars = {
Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
Brai = c.braille,
c.punc
},
}
m["eo"] = {
"เอสเปรันโต",
143,
"art",
"Latn",
--translit = "eo-translit", -- already handled in Module:headword & Module:links
sort_key = {
remove_diacritics = c.grave .. c.acute,
from = {"ĉ", "ĝ", "ĥ", "ĵ", "ŝ", "ŭ"},
to = {"c" .. p[1], "g" .. p[1], "h" .. p[1], "j" .. p[1], "s" .. p[1], "u" .. p[1]}
},
standard_chars = "AaBbCcĈĉDdEeFfGgĜĝHhĤĥIiJjĴĵKkLlMmNnOoPpRrSsŜŝTtUuŬŭVvZz" .. c.punc,
}
m["es"] = {
"สเปน",
1321,
"roa-cas",
"Latn, Brai",
ancestors = "es-ear",
--translit = "es-translit", -- already handled in Module:headword & Module:links
sort_key = {
Latn = {
remove_exceptions = {"ñ"},
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.diaer .. c.cedilla,
from = {"ª", "æ", "ñ", "º", "œ"},
to = {"a", "ae", "n" .. p[1], "o", "oe"}
},
},
standard_chars = {
Latn = "AaÁáBbCcDdEeÉéFfGgHhIiÍíJjLlMmNnÑñOoÓóPpQqRrSsTtUuÚúÜüVvXxYyZz",
Brai = c.braille,
c.punc
},
}
m["et"] = {
"เอสโตเนีย",
9072,
"urj-fin",
"Latn",
sort_key = {
from = {
"š", "ž", "õ", "ä", "ö", "ü", -- 2 chars
"z" -- 1 char
},
to = {
"s" .. p[1], "s" .. p[3], "w" .. p[1], "w" .. p[2], "w" .. p[3], "w" .. p[4],
"s" .. p[2]
}
},
standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvÕõÄäÖöÜü" .. c.punc,
}
m["eu"] = {
"บาสก์",
8752,
"euq",
"Latn",
sort_key = {
from = {"ç", "ñ"},
to = {"c" .. p[1], "n" .. p[1]}
},
standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnÑñOoPpRrSsTtUuXxZz" .. c.punc,
}
m["fa"] = {
"เปอร์เซีย",
9168,
"ira-swi",
"fa-Arab, Hebr",
ancestors = "fa-cls",
strip_diacritics = {
["fa-Arab"] = {
-- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif
from = {"هٔ", "ٱ"}, -- character "ۂ" code U+06C2 to "ه"; hamzatu l-waṣli to a regular alif
to = {"ه", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
},
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["ff"] = {
"ฟูลา",
33454,
"alv-fwo",
"Latn, Adlm",
}
m["fi"] = {
"ฟินแลนด์",
1412,
"urj-fin",
"Latn",
display_text = {
from = {"'"},
to = {"’"}
},
strip_diacritics = {
remove_diacritics = "ˣ", -- "ˣ" is used to indicate gemination of the next consonant
from = {"’"},
to = {"'"},
},
sort_key = { -- [[Appendix:Finnish alphabet#Collation]] + "aͤ" and "oͤ" as historical variants of "ä" and "ö".
remove_diacritics = "'’:" .. c.diacritics,
remove_exceptions = {
"a[" .. c.ringabove .. c.diaer .. c.small_e .. "]", -- åäaͤ
"o[" .. c.diaer .. c.tilde .. c.dacute .. c.small_e .. "]", -- öõőoͤ
"u[" .. c.diaer .. c.dacute .. "]" -- üű
},
from = {"æ", "[ðđ]", "ł", "ŋ", "œ", "ß", "þ", "u[" .. c.diaer .. c.dacute .. "]", "å", "aͤ", "o[" .. c.tilde .. c.dacute .. c.small_e .. "]", "ø", "(.)['%-]"},
to = {"ae", "d", "l", "n", "oe", "ss", "th", "y", "z" .. p[1], "ä", "ö", "ö", "%1"}
},
standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvYyÄäÖö" .. c.punc,
}
m["fj"] = {
"ฟีจี",
33295,
"poz-pcc",
"Latn",
}
m["fo"] = {
"แฟโร",
25258,
"gmq-ins",
"Latn",
sort_key = {
from = {"á", "ð", "í", "ó", "ú", "ý", "æ", "ø"},
to = {"a" .. p[1], "d" .. p[1], "i" .. p[1], "o" .. p[1], "u" .. p[1], "y" .. p[1], "z" .. p[1], "z" .. p[2]}
},
standard_chars = "AaÁáBbDdÐðEeFfGgHhIiÍíJjKkLlMmNnOoÓóPpRrSsTtUuÚúVvYyÝýÆæØø" .. c.punc,
}
m["fr"] = {
"ฝรั่งเศส",
150,
"roa-oil",
"Latn, Brai",
ancestors = "frm",
sort_key = {
Latn = s["roa-oil-sortkey"]
},
standard_chars = {
Latn = "AaÀàÂâBbCcÇçDdEeÉéÈèÊêËëFfGgHhIiÎîÏïJjLlMmNnOoÔôŒœPpQqRrSsTtUuÙùÛûÜüVvXxYyZz",
Brai = c.braille,
c.punc
},
}
m["fy"] = {
"ฟรีเชียตะวันตก",
27175,
"gmw-fri",
"Latn",
sort_key = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer,
from = {"y"},
to = {"i"}
},
standard_chars = "AaâäàÆæBbCcDdEeéêëèFfGgHhIiïìYyỳJjKkLlMmNnOoôöòPpRrSsTtUuúûüùVvWwZz" .. c.punc,
}
m["ga"] = {
"ไอริช",
9142,
"cel-gae",
"Latn, Latg",
ancestors = "mga",
sort_key = {
remove_diacritics = c.acute,
from = {"ḃ", "ċ", "ḋ", "ḟ", "ġ", "ṁ", "ṗ", "ṡ", "ṫ"},
to = {"bh", "ch", "dh", "fh", "gh", "mh", "ph", "sh", "th"}
},
standard_chars = "AaÁáBbCcDdEeÉéFfGgHhIiÍíLlMmNnOoÓóPpRrSsTtUuÚúVv" .. c.punc,
}
m["gd"] = {
"แกลิกแบบสกอตแลนด์",
9314,
"cel-gae",
"Latn, Latg",
ancestors = "mga",
sort_key = {remove_diacritics = c.grave .. c.acute},
standard_chars = "AaÀàBbCcDdEeÈèFfGgHhIiÌìLlMmNnOoÒòPpRrSsTtUuÙù" .. c.punc,
}
m["gl"] = {
"กาลิเซีย",
9307,
"roa-gap",
"Latn",
sort_key = {
remove_diacritics = c.acute,
from = {"ñ"},
to = {"n" .. p[1]}
},
standard_chars = "AaÁáBbCcDdEeÉéFfGgHhIiÍíÏïLlMmNnÑñOoÓóPpQqRrSsTtUuÚúÜüVvXxZz" .. c.punc,
}
m["gu"] = {
"คุชราต",
5137,
"inc-wes",
"Arab, Gujr",
ancestors = "inc-mgu",
translit = {
Gujr = "Gujr-translit",
},
strip_diacritics = {
Arab = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun},
Gujr = {remove_diacritics = "઼"},
},
}
m["gv"] = {
"แมงซ์",
12175,
"cel-gae",
"Latn",
ancestors = "mga",
sort_key = {remove_diacritics = c.cedilla .. "-"},
standard_chars = "AaBbCcÇçDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwYy" .. c.punc,
}
m["ha"] = {
"เฮาซา",
56475,
"cdc-wst",
"Latn, Arab",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron}
},
sort_key = {
Latn = {
from = {"ɓ", "b'", "ɗ", "d'", "ƙ", "k'", "sh", "ƴ", "'y"},
to = {"b" .. p[1], "b" .. p[2], "d" .. p[1], "d" .. p[2], "k" .. p[1], "k" .. p[2], "s" .. p[1], "y" .. p[1], "y" .. p[2]}
},
},
}
m["he"] = {
"ฮีบรู",
9288,
"sem-can",
"Hebr, Phnx, Brai, Samr",
ancestors = "he-med",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- Samr strip_diacritics, sort_key in [[Module:scripts/data]]
-- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["hi"] = {
"ฮินดี",
1568,
"inc-hnd",
"Deva, Kthi, Newa",
translit = {
Deva = "Deva-translit",
Kthi = "Kthi-translit",
Newa = "Newa-translit",
},
standard_chars = {
Deva = "अआइईउऊएऐओऔकखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसहत्रज्ञक्षक़ख़ग़ज़झ़ड़ढ़फ़काखागाघाङाचाछाजाझाञाटाठाडाढाणाताथादाधानापाफाबाभामायारालावाशाषासाहात्राज्ञाक्षाक़ाख़ाग़ाज़ाझ़ाड़ाढ़ाफ़ाकिखिगिघिङिचिछिजिझिञिटिठिडिढिणितिथिदिधिनिपिफिबिभिमियिरिलिविशिषिसिहित्रिज्ञिक्षिक़िख़िग़िज़िझ़िड़िढ़िफ़िकीखीगीघीङीचीछीजीझीञीटीठीडीढीणीतीथीदीधीनीपीफीबीभीमीयीरीलीवीशीषीसीहीत्रीज्ञीक्षीक़ीख़ीग़ीज़ीझ़ीड़ीढ़ीफ़ीकुखुगुघुङुचुछुजुझुञुटुठुडुढुणुतुथुदुधुनुपुफुबुभुमुयुरुलुवुशुषुसुहुत्रुज्ञुक्षुक़ुख़ुग़ुज़ुझ़ुड़ुढ़ुफ़ुकूखूगूघूङूचूछूजूझूञूटूठूडूढूणूतूथूदूधूनूपूफूबूभूमूयूरूलूवूशूषूसूहूत्रूज्ञूक्षूक़ूख़ूग़ूज़ूझ़ूड़ूढ़ूफ़ूकेखेगेघेङेचेछेजेझेञेटेठेडेढेणेतेथेदेधेनेपेफेबेभेमेयेरेलेवेशेषेसेहेत्रेज्ञेक्षेक़ेख़ेग़ेज़ेझ़ेड़ेढ़ेफ़ेकैखैगैघैङैचैछैजैझैञैटैठैडैढैणैतैथैदैधैनैपैफैबैभैमैयैरैलैवैशैषैसैहैत्रैज्ञैक्षैक़ैख़ैग़ैज़ैझ़ैड़ैढ़ैफ़ैकोखोगोघोङोचोछोजोझोञोटोठोडोढोणोतोथोदोधोनोपोफोबोभोमोयोरोलोवोशोषोसोहोत्रोज्ञोक्षोक़ोख़ोग़ोज़ोझ़ोड़ोढ़ोफ़ोकौखौगौघौङौचौछौजौझौञौटौठौडौढौणौतौथौदौधौनौपौफौबौभौमौयौरौलौवौशौषौसौहौत्रौज्ञौक्षौक़ौख़ौग़ौज़ौझ़ौड़ौढ़ौफ़ौक्ख्ग्घ्ङ्च्छ्ज्झ्ञ्ट्ठ्ड्ढ्ण्त्थ्द्ध्न्प्फ्ब्भ्म्य्र्ल्व्श्ष्स्ह्त्र्ज्ञ्क्ष्क़्ख़्ग़्ज़्झ़्ड़्ढ़्फ़्।॥०१२३४५६७८९॰",
c.punc
},
}
m["ho"] = {
"ฮีรีโมตู",
33617,
"crp",
"Latn",
ancestors = "meu",
}
m["ht"] = {
"ครีโอลเฮติ",
33491,
"crp",
"Latn",
ancestors = "ht-sdm",
sort_key = {
from = {
"oun", -- 3 chars
"an", "ch", "è", "en", "ng", "ò", "on", "ou", "ui" -- 2 chars
},
to = {
"o" .. p[4],
"a" .. p[1], "c" .. p[1], "e" .. p[1], "e" .. p[2], "n" .. p[1], "o" .. p[1], "o" .. p[2], "o" .. p[3], "u" .. p[1]
}
},
}
m["hu"] = {
"ฮังการี",
9067,
"urj-ugr",
"Latn, Hung",
ancestors = "ohu",
sort_key = {
Latn = {
from = {
"dzs", -- 3 chars
"á", "cs", "dz", "é", "gy", "í", "ly", "ny", "ó", "ö", "ő", "sz", "ty", "ú", "ü", "ű", "zs", -- 2 chars
},
to = {
"d" .. p[2],
"a" .. p[1], "c" .. p[1], "d" .. p[1], "e" .. p[1], "g" .. p[1], "i" .. p[1], "l" .. p[1], "n" .. p[1], "o" .. p[1], "o" .. p[2], "o" .. p[3], "s" .. p[1], "t" .. p[1], "u" .. p[1], "u" .. p[2], "u" .. p[3], "z" .. p[1],
}
},
},
standard_chars = {
Latn = "AaÁáBbCcDdEeÉéFfGgHhIiÍíJjKkLlMmNnOoÓóÖöŐőPpQqRrSsTtUuÚúÜüŰűVvWwXxYyZz",
c.punc
},
}
m["hy"] = {
"อาร์มีเนีย",
8785,
"hyx",
"Armn, Brai",
ancestors = "axm",
-- Armn translit in [[Module:scripts/data]]
override_translit = true,
strip_diacritics = {
Armn = {
remove_diacritics = "՛՜՞՟",
from = {"եւ", "<sup>յ</sup>", "<sup>ի</sup>", "<sup>է</sup>", "յ̵", "ՙ", "՚"},
to = {"և", "յ", "ի", "է", "ֈ", "ʻ", "’"}
},
},
sort_key = {
Armn = {
from = {
"ու", "եւ", -- 2 chars
"և" -- 1 char
},
to = {
"ւ", "եվ",
"եվ"
}
},
},
}
m["hz"] = {
"เฮเรโร",
33315,
"bnt-swb",
"Latn",
}
m["ia"] = {
"อินเทอร์ลิงกวา",
35934,
"art",
"Latn",
}
m["id"] = {
"อินโดนีเซีย",
9240,
"poz-mly",
"Latn",
ancestors = "ms",
standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz" .. c.punc,
}
m["ie"] = {
"อินเทร์ลิงเกว",
35850,
"art",
"Latn",
type = "appendix-constructed",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ},
}
m["ig"] = {
"อิกโบ",
33578,
"alv-igb",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron},
sort_key = {
from = {"gb", "gh", "gw", "ị", "kp", "kw", "ṅ", "nw", "ny", "ọ", "sh", "ụ"},
to = {"g" .. p[1], "g" .. p[2], "g" .. p[3], "i" .. p[1], "k" .. p[1], "k" .. p[2], "n" .. p[1], "n" .. p[2], "n" .. p[3], "o" .. p[1], "s" .. p[1], "u" .. p[1]}
},
}
m["ii"] = {
"นอซู",
34235,
"tbq-nlo",
"Yiii",
translit = "ii-translit",
}
m["ik"] = {
"Inupiaq",
27183,
"esx-inu",
"Latn",
sort_key = {
from = {
"ch", "ġ", "dj", "ḷ", "ł̣", "ñ", "ng", "r̂", "sr", "zr", -- 2 chars
"ł", "ŋ", "ʼ" -- 1 char
},
to = {
"c" .. p[1], "g" .. p[1], "h" .. p[1], "l" .. p[1], "l" .. p[3], "n" .. p[1], "n" .. p[2], "r" .. p[1], "s" .. p[1], "z" .. p[1],
"l" .. p[2], "n" .. p[2], "z" .. p[2]
}
},
}
m["io"] = {
"อีโด",
35224,
"art",
"Latn",
}
m["is"] = {
"ไอซ์แลนด์",
294,
"gmq-ins",
"Latn",
sort_key = {
from = {"á", "ð", "é", "í", "ó", "ú", "ý", "þ", "æ", "ö"},
to = {"a" .. p[1], "d" .. p[1], "e" .. p[1], "i" .. p[1], "o" .. p[1], "u" .. p[1], "y" .. p[1], "z" .. p[1], "z" .. p[2], "z" .. p[3]}
},
standard_chars = "AaÁáBbDdÐðEeÉéFfGgHhIiÍíJjKkLlMmNnOoÓóPpRrSsTtUuÚúVvXxYyÝýÞþÆæÖö" .. c.punc,
}
m["it"] = {
"อิตาลี",
652,
"roa-itr",
"Latn",
ancestors = "roa-oit",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove},
standard_chars = "AaÀàBbCcDdEeÈèÉéFfGgHhIiÌìLlMmNnOoÒòPpQqRrSsTtUuÙùVvZz" .. c.punc,
}
m["iu"] = {
"อินุกติตุต",
29921,
"esx-inu",
"Cans, Latn",
translit = {
Cans = "cr-translit"
},
override_translit = true,
}
m["ja"] = {
"ญี่ปุ่น",
5287,
"jpx",
"Jpan, Latn, Brai",
ancestors = "ja-ear",
translit = s["jpx-translit"],
link_tr = true,
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["jv"] = {
"ชวา",
33549,
"poz",
"Latn, Java, Arab",
ancestors = "kaw",
translit = {
Java = "Java-translit"
},
link_tr = true,
strip_diacritics = {
Latn = {remove_diacritics = c.circ} -- Modern Javanese does not use ê
},
sort_key = {
Latn = {
from = {"å", "dh", "é", "è", "ng", "ny", "th"},
to = {"a" .. p[1], "d" .. p[1], "e" .. p[1], "e" .. p[2], "n" .. p[1], "n" .. p[2], "t" .. p[1]}
},
},
}
m["ka"] = {
"จอร์เจีย",
8108,
"ccs-gzn",
"Geor, Geok, Hebr", -- Hebr is used to write Judeo-Georgian
ancestors = "ka-mid",
-- Geor, Geok translit in [[Module:scripts/data]]
override_translit = true,
strip_diacritics = {
Geor = s["ka-stripdiacritics"],
Geok = s["ka-stripdiacritics"],
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["kg"] = {
"คองโก",
33702,
"bnt-kng",
"Latn",
}
m["ki"] = {
"เกโกโย",
33587,
"bnt-kka",
"Latn",
}
m["kj"] = {
"Kwanyama",
1405077,
"bnt-ova",
"Latn",
}
m["kk"] = {
"คาซัค",
9252,
"trk-kno",
"Cyrl, Latn, kk-Arab",
translit = {
Cyrl = {
from = {
"Ё", "ё", "Й", "й", "Нг", "нг", "Ӯ", "ӯ", -- 2 chars; are "Ӯ" and "ӯ" actually used?
"А", "а", "Ә", "ә", "Б", "б", "В", "в", "Г", "г", "Ғ", "ғ", "Д", "д", "Е", "е", "Ж", "ж", "З", "з", "И", "и", "К", "к", "Қ", "қ", "Л", "л", "М", "м", "Н", "н", "Ң", "ң", "О", "о", "Ө", "ө", "П", "п", "Р", "р", "С", "с", "Т", "т", "У", "у", "Ұ", "ұ", "Ү", "ү", "Ф", "ф", "Х", "х", "Һ", "һ", "Ц", "ц", "Ч", "ч", "Ш", "ш", "Щ", "щ", "Ъ", "ъ", "Ы", "ы", "І", "і", "Ь", "ь", "Э", "э", "Ю", "ю", "Я", "я", -- 1 char
},
to = {
"E", "e", "İ", "i", "Ñ", "ñ", "U", "u",
"A", "a", "Ä", "ä", "B", "b", "V", "v", "G", "g", "Ğ", "ğ", "D", "d", "E", "e", "J", "j", "Z", "z", "İ", "i", "K", "k", "Q", "q", "L", "l", "M", "m", "N", "n", "Ñ", "ñ", "O", "o", "Ö", "ö", "P", "p", "R", "r", "S", "s", "T", "t", "U", "u", "Ū", "ū", "Ü", "ü", "F", "f", "X", "x", "H", "h", "S", "s", "Ç", "ç", "Ş", "ş", "Ş", "ş", "", "", "Y", "y", "I", "ı", "", "", "É", "é", "Ü", "ü", "Ä", "ä",
}
}
},
-- override_translit = true,
sort_key = {
Cyrl = {
from = {"ә", "ғ", "ё", "қ", "ң", "ө", "ұ", "ү", "һ", "і"},
to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "у" .. p[2], "х" .. p[1], "ы" .. p[1]}
},
},
standard_chars = {
Cyrl = "АаӘәБбВвГгҒғДдЕеЁёЖжЗзИиЙйКкҚқЛлМмНнҢңОоӨөПпРрСсТтУуҰұҮүФфХхҺһЦцЧчШшЩщЪъЫыІіЬьЭэЮюЯя",
c.punc
},
}
m["kl"] = {
"กรีนแลนด์",
25355,
"esx-inu",
"Latn",
sort_key = {
from = {"æ", "ø", "å"},
to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]}
}
}
m["km"] = {
"เขมร",
9205,
"mkh-kmr",
"Khmr",
ancestors = "xhm",
translit = "Khmr-translit",
}
m["kn"] = {
"กันนฑะ",
33673,
"dra-kan",
"Knda, Tutg",
ancestors = "dra-mkn",
-- Knda translit in [[Module:scripts/data]]
}
m["ko"] = {
"เกาหลี",
9176,
"qfa-kor",
"Kore, Brai",
ancestors = "ko-ear",
translit = {
Kore = "ko-translit",
},
-- Kore strip_diacritics in [[Module:scripts/data]]
}
m["kr"] = {
"กานูรี",
36094,
"ssa-sah",
"Latn, Arab",
-- the sortkey and strip_diacritics are only for standard Kanuri; when dialectal entries get added, someone will have to work out how the dialects should be represented orthographically
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.breve}
},
sort_key = {
Latn = {
from = {"ǝ", "ny", "ɍ", "sh"},
to = {"e" .. p[1], "n" .. p[1], "r" .. p[1], "s" .. p[1]}
},
},
}
m["ks"] = {
"แคชเมียร์",
33552,
"inc-kas",
"ks-Arab, Deva, Shrd, Latn",
translit = {
["ks-Arab"] = "ks-Arab-translit",
Deva = "Deva-translit",
-- Shrd translit in [[Module:scripts/data]]
},
}
-- "kv" is treated as "koi", "kpv", see [[WT:LT]]
m["kw"] = {
"คอร์นวอลล์",
25289,
"cel-brs",
"Latn",
ancestors = "cnx",
sort_key = {
from = {"ch"},
to = {"c" .. p[1]}
},
}
m["ky"] = {
"คีร์กีซ",
9255,
"trk-kkp",
"Cyrl, Latn, Arab",
translit = {
Cyrl = "ky-translit"
},
override_translit = true,
sort_key = {
Cyrl = {
from = {"ё", "ң", "ө", "ү"},
to = {"е" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]}
},
},
}
m["la"] = {
"ละติน",
397,
"itc-laf",
"Latn, Ital",
ancestors = "itc-ola",
--translit = "la-translit", -- already handled in Module:headword & Module:links
-- Ital translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission)
display_text = {
Latn = s["itc-Latn-displaytext"]
},
strip_diacritics = {
Latn = s["itc-Latn-stripdiacritics"]
},
sort_key = {
Latn = s["itc-Latn-sortkey"]
},
standard_chars = {
Latn = "AaBbCcDdEeFfGgHhIiLlMmNnOoPpQqRrSsTtUuVvXx",
c.punc
},
}
m["lb"] = {
"ลักเซมเบิร์ก",
9051,
"gmw-hgm",
"Latn, Brai",
ancestors = "gmw-cfr",
sort_key = {
Latn = {
from = {"ä", "ë", "é"},
to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]}
},
},
}
m["lg"] = {
"ลูกันดา",
33368,
"bnt-nyg",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.circ},
sort_key = {
from = {"ŋ"},
to = {"n" .. p[1]}
},
}
m["li"] = {
"ลิมเบิร์ก",
102172,
"gmw-frk",
"Latn",
ancestors = "dum",
}
m["ln"] = {
"ลิงกาลา",
36217,
"bnt-bmo",
"Latn",
sort_key = {
remove_diacritics = c.acute .. c.circ .. c.caron,
from = {"ɛ", "gb", "mb", "mp", "nd", "ng", "nk", "ns", "nt", "ny", "nz", "ɔ"},
to = {"e" .. p[1], "g" .. p[1], "m" .. p[1], "m" .. p[2], "n" .. p[1], "n" .. p[2], "n" .. p[3], "n" .. p[4], "n" .. p[5], "n" .. p[6], "n" .. p[7], "o" .. p[1]}
},
}
m["lo"] = {
"ลาว",
9211,
"tai-swe",
"Laoo", -- also Tai Noi/Lao Buhan script
translit = "Laoo-translit",
--sort_key = "Laoo-sortkey",
standard_chars = "0-9ກຂຄງຈຊຍດຕຖທນບປຜຝພຟມຢຣລວສຫອຮຯ-ໝ" .. c.punc,
}
m["lt"] = {
"ลิทัวเนีย",
9083,
"bat-eas",
"Latn",
ancestors = "olt",
display_text = "lt-common",
strip_diacritics = "lt-common",
sort_key = "lt-common",
standard_chars = "AaĄąBbCcČčDdEeĘęĖėFfGgHhIiĮįYyJjKkLlMmNnOoPpRrSsŠšTtUuŲųŪūVvZzŽž" .. c.punc,
}
m["lu"] = {
"Luba-Katanga",
36157,
"bnt-lub",
"Latn",
}
m["lv"] = {
"ลัตเวีย",
9078,
"bat-eas",
"Latn",
strip_diacritics = {
-- This attempts to convert vowels with tone marks to vowels either with or without macrons. Specifically, there should be no macrons if the vowel is part of a diphthong (including resonant diphthongs such as pìrksts -> pirksts, not #pīrksts). What we do is first convert the vowel + tone mark to a vowel + tilde in a decomposed fashion, then remove the tilde in diphthongs, then convert the remaining vowel + tilde sequences to macroned vowels, then delete any other tilde. We leave already-macroned vowels alone: both ar and ār, for example, occur before consonants. FIXME: This still might not be sufficient.
from = {"([Ee])" .. c.cedilla, "[" .. c.grave .. c.circ .. c.tilde .."]", "([aAeEiIoOuU])" .. c.tilde .."?([lrnmuiLRNMUI])" .. c.tilde .. "?([^aAeEiIoOuU])", "([aAeEiIoOuU])" .. c.tilde .."?([lrnmuiLRNMUI])" .. c.tilde .."?$", "([iI])" .. c.tilde .. "?([eE])" .. c.tilde .. "?", "([aAeEiIuU])" .. c.tilde, c.tilde},
to = {"%1", c.tilde, "%1%2%3", "%1%2", "%1%2", "%1" .. c.macron}
},
sort_key = {
from = {"ā", "č", "ē", "ģ", "ī", "ķ", "ļ", "ņ", "š", "ū", "ž"},
to = {"a" .. p[1], "c" .. p[1], "e" .. p[1], "g" .. p[1], "i" .. p[1], "k" .. p[1], "l" .. p[1], "n" .. p[1], "s" .. p[1], "u" .. p[1], "z" .. p[1]}
},
standard_chars = "AaĀāBbCcČčDdEeĒēFfGgĢģHhIiĪīJjKkĶķLlĻļMmNnŅņOoPpRrSsŠšTtUuŪūVvZzŽž" .. c.punc,
}
m["mg"] = {
"มาลากาซี",
7930,
"poz-bre",
"Latn, Arab",
}
m["mh"] = {
"มาร์แชลล์",
36280,
"poz-mic",
"Latn",
sort_key = {
from = {"ā", "ļ", "m̧", "ņ", "n̄", "o̧", "ō", "ū"},
to = {"a" .. p[1], "l" .. p[1], "m" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "o" .. p[2], "u" .. p[1]}
},
}
m["mi"] = {
"มาวรี",
36451,
"poz-pep",
"Latn",
sort_key = {
remove_diacritics = c.macron,
from = {"ng", "wh"},
to = {"n" .. p[1], "w" .. p[1]}
},
}
m["mk"] = {
"มาซิโดเนีย",
9296,
"zls",
"Cyrl, Polyt",
ancestors = "cu",
translit = {
Cyrl = "mk-translit",
-- FIXME: formerly no translit specified for Polyt; unclear if the default [[Module:grc-translit]] is
-- acceptable, so we disable it for now
Polyt = false,
},
strip_diacritics = {
Cyrl = {
remove_diacritics = c.acute,
remove_exceptions = {"Ѓ", "ѓ", "Ќ", "ќ"}
},
},
sort_key = {
Cyrl = {
remove_diacritics = c.grave,
remove_exceptions = {"ѓ", "ќ"},
from = {"ѓ", "ѕ", "ј", "љ", "њ", "ќ", "џ"},
to = {"д" .. p[1], "з" .. p[1], "и" .. p[1], "л" .. p[1], "н" .. p[1], "т" .. p[1], "ч" .. p[1]}
},
},
-- Polyt display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
standard_chars = {
Cyrl = "АаБбВвГгДдЃѓЕеЖжЗзЅѕИиЈјКкЛлЉљМмНнЊњОоПпРрСсТтЌќУуФфХхЦцЧчЏџШш",
c.punc
},
}
m["ml"] = {
"มลยาฬัม",
36236,
"dra-mal",
"Mlym",
override_translit = true,
-- Mlym translit in [[Module:scripts/data]]
}
m["mn"] = {
"มองโกเลีย",
9246,
"xgn-cen",
"Cyrl, Mong, Latn, Brai",
ancestors = "cmg",
translit = {
Cyrl = "mn-translit",
-- Mong translit in [[Module:scripts/data]]
},
override_translit = true,
-- Mong display_text and strip_diacritics in [[Module:scripts/data]]
strip_diacritics = {
Cyrl = {remove_diacritics = c.grave .. c.acute},
},
sort_key = {
Cyrl = {
remove_diacritics = c.grave,
from = {"ё", "ө", "ү"},
to = {"е" .. p[1], "о" .. p[1], "у" .. p[1]}
},
},
standard_chars = {
Cyrl = "АаБбВвГгДдЕеЁёЖжЗзИиЙйЛлМмНнОоӨөРрСсТтУуҮүХхЦцЧчШшЫыЬьЭэЮюЯя—",
Brai = c.braille,
c.punc
},
}
-- "mo" is treated as "ro", see [[WT:LT]]
m["mr"] = {
"มราฐี",
1571,
"inc-sou",
"Deva, Modi",
ancestors = "omr",
translit = {
Deva = "Deva-translit",
Modi = "Modi-translit",
},
strip_diacritics = {
Deva = {
from = {"च़", "ज़", "झ़"},
to = {"च", "ज", "झ"}
},
},
}
m["ms"] = {
"มาเลเซีย",
9237,
"poz-mly",
"Latn, ms-Arab",
ancestors = "ms-cla",
standard_chars = {
Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
c.punc
},
}
m["mt"] = {
"มอลตา",
9166,
"sem-arb",
"Latn",
display_text = {
from = {"'"},
to = {"’"}
},
strip_diacritics = {
from = {"’"},
to = {"'"},
},
ancestors = "sqr",
sort_key = {
from = {
"ċ", "ġ", "ż", -- Convert into PUA so that decomposed form does not get caught by the next step.
"([cgz])", -- Ensure "c" comes after "ċ", "g" comes after "ġ" and "z" comes after "ż".
"g" .. p[1] .. "ħ", -- "għ" after initial conversion of "g".
p[3], p[4], "ħ", "ie", p[5] -- Convert "ċ", "ġ", "ħ", "ie", "ż" into final output.
},
to = {
p[3], p[4], p[5],
"%1" .. p[1],
"g" .. p[2],
"c", "g", "h" .. p[1], "i" .. p[1], "z"
}
},
}
m["my"] = {
"พม่า",
9228,
"tbq-brm",
"Mymr",
ancestors = "obr",
translit = "my-translit",
override_translit = true,
sort_key = {
from = {"ျ", "ြ", "ွ", "ှ", "ဿ"},
to = {"္ယ", "္ရ", "္ဝ", "္ဟ", "သ္သ"}
},
}
m["na"] = {
"นาอูรู",
13307,
"poz-mic",
"Latn",
}
m["nb"] = {
"นอร์เวย์แบบบุ๊กมอล",
25167,
"gmq",
"Latn",
wikimedia_codes = "no",
ancestors = "gmq-mno, da", -- da as an (but not the) ancestor of nb was agreed on - do not change without discussion
sort_key = s["no-sortkey"],
standard_chars = s["no-standardchars"],
}
m["nd"] = {
"Northern Ndebele",
35613,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
}
m["ne"] = {
"เนปาล",
33823,
"inc-pah",
"Deva, Newa",
translit = {
Deva = "Deva-translit",
Newa = "Newa-translit",
},
}
m["ng"] = {
"Ndonga",
33900,
"bnt-ova",
"Latn",
}
m["nl"] = {
"ดัตช์",
7411,
"gmw-frk",
"Latn, Brai",
ancestors = "dum",
sort_key = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'"},
},
standard_chars = {
Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZzÄäËëÏïÖöÜü",
Brai = c.braille,
c.punc
},
}
m["nn"] = {
"นอร์เวย์แบบนือนอสก์",
25164,
"gmq-wes",
"Latn",
ancestors = "gmq-mno",
strip_diacritics = {
remove_diacritics = c.grave .. c.acute,
},
sort_key = s["no-sortkey"],
standard_chars = s["no-standardchars"],
}
m["no"] = {
"นอร์เวย์",
9043,
"gmq-wes",
"Latn",
ancestors = "gmq-mno",
sort_key = s["no-sortkey"],
standard_chars = s["no-standardchars"],
}
m["nr"] = {
"Southern Ndebele",
36785,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
}
m["nv"] = {
"นาวาโฮ",
13310,
"apa",
"Latn, Brai",
sort_key = {
remove_diacritics = c.acute .. c.ogonek,
from = {
"chʼ", "tłʼ", "tsʼ", -- 3 chars
"ch", "dl", "dz", "gh", "hw", "kʼ", "kw", "sh", "tł", "ts", "zh", -- 2 chars
"ł", "ʼ" -- 1 char
},
to = {
"c" .. p[2], "t" .. p[2], "t" .. p[4],
"c" .. p[1], "d" .. p[1], "d" .. p[2], "g" .. p[1], "h" .. p[1], "k" .. p[1], "k" .. p[2], "s" .. p[1], "t" .. p[1], "t" .. p[3], "z" .. p[1],
"l" .. p[1], "z" .. p[2]
}
},
}
m["ny"] = {
"เจวา",
33273,
"bnt-nys",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.circ},
sort_key = {
from = {"ng'"},
to = {"ng"}
},
}
m["oc"] = {
"อุตซิตา",
14185,
"roa-ocr",
"Latn, Hebr",
ancestors = "pro",
sort_key = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla,
from = {"([lns])·h"},
to = {"%1h"}
},
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["oj"] = {
"โอจิบเว",
33875,
"alg",
"Cans, Latn",
sort_key = {
Latn = {
from = {"aa", "ʼ", "ii", "oo", "sh", "zh"},
to = {"a" .. p[1], "h" .. p[1], "i" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1]}
},
},
}
m["om"] = {
"ออโรโม",
33864,
"cus-eas",
"Latn, Ethi",
}
m["or"] = {
"โอริยา",
33810,
"inc-eas",
"Orya",
ancestors = "inc-mor",
translit = "Orya-translit",
}
m["os"] = {
"ออสซีเซีย",
33968,
"xsc-sar",
"Cyrl, Geor, Latn",
ancestors = "oos",
translit = {
Cyrl = "os-translit",
-- Geor translit in [[Module:scripts/data]]
},
override_translit = true,
display_text = {
Cyrl = {
from = {"æ"},
to = {"ӕ"}
},
Latn = {
from = {"ӕ"},
to = {"æ"}
},
},
strip_diacritics = {
Cyrl = {
remove_diacritics = c.grave .. c.acute,
from = {"æ"},
to = {"ӕ"}
},
Latn = {
from = {"ӕ"},
to = {"æ"}
},
},
sort_key = {
Cyrl = {
from = {"ӕ", "гъ", "дж", "дз", "ё", "къ", "пъ", "тъ", "хъ", "цъ", "чъ"},
to = {"а" .. p[1], "г" .. p[1], "д" .. p[1], "д" .. p[2], "е" .. p[1], "к" .. p[1], "п" .. p[1], "т" .. p[1], "х" .. p[1], "ц" .. p[1], "ч" .. p[1]}
},
},
}
m["pa"] = {
"ปัญจาบ",
58635,
"inc-pan",
"Guru, pa-Arab",
ancestors = "inc-opa",
translit = {
Guru = "Guru-translit",
["pa-Arab"] = "pa-Arab-translit",
},
strip_diacritics = {
["pa-Arab"] = {
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna,
from = {"ݨ", "ࣇ"},
to = {"ن", "ل"}
},
},
}
m["pi"] = {
"บาลี",
36727,
"inc-mid",
"Latn, Brah, Deva, Beng, Sinh, Mymr, Thai, Lana, Laoo, Khmr, Cakm", --and also Khom
ancestors = "sa",
translit = {
-- Brah translit in [[Module:scripts/data]]
Deva = "Deva-translit",
Beng = "Beng-translit",
Sinh = "Sinh-translit",
Mymr = "Mymr-translit",
--Thai = "pi-translit",
Lana = "Lana-translit",
Laoo = "Laoo-translit",
Khmr = "Khmr-translit",
Cakm = "Cakm-translit",
},
strip_diacritics = {
Thai = {
from = {"ึ", u(0xF700), u(0xF70F)}, -- FIXME: Not clear what's going on with the PUA characters here.
to = {"ิํ", "ฐ", "ญ"}
},
Mymr = {
remove_diacritics = c.VS01,
},
},
sort_key = { -- FIXME: This needs to be converted into the current standardized format.
from = {"ā", "ī", "ū", "ḍ", "ḷ", "m[" .. c.dotabove .. c.dotbelow .. "]", "ṅ", "ñ", "ṇ", "ṭ", "([เโ])([ก-ฮ])", "([ເໂ])([ກ-ຮ])", "ᩔ", "ᩕ", "ᩖ", "ᩘ", "([ᨭ-ᨱ])ᩛ", "([ᨷ-ᨾ])ᩛ", "ᩤ", u(0xFE00), u(0x200D)},
to = {"a~", "i~", "u~", "d~", "l~", "m~", "n~", "n~~", "n~~~", "t~", "%2%1", "%2%1", "ᩈ᩠ᩈ", "᩠ᩁ", "᩠ᩃ", "ᨦ᩠", "%1᩠ᨮ", "%1᩠ᨻ", "ᩣ"}
},
}
m["pl"] = {
"โปแลนด์",
809,
"zlw-lch",
"Latn",
ancestors = "zlw-mpl",
sort_key = {
from = {"ą", "ć", "ę", "ł", "ń", "ó", "ś", "ź", "ż"},
to = {"a" .. p[1], "c" .. p[1], "e" .. p[1], "l" .. p[1], "n" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1], "z" .. p[2]}
},
standard_chars = "AaĄąBbCcĆćDdEeĘęFfGgHhIiJjKkLlŁłMmNnŃńOoÓóPpRrSsŚśTtUuWwYyZzŹźŻż" .. c.punc,
}
m["ps"] = {
"ปาทาน",
58680,
"ira-pat",
"ps-Arab",
strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.zwarakay .. c.superalef},
}
m["pt"] = {
"โปรตุเกส",
5146,
"roa-gap",
"Latn, Brai",
sort_key = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.diaer .. c.cedilla,
from = {"ª", "æ", "º", "œ"},
to = {"a", "ae", "o", "oe"}
},
},
standard_chars = {
Latn = "AaÁáÂâÃãBbCcÇçDdEeÉéÊêFfGgHhIiÍíJjLlMmNnOoÓóÔôÕõPpQqRrSsTtUuÚúVvXxZz",
Brai = c.braille,
c.punc
},
}
m["qu"] = {
"เกชัว",
5218,
"qwe",
"Latn",
}
m["rm"] = {
"โรมานช์",
13199,
"roa-rhe",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.small_e},
}
m["ro"] = {
"โรมาเนีย",
7913,
"roa-eas",
"Latn, Cyrl, Cyrs",
translit = {
Cyrl = "ro-translit"
},
sort_key = {
Latn = {
remove_diacritics = c.grave .. c.acute,
from = {"ă", "â", "î", "ș", "ț"},
to = {"a" .. p[1], "a" .. p[2], "i" .. p[1], "s" .. p[1], "t" .. p[1]}
},
Cyrl = {
from = {"ӂ"},
to = {"ж" .. p[1]}
},
},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]; presumably not present
standard_chars = {
Latn = "AaĂăÂâBbCcDdEeFfGgHhIiÎîJjLlMmNnOoPpRrSsȘșTtȚțUuVvXxZz",
Cyrl = "АаБбВвГгДдЕеЖжӁӂЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЫыЬьЭэЮюЯя",
c.punc
},
}
m["ru"] = {
"รัสเซีย",
7737,
"zle",
"Cyrl, Brai",
ancestors = "zle-mru",
translit = {
Cyrl = "ru-translit-Thai"
},
display_text = {
Cyrl = {
from = {"'"},
to = {"’"}
},
},
strip_diacritics = {
Cyrl = {
remove_diacritics = c.grave .. c.acute .. c.diaer,
remove_exceptions = {"Ё", "ё", "Ѣ̈", "ѣ̈", "Я̈", "я̈"},
from = {"’"},
to = {"'"},
},
},
sort_key = {
Cyrl = {
remove_diacritics = c.grave .. c.acute .. c.diaer,
from = {
"і", "ѣ", "ѳ", "ѵ"
},
to = {
"и" .. p[1], "ь" .. p[1], "я" .. p[2], "я" .. p[3]
}
},
},
standard_chars = {
Cyrl = "АаБбВвГгДдЕеЁёЖжЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЪъЫыЬьЭэЮюЯя—",
Brai = c.braille,
(c.punc:gsub("'", "")) -- Exclude apostrophe.
},
}
m["rw"] = {
"รวันดา-รุนดี",
3217514,
"bnt-glb",
"Latn",
strip_diacritics = {remove_diacritics = c.acute .. c.circ .. c.macron .. c.caron},
}
m["sa"] = {
"สันสกฤต",
11059,
"inc",
"as-Beng, Bali, Beng, Bhks, Brah, Mymr, xwo-Mong, Deva, Gujr, Guru, Gran, Hani, Java, Kthi, Knda, Kawi, Khar, Khmr, Laoo, Mlym, mnc-Mong, Marc, Modi, Mong, Nand, Newa, Orya, Phag, Ranj, Saur, Shrd, Sidd, Sinh, Soyo, Lana, Takr, Taml, Tang, Telu, Thai, Tibt, Tutg, Tirh, Zanb", --and also Khom; script codes sorted by canonical name rather than code for [[MOD:sa-convert]]
translit = {
Beng = "Beng-translit",
["as-Beng"] = "Beng-translit",
-- Brah translit in [[Module:scripts/data]]
Deva = "Deva-translit",
Gujr = "Gujr-translit",
Guru = "Guru-translit",
Java = "Java-translit",
Kthi = "Kthi-translit",
Khmr = "Khmr-translit",
Knda = "Knda-translit",
Lana = "Lana-translit",
Laoo = "Laoo-translit",
Mlym = "Mlym-translit",
Modi = "Modi-translit",
-- Mong, mnc-Mong, xwo-Mong translit in [[Module:scripts/data]]
-- NOTE: Formerly used xal-translit for transliterating xwo-Mong but that only handles Cyrillic; it has
-- code to transliterate xwo-Mong but it's broken so I've replaced it with the default xwo-translit.
Mymr = "Mymr-translit",
Orya = "Orya-translit",
-- Shrd translit in [[Module:scripts/data]]
-- Sidd translit in [[Module:scripts/data]]
Sinh = "Sinh-translit",
--Thai = "pi-translit",
Taml = "Taml-translit",
Telu = "Telu-translit",
-- Tibt translit in [[Module:scripts/data]]
},
-- Mong display_text and strip_diacritics in [[Module:scripts/data]]
-- Tibt display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
strip_diacritics = {
Thai = {
from = {"ึ", u(0xF700), u(0xF70F)}, -- FIXME: Not clear what's going on with the PUA characters here.
to = {"ิํ", "ฐ", "ญ"}
},
Mymr = {
remove_diacritics = c.VS01,
},
Deva = {
remove_diacritics = c.udatta .. c.anudatta,
},
},
sort_key = {
Latn = {
from = {"ā", "ī", "ū", "ḍ", "ḷ", "ḹ", "m[" .. c.dotabove .. c.dotbelow .. "]", "ṅ", "ñ", "ṇ", "ṛ", "ṝ", "ś", "ṣ", "ṭ"},
to = {"a~", "i~", "u~", "d~", "l~", "l~~", "m~", "n~", "n~~", "n~~~", "r~", "r~~", "s~", "s~~", "t~"},
},
--Thai = "Thai-sortkey",
--Laoo = "Laoo-sortkey",
Lana = { -- Tai Tham
from = {"ᩔ", "ᩕ", "ᩖ", "ᩘ", "([ᨭ-ᨱ])ᩛ", "([ᨷ-ᨾ])ᩛ", "ᩤ"},
to = {"ᩈ᩠ᩈ", "᩠ᩁ", "᩠ᩃ", "ᨦ᩠", "%1᩠ᨮ", "%1᩠ᨻ", "ᩣ"},
},
Mymr = {
remove_diacritics = c.VS01,
},
-- FIXME: The previous sort key which mixed all scripts removed ZWJ; I don't know which script(s) this was
-- intended for and there are no other languages which remove it in the sort key AFAIK. If it needs to be
-- removed, specify the script(s) it needs to be removed under or add handling for the "all" script that applies
-- regardless of script.
--all = {
-- remove_diacritics = c.ZWJ,
--},
},
}
m["sc"] = {
"ซาร์ดิเนีย",
33976,
"roa-sou",
"Latn",
}
m["sd"] = {
"สินธ์",
33997,
"inc-snd",
"sd-Arab, Deva, Sind, Khoj",
translit = {
Deva = "Deva-translit",
Sind = "Sind-translit",
["sd-Arab"] = "sd-Arab-translit"
},
strip_diacritics = {
["sd-Arab"] = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {"ٱ"},
to = {"ا"}
},
},
}
m["se"] = {
"ซามีเหนือ",
33947,
"smi",
"Latn",
display_text = {
from = {"'"},
to = {"ˈ"}
},
strip_diacritics = {remove_diacritics = c.macron .. c.dotbelow .. "'ˈ"},
sort_key = {
from = {"á", "č", "đ", "ŋ", "š", "ŧ", "ž"},
to = {"a" .. p[1], "c" .. p[1], "d" .. p[1], "n" .. p[1], "s" .. p[1], "t" .. p[1], "z" .. p[1]}
},
standard_chars = "AaÁáBbCcČčDdĐđEeFfGgHhIiJjKkLlMmNnŊŋOoPpRrSsŠšTtŦŧUuVvZzŽž" .. c.punc,
}
m["sg"] = {
"ซังโก",
33954,
"crp",
"Latn",
ancestors = "ngb",
}
m["sh"] = {
"เซอร์โบ-โครเอเชีย",
9301,
"zls",
"Latn, Cyrl, Glag, Arab",
ietf_subtag = "hbs", -- ISO 639-3 code, since "sh" is deprecated from ISO 639-1
wikimedia_codes = "sh, bs, hr, sr",
strip_diacritics = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {"Ć", "ć", "Ś", "ś", "Ź", "ź"}
},
Cyrl = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {"З́", "з́", "С́", "с́"}
},
},
sort_key = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {"ć", "ś", "ź"},
from = {"č", "ć", "dž", "đ", "lj", "nj", "š", "ś", "ž", "ź"},
to = {"c" .. p[1], "c" .. p[2], "d" .. p[1], "d" .. p[2], "l" .. p[1], "n" .. p[1], "s" .. p[1], "s" .. p[2], "z" .. p[1], "z" .. p[2]}
},
Cyrl = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {"з́", "с́"},
from = {"ђ", "з́", "ј", "љ", "њ", "с́", "ћ", "џ"},
to = {"д" .. p[1], "з" .. p[1], "и" .. p[1], "л" .. p[1], "н" .. p[1], "с" .. p[1], "т" .. p[1], "ч" .. p[1]}
},
},
standard_chars = {
Latn = "AaBbCcČčĆćDdĐđEeFfGgHhIiJjKkLlMmNnOoPpRrSsŠšTtUuVvZzŽž",
Cyrl = "АаБбВвГгДдЂђЕеЖжЗзИиЈјКкЛлЉљМмНнЊњОоПпРрСсТтЋћУуФфХхЦцЧчЏџШш",
c.punc
},
}
m["si"] = {
"สิงหล",
13267,
"inc-ins",
"Sinh",
translit = "Sinh-translit",
override_translit = true,
}
m["sk"] = {
"สโลวัก",
9058,
"zlw",
"Latn",
ancestors = "zlw-osk",
sort_key = {remove_diacritics = c.acute .. c.circ .. c.diaer .. c.caron},
standard_chars = "AaÁáÄäBbCcČčDdĎďEeÉéFfGgHhIiÍíJjKkLlĹĺĽľMmNnŇňOoÓóÔôPpRrŔŕSsŠšTtŤťUuÚúVvYyÝýZzŽž" .. c.punc,
}
m["sl"] = {
"สโลวีเนีย",
9063,
"zls",
"Latn",
strip_diacritics = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.dgrave .. c.invbreve .. c.dotbelow,
remove_exceptions = {"Ć", "ć", "Ǵ", "ǵ", "Ś", "ś", "Ź", "ź"},
from = {"Ə", "ə", "Ł", "ł"},
to = {"E", "e", "L", "l"},
},
sort_key = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dotabove .. c.ringabove .. c.dgrave .. c.invbreve .. c.dotbelow .. c.ringbelow .. c.ogonek,
remove_exceptions = {"ć", "ǵ", "ś", "ź"},
from = {"ä", "č", "ć", "đ", "ə", "ë", "ǧ", "ǵ", "ï", "ł", "ö", "š", "ś", "ü", "ž", "ź"},
to = {"a" .. p[1], "c" .. p[1], "c" .. p[2], "d" .. p[1], "e", "e" .. p[1], "g" .. p[1], "g" .. p[2], "i" .. p[1], "l", "o" .. p[1], "s" .. p[1], "s" .. p[2], "u" .. p[1], "z" .. p[1], "z" .. p[2]},
},
standard_chars = "AaBbCcČčDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsŠšTtUuVvZzŽž" .. c.punc,
}
m["sm"] = {
"ซามัว",
34011,
"poz-pnp",
"Latn",
}
m["sn"] = {
"โชนา",
34004,
"bnt-sho",
"Latn",
strip_diacritics = {remove_diacritics = c.acute},
}
m["so"] = {
"โซมาลี",
13275,
"cus-som",
"Latn, Arab, Osma",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}
},
}
m["sq"] = {
"แอลเบเนีย",
8748,
"sqj",
"Latn, Grek, ota-Arab, Elba, Todr, Vith",
translit = {
Elba = "Elba-translit",
},
-- Grek display_text, sort_key in [[Module:scripts/data]]
strip_diacritics = {
Latn = {
remove_diacritics = c.acute .. c.circ,
from = {'^[ie] (%w)', '^të (%w)'}, to = {'%1', '%1'},
},
Grek = { -- Diacritic removal from Grek-stripdiacritics excluded.
from = m_langdata.chars_substitutions["Grek-stripdiacritics"].from,
to = m_langdata.chars_substitutions["Grek-stripdiacritics"].to,
},
},
sort_key = {
Latn = {
remove_diacritics = c.acute .. c.circ .. c.tilde .. c.breve .. c.caron,
from = {'^[ie] (%w)', '^të (%w)', 'ç', 'dh', 'ë', 'gj', 'll', 'nj', 'rr', 'sh', 'th', 'xh', 'zh'},
to = {'%1', '%1', 'c'..p[1], 'd'..p[1], 'e'..p[1], 'g'..p[1], 'l'..p[1], 'n'..p[1], 'r'..p[1], 's'..p[1], 't'..p[1], 'x'..p[1], 'z'..p[1]},
}
-- TODO: Grek if the default sort key is unsuitable
},
standard_chars = {
Latn = "AaBbCcÇçDdEeËëFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvXxYyZz",
c.punc
},
}
m["ss"] = {
"Swazi",
34014,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
}
m["st"] = {
"ซูทู",
34340,
"bnt-sts",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
}
m["su"] = {
"ซุนดา",
34002,
"poz-msa",
"Latn, Sund, Arab",
ancestors = "osn",
translit = {
Sund = "Sund-translit"
},
}
m["sv"] = {
"สวีเดน",
9027,
"gmq-eas",
"Latn",
ancestors = "gmq-osw-lat",
sort_key = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dacute .. c.caron .. c.cedilla .. "':",
remove_exceptions = {"å"},
from = {"ø", "æ", "œ", "ß", "å", "aͤ", "oͤ"},
to = {"o", "ae", "oe", "ss", "z" .. p[1], "ä", "ö"}
},
standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvXxYyÅåÄäÖö" .. c.punc,
}
m["sw"] = {
"สวาฮีลี",
7838,
"bnt-swh",
"Latn, Arab",
sort_key = {
Latn = {
from = {"ng'"},
to = {"ng" .. p[1]}
},
},
}
m["ta"] = {
"ทมิฬ",
5885,
"dra-tam",
"Taml",
ancestors = "ta-mid",
translit = "Taml-translit",
override_translit = true,
}
m["te"] = {
"เตลูกู",
8097,
"dra-tel",
"Telu",
translit = "Telu-translit",
override_translit = true,
}
m["tg"] = {
"ทาจิก",
9260,
"ira-swi",
"Cyrl, fa-Arab, Latn",
ancestors = "fa-cls",
translit = {
Cyrl = "tg-translit"
},
override_translit = true,
strip_diacritics = {
Cyrl = s["tg-stripdiacritics"],
Latn = s["tg-stripdiacritics"],
},
sort_key = {
Cyrl = {
from = {"ғ", "ё", "ӣ", "қ", "ӯ", "ҳ", "ҷ"},
to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "к" .. p[1], "у" .. p[1], "х" .. p[1], "ч" .. p[1]}
},
},
}
m["th"] = {
"ไทย",
9217,
"tai-swe",
"Thai, Khomt, Brai",
--translit = {
-- Thai = "th-translit"
--},
--sort_key = {
-- Thai = "Thai-sortkey"
--},
}
m["ti"] = {
"ทือกรึญญา",
34124,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["tk"] = {
"เติร์กเมน",
9267,
"trk-ogz",
"Latn, Cyrl, Arab",
strip_diacritics = {
Latn = s["tk-stripdiacritics"],
Cyrl = s["tk-stripdiacritics"],
},
sort_key = {
Latn = {
from = {"ç", "ä", "ž", "ň", "ö", "ş", "ü", "ý"},
to = {"c" .. p[1], "e" .. p[1], "j" .. p[1], "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1], "y" .. p[1]}
},
Cyrl = {
from = {"ё", "җ", "ң", "ө", "ү", "ә"},
to = {"е" .. p[1], "ж" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "э" .. p[1]}
},
},
ancestors = "trk-eog",
}
m["tl"] = {
"ตากาล็อก",
34057,
"phi",
"Latn, Tglg",
translit = {
Tglg = "tl-translit"
},
override_translit = true,
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}
},
standard_chars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {
Latn = "tl-sortkey",
},
}
m["tn"] = {
"สวานา",
34137,
"bnt-sts",
"Latn",
}
m["to"] = {
"ตองงา",
34094,
"poz-ton",
"Latn",
strip_diacritics = {remove_diacritics = c.acute},
sort_key = {remove_diacritics = c.macron},
}
m["tr"] = {
"ตุรกี",
256,
"trk-ogz",
"Latn",
ancestors = "ota",
dotted_dotless_i = true,
sort_key = {
from = {
-- Ignore circumflex, but account for capital Î wrongly becoming ı + circ due to dotted dotless I logic.
"ı" .. c.circ, c.circ,
"i", -- Ensure "i" comes after "ı".
"ç", "ğ", "ı", "ö", "ş", "ü"
},
to = {
"i", "",
"i" .. p[1],
"c" .. p[1], "g" .. p[1], "i", "o" .. p[1], "s" .. p[1], "u" .. p[1]
}
},
standard_chars = "AaÂâBbCcÇçDdEeFfGgĞğHhIıİiÎîJjKkLlMmNnOoÖöPpRrSsŞşTtUuÛûÜüVvYyZz" .. c.punc,
}
m["ts"] = {
"Tsonga",
34327,
"bnt-tsr",
"Latn",
}
m["tt"] = {
"ตาตาร์",
25285,
"trk-kbu",
"Cyrl, Latn, tt-Arab",
translit = {
Cyrl = "tt-translit",
["tt-Arab"] = "tt-translit"
},
--override_translit = true, -- enable override until Module code can detect Russian loans such as [[аэропорт]]
dotted_dotless_i = true,
sort_key = {
Cyrl = {
from = {"ә", "ў", "ғ", "ё", "җ", "қ", "ң", "ө", "ү", "һ"},
to = {"а" .. p[1], "в" .. p[1], "г" .. p[1], "е" .. p[1], "ж" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "х" .. p[1]}
},
Latn = {
from = {
"i", -- Ensure "i" comes after "ı".
"ä", "ə", "ç", "ğ", "ı", "ñ", "ŋ", "ö", "ɵ", "ş", "ü"
},
to = {
"i" .. p[1],
"a" .. p[1], "a" .. p[2], "c" .. p[1], "g" .. p[1], "i", "n" .. p[1], "n" .. p[2], "o" .. p[1], "o" .. p[2], "s" .. p[1], "u" .. p[1]
}
},
},
}
-- "tw" is treated as "ak", see [[WT:LT]]
m["ty"] = {
"ตาฮีตี",
34128,
"poz-pep",
"Latn",
}
m["ug"] = {
"อุยกูร์",
13263,
"trk-kar",
"ug-Arab, Latn, Cyrl",
ancestors = "chg",
translit = {
["ug-Arab"] = "ug-translit-Thai",
--Cyrl = "ug-translit",
},
override_translit = true,
}
m["uk"] = {
"ยูเครน",
8798,
"zle",
"Cyrl",
ancestors = "zle-muk",
translit = "uk-translit-Thai",
strip_diacritics = {remove_diacritics = c.grave .. c.acute},
sort_key = {
remove_diacritics = c.grave .. c.acute,
from = {
"ї", -- 2 chars
"ґ", "є", "і" -- 1 char
},
to = {
"и" .. p[2],
"г" .. p[1], "е" .. p[1], "и" .. p[1]
}
},
standard_chars = "АаБбВвГгДдЕеЄєЖжЗзИиІіЇїЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЬьЮюЯя" .. c.punc:gsub("'", ""), -- Exclude apostrophe.
}
m["ur"] = {
"อูรดู",
1617,
"inc-hnd",
"ur-Arab, Hebr",
translit = {
["ur-Arab"] = "ur-translit"
},
strip_diacritics = {
["ur-Arab"] = {
-- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif
from = {"هٔ", "ۂ", "ٱ"},
to = {"ہ", "ہ", "ا"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
standard_chars = {
["ur-Arab"] = "ایببپتثجچحخدذرزژسشصضطظعغفقکگلࣇڷمنݨوؤہھئٹڈڑآے",
c.punc,
},
}
m["uz"] = {
"อุซเบก",
9264,
"trk-kar",
"Latn, Cyrl, fa-Arab",
ancestors = "chg",
translit = {
Cyrl = "uz-translit"
},
sort_key = {
Latn = {
from = {"oʻ", "gʻ", "sh", "ch", "ng"},
to = {"z" .. p[1], "z" .. p[2], "z" .. p[3], "z" .. p[4], "z" .. p[5]}
},
Cyrl = {
from = {"ё", "ў", "қ", "ғ", "ҳ"},
to = {"е" .. p[1], "я" .. p[1], "я" .. p[2], "я" .. p[3], "я" .. p[4]}
},
},
strip_diacritics = {
["fa-Arab"] = "ar-stripdiacritics",
},
}
m["ve"] = {
"เวนดา",
32704,
"bnt-bso",
"Latn",
}
m["vi"] = {
"เวียดนาม",
9199,
"mkh-vie",
"Latn, Hani",
ancestors = "mkh-mvi",
--translit = {Latn = "vi-translit"}, -- already handled in Module:headword & Module:links
sort_key = {
Latn = "vi-sortkey",
Hani = "Hani-sortkey",
},
}
m["vo"] = {
"โวลาปุก",
36986,
"art",
"Latn",
}
m["wa"] = {
"วัลลูน",
34219,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["wo"] = {
"โวลอฟ",
34257,
"alv-fwo",
"Latn, Arab, Gara",
}
m["xh"] = {
"โคซา",
13218,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
}
m["yi"] = {
"ยิดดิช",
8641,
"gmw-hgm",
"Hebr, Latn",
ancestors = "gmh",
translit = {
Hebr = "yi-translit",
},
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["yo"] = {
"โยรูบา",
34311,
"alv-yor",
"Latn, Arab",
strip_diacritics = {
Latn = {remove_diacritics = c.grave .. c.acute .. c.macron}
},
sort_key = {
Latn = {
from = {"ẹ", "ɛ", "gb", "ị", "kp", "ọ", "ɔ", "ṣ", "sh", "ụ"},
to = {"e" .. p[1], "e" .. p[1], "g" .. p[1], "i" .. p[1], "k" .. p[1], "o" .. p[1], "o" .. p[1], "s" .. p[1], "s" .. p[1], "u" .. p[1]}
},
},
}
m["za"] = {
"จ้วง",
13216,
"tai",
"Latn, Hani",
sort_key = {
Latn = "za-sortkey",
Hani = "Hani-sortkey",
},
}
m["zh"] = {
"จีน",
7850,
"zhx",
"Hants, Latn, Bopo, Nshu, Brai",
ancestors = "ltc",
generate_forms = "zh-generateforms",
translit = {
Hani = "zh-translit",
Bopo = "zh-translit",
},
sort_key = {
Hani = "Hani-sortkey"
},
}
m["zu"] = {
"ซูลู",
10179,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron},
}
return require("Module:languages").finalizeData(m, "language")
8aik8yyt7fo2dzxsbmj7ih4oxgzah6t
มอดูล:languages
828
36388
5720749
5676884
2026-04-21T07:00:44Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars)
5720749
Scribunto
text/plain
--[==[ intro:
This module implements fetching of language-specific information and processing text in a given language.
===Types of languages===
There are two types of languages: full languages and etymology-only languages. The essential difference is that only
full languages appear in L2 headings in vocabulary entries, and hence categories like [[:Category:French nouns]] exist
only for full languages. Etymology-only languages have either a full language or another etymology-only language as
their parent (in the parent-child inheritance sense), and for etymology-only languages with another etymology-only
language as their parent, a full language can always be derived by following the parent links upwards. For example,
"Canadian French", code `fr-CA`, is an etymology-only language whose parent is the full language "French", code `fr`.
An example of an etymology-only language with another etymology-only parent is "Northumbrian Old English", code
`ang-nor`, which has "Anglian Old English", code `ang-ang` as its parent; this is an etymology-only language whose
parent is "Old English", code `ang`, which is a full language. (This is because Northumbrian Old English is considered
a variety of Anglian Old English.) Sometimes the parent is the "Undetermined" language, code `und`; this is the case,
for example, for "substrate" languages such as "Pre-Greek", code `qsb-grc`, and "the BMAC substrate", code `qsb-bma`.
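The upward walk from an etymology-only language to its full language can be sketched as follows. This is only an illustration: the tables `parent_of` and `is_full` and the function `get_full_code` are hypothetical stand-ins, not the layout of the real data modules, and the codes are the ones used in the examples above.

```lua
-- Hypothetical toy data; the real data modules are laid out differently.
local parent_of = {
	["fr-CA"] = "fr",
	["ang-nor"] = "ang-ang",
	["ang-ang"] = "ang",
}
local is_full = {fr = true, ang = true}

-- Follow parent links upwards until a full language is reached.
-- Returns nil if the chain ends without reaching one.
local function get_full_code(code)
	local seen = {}
	while not is_full[code] do
		if seen[code] then
			error("parent loop at " .. code)
		end
		seen[code] = true
		code = parent_of[code]
		if code == nil then
			return nil
		end
	end
	return code
end
```

With this toy data, `get_full_code("ang-nor")` passes through `ang-ang` before arriving at `ang`.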
It is important to distinguish language ''parents'' from language ''ancestors''. The parent-child relationship is one
of containment, i.e. if X is a child of Y, X is considered a variety of Y. On the other hand, the ancestor-descendant
relationship is one of descent in time. For example, "Classical Latin", code `la-cla`, and "Late Latin", code `la-lat`,
are both etymology-only languages with "Latin", code `la`, as their parent, because both of the former are varieties
of Latin. However, Late Latin does *NOT* have Classical Latin as its parent because Late Latin is *not* a variety of
Classical Latin; rather, it is a descendant. There is in fact a separate `ancestors` field that is used to express the
ancestor-descendant relationship, and Late Latin's ancestor is given as Classical Latin. It is also important to note
that sometimes an etymology-only language is actually the conceptual ancestor of its parent language. This happens,
for example, with "Old Italian" (code `roa-oit`), which is an etymology-only variant of full language "Italian" (code
`it`), and with "Old Latin" (code `itc-ola`), which is an etymology-only variant of Latin. In both cases, the full
language has the etymology-only variant listed as an ancestor. This allows a Latin term to inherit from Old Latin
using the {{tl|inh}} template (where in this template, "inheritance" refers to ancestral inheritance, i.e. inheritance
in time, rather than in the parent-child sense); likewise for Italian and Old Italian.
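A minimal sketch of keeping the two relationships apart, assuming a hypothetical table `langs` with separate `parent` and `ancestors` fields (the field layout and both helper functions are invented for illustration; only the codes come from the examples above):

```lua
-- Hypothetical toy data mirroring the Latin examples above.
local langs = {
	["la"] = {ancestors = {"itc-ola"}},
	["la-cla"] = {parent = "la"},
	["la-lat"] = {parent = "la", ancestors = {"la-cla"}},
	["itc-ola"] = {parent = "la"},
}

-- Containment: is `code` a variety of `other`, following parent links?
local function is_variety_of(code, other)
	local parent = langs[code] and langs[code].parent
	while parent do
		if parent == other then
			return true
		end
		parent = langs[parent] and langs[parent].parent
	end
	return false
end

-- Descent in time: is `other` listed directly as an ancestor of `code`?
local function has_direct_ancestor(code, other)
	local data = langs[code]
	for _, ancestor in ipairs(data and data.ancestors or {}) do
		if ancestor == other then
			return true
		end
	end
	return false
end
```

In this sketch Late Latin is a variety of Latin but not of Classical Latin, while Classical Latin is its ancestor; Old Latin is at once a variety of Latin and an ancestor of it.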
Full languages come in three subtypes:
* {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the
main namespace. There may also be reconstructed terms for the language, which are placed in the
{Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages
are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük,
among others) are also allowed in the mainspace and considered regular languages.
* {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the
{Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed with
*. Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for
constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in
the Appendix namespace, but they are not reconstructed and therefore should not have *
prefixed in links. Most constructed languages are of this subtype.
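The consequences of the three subtypes for where a term lives and whether it takes an asterisk can be summarized in a short sketch. The function `term_placement` and its arguments are hypothetical and only restate the rules listed above; they are not part of this module.

```lua
-- Returns the namespace for a term and whether it must be prefixed with *.
-- `subtype` is one of the three subtypes above; `is_reconstruction` says
-- whether the individual term is a reconstructed one.
local function term_placement(subtype, is_reconstruction)
	if subtype == "reconstructed" or (subtype == "regular" and is_reconstruction) then
		return "Reconstruction", true
	elseif subtype == "appendix-constructed" then
		return "Appendix", false
	elseif subtype == "regular" then
		return "", false -- the empty string stands for the main namespace
	end
	error("unknown subtype: " .. tostring(subtype))
end
```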
Both full languages and etymology-only languages have a {Language} object associated with them, which is fetched using
the {getByCode} function in [[Module:languages]] to convert a language code to a {Language} object. Depending on the
options supplied to this function, etymology-only languages may or may not be accepted, and family codes may be
accepted (returning a {Family} object as described in [[Module:families]]). There are also separate {getByCanonicalName}
functions in [[Module:languages]] and [[Module:etymology languages]] to convert a language's canonical name to a
{Language} object (depending on whether the canonical name refers to a full or etymology-only language).
===Textual representations===
Textual strings belonging to a given language come in several different ''text variants'':
# The ''input text'' is what the user supplies in wikitext, in the parameters to {{tl|m}}, {{tl|l}}, {{tl|ux}},
{{tl|t}}, {{tl|lang}} and the like.
# The ''corrected input text'' is the input text with some corrections and/or normalizations applied, such as
bad-character replacements for certain languages, like replacing `l` or `1` with [[palochka]] in some languages written
in Cyrillic. (FIXME: This currently goes under the name ''display text'' but that will be repurposed below. Also,
[[User:Surjection]] suggests renaming this to ''normalized input text'', but "normalized" is used in a different sense
in [[Module:usex]].)
# The ''display text'' is the text in the form as it will be displayed to the user. This is what appears in headwords,
in usexes, in displayed internal links, etc. This can include accent marks that are removed to form the stripped
display text (see below), as well as embedded bracketed links that are variously processed further. The display text
is generated from the corrected input text by applying language-specific transformations; for most languages, there
will be no such transformations. The general reason for having a difference between input and display text is to allow
for extra information in the input text that is not displayed to the user but is sent to the transliteration module.
Note that having different display and input text is only supported currently through special-casing but will be
generalized. Examples of transformations are: (1) Removing the {{cd|^}} that is used in certain East Asian (and
possibly other unicameral) languages to indicate capitalization of the transliteration (which is currently
special-cased); (2) for Korean, removing or otherwise processing hyphens (which is currently special-cased); (3) for
Arabic, removing a ''sukūn'' diacritic placed over a ''tāʔ marbūṭa'' (like this: ةْ) to indicate that the
''tāʔ marbūṭa'' is pronounced and transliterated as /t/ instead of being silent [NOTE, NOT IMPLEMENTED YET]; (4) for
Thai and Khmer, converting space-separated words to bracketed words and resolving respelling substitutions such as
`[กรีน/กฺรีน]`, which indicate how to transliterate given words [NOTE, NOT IMPLEMENTED YET except in language-specific
templates like {{tl|th-usex}}].
## The ''right-resolved display text'' is the result of removing brackets around one-part embedded links and resolving
two-part embedded links into their right-hand components (i.e. converting two-part links into the displayed form).
The process of right-resolution is what happens when you call {{cd|remove_links()}} in [[Module:links]] on some text.
When applied to the display text, it produces exactly what the user sees, without any link markup.
# The ''stripped display text'' is the result of applying diacritic-stripping to the display text.
## The ''left-resolved stripped display text'' [NEED BETTER NAME] is the result of applying left-resolution to the
stripped display text, i.e. similar to right-resolution but resolving two-part embedded links into their left-hand
components (i.e. the linked-to page). If the display text refers to a single page, applying diacritic stripping and
left-resolution produces the ''logical pagename''.
# The ''physical pagename text'' is the result of converting the stripped display text into physical page links. If the
stripped display text contains embedded links, the left side of those links is converted into physical page links;
otherwise, the entire text is considered a pagename and converted in the same fashion. The conversion does three
things: (1) converts characters not allowed in pagenames into their "unsupported title" representation, e.g.
{{cd|Unsupported titles/`gt`}} in place of the logical name {{cd|>}}; (2) handles certain special-cased
unsupported-title logical pagenames, such as {{cd|Unsupported titles/Space}} in place of {{cd|[space]}} and
{{cd|Unsupported titles/Ancient Greek dish}} in place of a very long Greek name for a gourmet dish as found in
Aristophanes; (3) converts "mammoth" pagenames such as [[a]] into their appropriate split component, e.g.
[[a/languages A to L]].
# The ''source translit text'' is the text as supplied to the language-specific {{cd|transliterate()}} method. The form
of the source translit text may need to be language-specific, e.g. Thai and Khmer will need the corrected input text,
whereas other languages may need to work off the display text. [FIXME: It's still unclear to me how embedded bracketed
links are handled in the existing code.] In general, embedded links need to be right-resolved (see above), but when
this happens is unclear to me [FIXME]. Some languages have a chop-up-and-paste-together scheme that sends parts of the
text through the transliterate mechanism, while others (those listed with "cont" in {{cd|substitution}} in
[[Module:languages/data]]) receive the full input text, but preprocessed in certain ways. (The wisdom of this is
still unclear to me.)
# The ''transliterated text'' (or ''transliteration'') is the result of transliterating the source translit text. Unlike
for all the other text variants except the transcribed text, it is always in the Latin script.
# The ''transcribed text'' (or ''transcription'') is the result of transcribing the source translit text, where
"transcription" here means a close approximation to the phonetic form of the language in languages (e.g. Akkadian,
Sumerian, Ancient Egyptian, maybe Tibetan) that have a wide difference between the written letters and spoken form.
Unlike for all the other text variants other than the transliterated text, it is always in the Latin script.
Currently, the transcribed text is always supplied manually by the user; there is no such thing as a
{{cd|transcribe()}} method on language objects.
# The ''sort key'' is the text used in sort keys for determining the placing of pages in categories they belong to. The
sort key is generated from the pagename or a specified ''sort base'' by lowercasing, doing language-specific
transformations and then uppercasing the result. If the sort base is supplied and is generated from input text, it
needs to be converted to display text, have embedded links removed through right-resolution and have
diacritic-stripping applied.
# There are other text variants that occur in usexes (specifically, there are normalized variants of several of the
above text variants), but we can skip them for now.
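Right-resolution and left-resolution, as described in the list above, differ only in which side of a two-part link they keep. The sketch below uses plain Lua patterns to show the idea; the functions `right_resolve` and `left_resolve` are hypothetical, and the real {{cd|remove_links()}} in [[Module:links]] handles many more cases than this.

```lua
-- Keep the displayed (right-hand) part of each embedded link.
local function right_resolve(text)
	-- Two-part links: keep the text after the pipe.
	text = text:gsub("%[%[[^%[%]|]*|([^%[%]]*)%]%]", "%1")
	-- One-part links: just drop the brackets.
	text = text:gsub("%[%[([^%[%]|]*)%]%]", "%1")
	return text
end

-- Keep the linked-to (left-hand) part of each embedded link.
local function left_resolve(text)
	-- Two-part links: keep the text before the pipe.
	text = text:gsub("%[%[([^%[%]|]*)|[^%[%]]*%]%]", "%1")
	-- One-part links: just drop the brackets.
	text = text:gsub("%[%[([^%[%]|]*)%]%]", "%1")
	return text
end
```

Given a two-part link whose target is `dog` and whose label is `dogs`, right-resolution yields `dogs` (what the user sees) and left-resolution yields `dog` (the page being linked to).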
The following methods exist on {Language} objects to convert between different text variants:
# {correctInputText} (currently called {makeDisplayText}): This converts input text to corrected input text.
# {stripDiacritics}: This converts to stripped display text. [FIXME: This needs some rethinking. In particular,
{stripDiacritics} is sometimes called on input text, corrected input text or display text (in various paths inside of
[[Module:links]], and, in the case of input text, usually from other modules). We need to make sure we don't try to
convert input text to display text twice, but at the same time we need to support calling it directly on input text
since so many modules do this. This means we need to add a parameter indicating whether the passed-in text is input,
corrected input, or display text; if the former two, we call {correctInputText} ourselves.]
# {logicalToPhysical}: This converts logical pagenames to physical pagenames.
# {transliterate}: This appears to convert input text with embedded brackets removed into a transliteration.
[FIXME: This needs some rethinking. In particular, it calls {processDisplayText} on its input, which won't work
for Thai and Khmer, so we may need language-specific flags indicating whether to pass the input text directly to the
language transliterate method. In addition, I'm not sure how embedded links are handled in the existing translit code;
a lot of callers remove the links themselves before calling {transliterate()}, which I assume is wrong.]
# {makeSortKey}: This converts display text (?) to a sort key. [FIXME: Clarify this.]
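As a rough illustration of the sort-key steps described under the text variants (lowercase, apply language-specific substitutions, uppercase), here is a simplified sketch. The function `make_sort_key` and the table `sort_key_data` are hypothetical. The sketch uses byte-based `string.lower`/`string.upper`, so it is only meaningful for ASCII input; it treats each `from` entry as a Lua pattern; it assumes a missing `to` entry means the match is deleted; and it uses `~` as a stand-in for the sort markers found in the data modules.

```lua
-- Hypothetical substitution data in the from/to shape used by the data modules.
local sort_key_data = {
	from = {"ng'"},
	to = {"ng~"}, -- "~" is a stand-in marker chosen so that it sorts after letters
}

local function make_sort_key(text, data)
	text = text:lower()
	for i, from in ipairs(data.from) do
		-- Assumption: a missing `to` entry means the match is deleted.
		text = text:gsub(from, data.to[i] or "")
	end
	-- Parentheses make sure only a single value is returned.
	return (text:upper())
end
```

With this data, `make_sort_key("Ng'ombe", sort_key_data)` gives `NG~OMBE`, while a word without the apostrophe, such as `ngoma`, is only uppercased.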
]==]
local export = {}
local debug_track_module = "Module:debug/track"
local etymology_languages_data_module = "Module:etymology languages/data"
local families_module = "Module:families"
local headword_page_module = "Module:headword/page"
local json_module = "Module:JSON"
local language_like_module = "Module:language-like"
local languages_data_module = "Module:languages/data"
local languages_data_patterns_module = "Module:languages/data/patterns"
local links_data_module = "Module:links/data"
local load_module = "Module:load"
local scripts_module = "Module:scripts"
local scripts_data_module = "Module:scripts/data"
local string_encode_entities_module = "Module:string/encode entities"
local string_pattern_escape_module = "Module:string/patternEscape"
local string_replacement_escape_module = "Module:string/replacementEscape"
local string_utilities_module = "Module:string utilities"
local table_module = "Module:table"
local utilities_module = "Module:utilities"
local wikimedia_languages_module = "Module:wikimedia languages"
local mw = mw
local string = string
local table = table
local char = string.char
local concat = table.concat
local find = string.find
local floor = math.floor
local get_by_code -- Defined below.
local get_data_module_name -- Defined below.
local get_extra_data_module_name -- Defined below.
local getmetatable = getmetatable
local gmatch = string.gmatch
local gsub = string.gsub
local insert = table.insert
local ipairs = ipairs
local is_known_language_tag = mw.language.isKnownLanguageTag
local make_object -- Defined below.
local match = string.match
local next = next
local pairs = pairs
local remove = table.remove
local require = require
local select = select
local setmetatable = setmetatable
local sub = string.sub
local type = type
local unstrip = mw.text.unstrip
-- Loaded as needed by findBestScript.
local Hans_chars
local Hant_chars
local function check_object(...)
check_object = require(utilities_module).check_object
return check_object(...)
end
local function debug_track(...)
debug_track = require(debug_track_module)
return debug_track(...)
end
local function decode_entities(...)
decode_entities = require(string_utilities_module).decode_entities
return decode_entities(...)
end
local function decode_uri(...)
decode_uri = require(string_utilities_module).decode_uri
return decode_uri(...)
end
local function deep_copy(...)
deep_copy = require(table_module).deepCopy
return deep_copy(...)
end
local function encode_entities(...)
encode_entities = require(string_encode_entities_module)
return encode_entities(...)
end
local function get_L2_sort_key(...)
get_L2_sort_key = require(headword_page_module).get_L2_sort_key
return get_L2_sort_key(...)
end
local function get_script(...)
get_script = require(scripts_module).getByCode
return get_script(...)
end
local function find_best_script_without_lang(...)
find_best_script_without_lang = require(scripts_module).findBestScriptWithoutLang
return find_best_script_without_lang(...)
end
local function get_family(...)
get_family = require(families_module).getByCode
return get_family(...)
end
local function get_plaintext(...)
get_plaintext = require(utilities_module).get_plaintext
return get_plaintext(...)
end
local function get_wikimedia_lang(...)
get_wikimedia_lang = require(wikimedia_languages_module).getByCode
return get_wikimedia_lang(...)
end
local function keys_to_list(...)
keys_to_list = require(table_module).keysToList
return keys_to_list(...)
end
local function list_to_set(...)
list_to_set = require(table_module).listToSet
return list_to_set(...)
end
local function load_data(...)
load_data = require(load_module).load_data
return load_data(...)
end
local function make_family_object(...)
make_family_object = require(families_module).makeObject
return make_family_object(...)
end
local function pattern_escape(...)
pattern_escape = require(string_pattern_escape_module)
return pattern_escape(...)
end
local function replacement_escape(...)
replacement_escape = require(string_replacement_escape_module)
return replacement_escape(...)
end
local function safe_require(...)
safe_require = require(load_module).safe_require
return safe_require(...)
end
local function shallow_copy(...)
shallow_copy = require(table_module).shallowCopy
return shallow_copy(...)
end
local function split(...)
split = require(string_utilities_module).split
return split(...)
end
local function to_json(...)
to_json = require(json_module).toJSON
return to_json(...)
end
local function u(...)
u = require(string_utilities_module).char
return u(...)
end
local function ugsub(...)
ugsub = require(string_utilities_module).gsub
return ugsub(...)
end
local function ulen(...)
ulen = require(string_utilities_module).len
return ulen(...)
end
local function ulower(...)
ulower = require(string_utilities_module).lower
return ulower(...)
end
local function umatch(...)
umatch = require(string_utilities_module).match
return umatch(...)
end
local function uupper(...)
uupper = require(string_utilities_module).upper
return uupper(...)
end
local function track(page)
debug_track("languages/" .. page)
return true
end
local function normalize_code(code)
return load_data(languages_data_module).aliases[code] or code
end
local function check_inputs(self, check, default, ...)
local n = select("#", ...)
if n == 0 then
return false
end
local ret = check(self, (...))
if ret ~= nil then
return ret
elseif n > 1 then
local inputs = {...}
for i = 2, n do
ret = check(self, inputs[i])
if ret ~= nil then
return ret
end
end
end
return default
end
local function make_link(self, target, display)
local prefix, main
if self:getFamilyCode() == "qfa-sub" then
prefix, main = display:match("^(the )(.*)")
if not prefix then
prefix, main = display:match("^(a )(.*)")
end
end
return (prefix or "") .. "[[" .. target .. "|" .. (main or display) .. "]]"
end
-- Convert risky characters to HTML entities, which minimizes interference once returned (e.g. for "sms:a", "<!-- -->" etc.).
local function escape_risky_characters(text)
-- Spacing characters in isolation generally need to be escaped in order to be properly processed by the MediaWiki software.
if umatch(text, "^%s*$") then
return encode_entities(text, text)
end
return encode_entities(text, "!#%&*+/:;<=>?@[\\]_{|}")
end
-- Temporarily convert various formatting characters to PUA to prevent them from being disrupted by the substitution process.
local function doTempSubstitutions(text, subbedChars, keepCarets, noTrim)
-- Clone so that we don't insert any extra patterns into the table in package.loaded. For some reason, using require seems to keep memory use down; probably because the table is always cloned.
local patterns = shallow_copy(require(languages_data_patterns_module))
if keepCarets then
insert(patterns, "((\\+)%^)")
insert(patterns, "((%^))")
end
-- Ensure any whitespace at the beginning and end is temp substituted, to prevent it from being accidentally trimmed. We only want to trim any final spaces added during the substitution process (e.g. by a module), which means we only do this during the first round of temp substitutions.
if not noTrim then
insert(patterns, "^([\128-\191\244]*(%s+))")
insert(patterns, "((%s+)[\128-\191\244]*)$")
end
-- Pre-substitute "[[" and "]]" with control characters, which makes pattern matching more accurate.
text = gsub(text, "%f[%[]%[%[", "\1"):gsub("%f[%]]%]%]", "\2")
local i = #subbedChars
for _, pattern in ipairs(patterns) do
-- Patterns ending in \0 are for things like "[[" or "]]", so the inserted PUA are treated as breaks between terms by modules that scrape info from pages.
local term_divider
pattern = gsub(pattern, "%z$", function(divider)
term_divider = divider == "\0"
return ""
end)
text = gsub(text, pattern, function(...)
local m = {...}
local m1New = m[1]
for k = 2, #m do
local n = i + k - 1
subbedChars[n] = m[k]
local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
local byte3 = floor(n / 64) % 64 + 128
local byte4 = n % 64 + 128
m1New = gsub(m1New, pattern_escape(m[k]), "\244" .. char(byte2) .. char(byte3) .. char(byte4), 1)
end
i = i + #m - 1
return m1New
end)
end
text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
return text, subbedChars
end
-- Reinsert any formatting that was temporarily substituted.
local function undoTempSubstitutions(text, subbedChars)
for i = 1, #subbedChars do
local byte2 = floor(i / 4096) % 64 + 128
local byte3 = floor(i / 64) % 64 + 128
local byte4 = i % 64 + 128
text = gsub(text, "\244[" .. char(byte2) .. char(byte2+8) .. "]" .. char(byte3) .. char(byte4),
replacement_escape(subbedChars[i]))
end
text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
return text
end
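-- Note on the placeholders used by doTempSubstitutions() and undoTempSubstitutions() above. The sketch below is
-- illustrative only and is not called anywhere; `placeholder` is a hypothetical helper name that simply repeats the
-- byte arithmetic from doTempSubstitutions(). Entry `n` of `subbedChars` is stood in for by a four-byte UTF-8
-- sequence starting with \244, i.e. a codepoint in plane 16 (private use). The second byte is offset by 128 for
-- term dividers and by 136 otherwise, which is why undoTempSubstitutions() accepts either byte2 or byte2 + 8.
--   local floor, char = math.floor, string.char
--   local function placeholder(n, term_divider)
--       local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
--       local byte3 = floor(n / 64) % 64 + 128
--       local byte4 = n % 64 + 128
--       return "\244" .. char(byte2) .. char(byte3) .. char(byte4)
--   end
--   assert(placeholder(1, false) == "\244\136\128\129") -- U+108001
--   assert(placeholder(1, true) == "\244\128\128\129") -- U+100001
--   assert(placeholder(65, false) == "\244\136\129\129") -- U+108041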
-- Check if the raw text is an unsupported title, and if so return that. Otherwise, remove HTML entities. We do the pre-conversion to avoid loading the unsupported title list unnecessarily.
local function checkNoEntities(self, text)
local textNoEnc = decode_entities(text)
if textNoEnc ~= text and load_data(links_data_module).unsupported_titles[text] then
return text
else
return textNoEnc
end
end
-- If no script object is provided (or if it's invalid or None), get one.
local function checkScript(text, self, sc)
if not check_object("script", true, sc) or sc:getCode() == "None" then
return self:findBestScript(text)
end
return sc
end
local function normalize(text, sc)
text = sc:fixDiscouragedSequences(text)
return sc:toFixedNFD(text)
end
-- Subfunction of iterateSectionSubstitutions(). Process an individual chunk of text according to the specifications in
-- `substitution_data`. The input parameters are all as in the documentation of iterateSectionSubstitutions() except for
-- `recursed`, which is set to true if we called ourselves recursively to process a script-specific setting or
-- script-wide fallback. Returns two values: the processed text and the actual substitution data used to do the
-- substitutions (same as the `actual_substitution_data` return value to iterateSectionSubstitutions()).
local function doSubstitutions(self, text, sc, substitution_data, data_field, function_name, recursed)
-- BE CAREFUL in this function because the value at any level can be `false`, which causes no processing to be done
-- and blocks any further fallback processing.
local actual_substitution_data = substitution_data
-- If there are language-specific substitutes given in the data module, use those.
if type(substitution_data) == "table" then
-- If a script is specified, run this function with the script-specific data before continuing.
local sc_code = sc:getCode()
local has_substitution_data = false
if substitution_data[sc_code] ~= nil then
has_substitution_data = true
if substitution_data[sc_code] then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[sc_code], data_field,
function_name, true)
end
-- Hant, Hans and Hani are usually treated the same, so add a special case to avoid having to specify each one
-- separately.
elseif sc_code:match("^Han") and substitution_data.Hani ~= nil then
has_substitution_data = true
if substitution_data.Hani then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data.Hani, data_field,
function_name, true)
end
-- Substitution data with key 1 in the outer table may be given as a fallback.
elseif substitution_data[1] ~= nil then
has_substitution_data = true
if substitution_data[1] then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[1], data_field,
function_name, true)
end
end
-- Iterate over all strings in the "from" subtable, and gsub with the corresponding string in "to". We work with
-- the NFD decomposed forms, as this simplifies many substitutions.
if substitution_data.from then
has_substitution_data = true
for i, from in ipairs(substitution_data.from) do
-- Normalize each loop, to ensure multi-stage substitutions work correctly.
text = sc:toFixedNFD(text)
text = ugsub(text, sc:toFixedNFD(from), substitution_data.to[i] or "")
end
end
if substitution_data.remove_diacritics then
has_substitution_data = true
text = sc:toFixedNFD(text)
-- Convert exceptions to PUA.
local remove_exceptions, substitutes = substitution_data.remove_exceptions
if remove_exceptions then
substitutes = {}
local i = 0
for _, exception in ipairs(remove_exceptions) do
exception = sc:toFixedNFD(exception)
text = ugsub(text, exception, function(m)
i = i + 1
local subst = u(0x80000 + i)
substitutes[subst] = m
return subst
end)
end
end
-- Strip diacritics.
text = ugsub(text, "[" .. substitution_data.remove_diacritics .. "]", "")
-- Convert exceptions back.
if remove_exceptions then
text = text:gsub("\242[\128-\191]*", substitutes)
end
end
if not has_substitution_data and sc._data[data_field] then
-- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.).
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field,
function_name, true)
end
elseif type(substitution_data) == "string" then
-- If there is a dedicated function module, use that.
local module = safe_require("Module:" .. substitution_data)
if module then
-- TODO: translit functions should take objects, not codes.
-- TODO: translit functions should be called with form NFD.
if function_name == "tr" then
if not module[function_name] then
error(("Internal error: Module [[%s]] has no function named 'tr'"):format(substitution_data))
end
text = module[function_name](text, self._code, sc:getCode())
elseif function_name == "stripDiacritics" then
-- FIXME, get rid of this arm after renaming makeEntryName -> stripDiacritics.
if module[function_name] then
text = module[function_name](sc:toFixedNFD(text), self, sc)
elseif module.makeEntryName then
text = module.makeEntryName(sc:toFixedNFD(text), self, sc)
else
error(("Internal error: Module [[%s]] has no function named 'stripDiacritics' or 'makeEntryName'"
):format(substitution_data))
end
else
if not module[function_name] then
error(("Internal error: Module [[%s]] has no function named '%s'"):format(
substitution_data, function_name))
end
text = module[function_name](sc:toFixedNFD(text), self, sc)
end
else
error("Substitution data '" .. substitution_data .. "' does not match an existing module.")
end
elseif substitution_data == nil and sc._data[data_field] then
-- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.).
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field,
function_name, true)
end
-- Don't normalize to NFC if this is the inner loop or if a module returned nil.
if recursed or not text then
return text, actual_substitution_data
end
-- Fix any discouraged sequences created during the substitution process, and normalize into the final form.
return sc:toFixedNFC(sc:fixDiscouragedSequences(text)), actual_substitution_data
end
-- Split the text into sections, based on the presence of temporarily substituted formatting characters, then iterate
-- over each section to apply substitutions (e.g. transliteration or diacritic stripping). This avoids putting PUA
-- characters through language-specific modules, which may be unequipped for them. This function is passed the following
-- values:
-- * `self` (the Language object);
-- * `text` (the text to process);
-- * `sc` (the script of the text, which must be specified; callers should call checkScript() as needed to autodetect the
-- script of the text if not given explicitly by the user);
-- * `subbedChars` (an array of the same length as the text, indicating which characters have been substituted and by
-- what, or {nil} if no substitutions are to happen);
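-- * `keepCarets` (if set, carets and any backslashes escaping them are also temporarily substituted by
-- doTempSubstitutions(), so that they are preserved through the substitution process);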
-- * `substitution_data` (the data indicating which substitutions to apply, taken directly from `data_field` in the
-- language's data structure in a submodule of [[Module:languages/data]]);
-- * `data_field` (the data field from which `substitution_data` was fetched, such as "sort_key" or "strip_diacritics");
-- * `function_name` (the name of the function to call to do the substitution, in case `substitution_data` specifies a
-- module to do the substitution);
-- * `notrim` (don't trim whitespace at the edges of `text`; set when computing the sort key, because whitespace at the
-- beginning of a sort key is significant and causes the resulting page to be sorted at the beginning of the category
-- it's in).
-- Returns three values:
-- (1) the processed text;
-- (2) the value of `subbedChars` that was passed in, possibly modified with additional character substitutions; will be
-- {nil} if {nil} was passed in;
-- (3) the actual substitution data that was used to apply substitutions to `text`; this may be different from the value
-- of `substitution_data` passed in if that value recursively specified script-specific substitutions or if no
-- substitution data could be found in the language-specific data (e.g. {nil} was passed in or a structure was passed
-- in that had no setting for the script given in `sc`), but a script-wide fallback value was set; currently it is
-- only used by makeSortKey().
local function iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, substitution_data, data_field,
function_name, notrim)
local sections
-- See [[Module:languages/data]].
if not find(text, "\244") or load_data(languages_data_module).substitution[self._code] == "cont" then
sections = {text}
else
sections = split(text, "\244[\128-\143][\128-\191]*", true)
end
local actual_substitution_data
for _, section in ipairs(sections) do
-- Don't bother processing empty strings or whitespace (which may also not be handled well by dedicated
-- modules).
if gsub(section, "%s+", "") ~= "" then
local sub, this_actual_substitution_data = doSubstitutions(self, section, sc, substitution_data, data_field,
function_name)
actual_substitution_data = this_actual_substitution_data
-- Second round of temporary substitutions, in case any formatting was added by the main substitution
-- process. However, don't do this if the section contains formatting already (as it would have had to have
-- been escaped to reach this stage, and therefore should be given as raw text).
if sub and subbedChars then
local noSub
for _, pattern in ipairs(require(languages_data_patterns_module)) do
if match(section, pattern .. "%z?") then
noSub = true
end
end
if not noSub then
sub, subbedChars = doTempSubstitutions(sub, subbedChars, keepCarets, true)
end
end
if not sub then
text = sub
break
end
text = sub and gsub(text, pattern_escape(section), replacement_escape(sub), 1) or text
end
end
if not notrim then
-- Trim, unless there are only spacing characters, while ignoring any final formatting characters.
-- Do not trim sort keys because spaces at the beginning are significant.
text = text and text:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2"):gsub("(%S)%s+([\128-\191\244]*)$", "%1%2") or
nil
end
return text, subbedChars, actual_substitution_data
end
-- Process carets (and any escapes). Default to simple removal, if no pattern/replacement is given.
local function processCarets(text, pattern, repl)
local rep
repeat
text, rep = gsub(text, "\\\\(\\*^)", "\3%1")
until rep == 0
return (text:gsub("\\^", "\4")
:gsub(pattern or "%^", repl or "")
:gsub("\3", "\\")
:gsub("\4", "^"))
end
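-- Worked examples for processCarets() above, traced by hand from its gsub chain and given as a sketch rather than
-- as a tested specification. The arguments are Lua string literals, so "\\" is a single backslash. With no
-- `pattern` or `repl`, unescaped carets are removed, an escaped caret loses its escape, and an escaped backslash
-- collapses to one backslash while the caret after it is treated as unescaped.
--   processCarets("^abc")    --> "abc"
--   processCarets("a\\^b")   --> "a^b"
--   processCarets("\\\\^a")  --> "\\a"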
-- Remove carets if they are used to capitalize parts of transliterations (unless they have been escaped).
local function removeCarets(text, sc)
if not sc:hasCapitalization() and sc:isTransliterated() and text:find("^", 1, true) then
return processCarets(text)
else
return text
end
end
local Language = {}
--[==[Returns the language code of the language. Example: {{code|lua|"fr"}} for French.]==]
function Language:getCode()
return self._code
end
--[==[Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: {{code|lua|"French"}} for French.]==]
function Language:getCanonicalName()
local name = self._name
if name == nil then
name = self._data[1]
self._name = name
end
return name
end
--[==[
Return the display form of the language. The display form of a language, family or script is the form it takes when
appearing as the <code><var>source</var></code> in categories such as <code>English terms derived from
<var>source</var></code> or <code>English given names from <var>source</var></code>, and is also the displayed text
in {makeCategoryLink()} links. For full and etymology-only languages, this is the same as the canonical name, but
for families, it reads <code>"<var>name</var> languages"</code> (e.g. {"Indo-Iranian languages"}), and for scripts,
it reads <code>"<var>name</var> script"</code> (e.g. {"Arabic script"}).
]==]
function Language:getDisplayForm()
local form = self._displayForm
if form == nil then
form = self:getCanonicalName()
-- Add article and " substrate" to substrates that lack them.
if self:getFamilyCode() == "qfa-sub" then
if not (sub(form, 1, 4) == "the " or sub(form, 1, 2) == "a ") then
form = "a " .. form
end
if not match(form, " [Ss]ubstrate") then
form = form .. " substrate"
end
end
self._displayForm = form
end
return form
end
--[==[Returns the value which should be used in the HTML lang= attribute for tagged text in the language.]==]
function Language:getHTMLAttribute(sc, region)
local code = self._code
if not find(code, "-", 1, true) then
return code .. "-" .. sc:getCode() .. (region and "-" .. region or "")
end
local parent = self:getParent()
region = region or match(code, "%f[%u][%u-]+%f[%U]")
if parent then
return parent:getHTMLAttribute(sc, region)
end
-- TODO: ISO family codes can also be used.
return "mis-" .. sc:getCode() .. (region and "-" .. region or "")
end
--[==[Returns a table of the aliases that the language is known by, excluding the canonical name. Aliases are synonyms for the language in question. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. Example: {{code|lua|{"High German", "New High German", "Deutsch"} }} for [[:Category:German language|German]].]==]
function Language:getAliases()
self:loadInExtraData()
return require(language_like_module).getAliases(self)
end
--[==[
Return a table of the known subvarieties of a given language, excluding subvarieties that have been given
explicit etymology-only language codes. The names are not guaranteed to be unique, in that sometimes a given name
refers to a subvariety of more than one language. Example: {{code|lua|{"Southern Aymara", "Central Aymara"} }} for
[[:Category:Aymara language|Aymara]]. Note that the returned value can have nested tables in it, when a subvariety
goes by more than one name. Example: {{code|lua|{"North Azerbaijani", "South Azerbaijani", {"Afshar", "Afshari",
"Afshar Azerbaijani", "Afchar"}, {"Qashqa'i", "Qashqai", "Kashkay"}, "Sonqor"} }} for
[[:Category:Azerbaijani language|Azerbaijani]]. Here, for example, Afshar, Afshari, Afshar Azerbaijani and Afchar
all refer to the same subvariety, whose preferred name is Afshar (the one listed first). To avoid a return value
with nested tables in it, specify a non-{{code|lua|nil}} value for the <code>flatten</code> parameter; in that case,
the return value would be {{code|lua|{"North Azerbaijani", "South Azerbaijani", "Afshar", "Afshari",
"Afshar Azerbaijani", "Afchar", "Qashqa'i", "Qashqai", "Kashkay", "Sonqor"} }}.
]==]
function Language:getVarieties(flatten)
self:loadInExtraData()
return require(language_like_module).getVarieties(self, flatten)
end
--[==[Returns a table of the "other names" that the language is known by, which are listed in the <code>otherNames</code> field. It should be noted that the <code>otherNames</code> field itself is deprecated, and entries listed there should eventually be moved to either <code>aliases</code> or <code>varieties</code>.]==]
function Language:getOtherNames() -- To be eventually removed, once there are no more uses of the `otherNames` field.
self:loadInExtraData()
return require(language_like_module).getOtherNames(self)
end
--[==[
Return a combined table of the canonical name, aliases, varieties and other names of a given language.]==]
function Language:getAllNames()
self:loadInExtraData()
return require(language_like_module).getAllNames(self)
end
--[==[Returns a table of types as a lookup table (with the types as keys).
The possible types are
* {language}: This is a language, either full or etymology-only.
* {full}: This is a "full" (not etymology-only) language, i.e. the union of {regular}, {reconstructed} and
{appendix-constructed}. Note that the types {full} and {etymology-only} also exist for families, so if you
want to check specifically for a full language and you have an object that might be a family, you should
use {{lua|hasType("language", "full")}} and not simply {{lua|hasType("full")}}.
* {etymology-only}: This is an etymology-only (not full) language, whose parent is another etymology-only
language or a full language. Note that the types {full} and {etymology-only} also exist for
families, so if you want to check specifically for an etymology-only language and you have an
object that might be a family, you should use {{lua|hasType("language", "etymology-only")}}
and not simply {{lua|hasType("etymology-only")}}.
* {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted
in the main namespace. There may also be reconstructed terms for the language, which are placed in
the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full
languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto
and Volapük, among others) are also allowed in the mainspace and considered regular languages.
* {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the
{Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed
with *. Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for
constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore
be in the Appendix namespace, but they are not reconstructed and therefore should
not have * prefixed in links.
]==]
function Language:getTypes()
local types = self._types
if types == nil then
types = {language = true}
if self:getFullCode() == self._code then
types.full = true
else
types["etymology-only"] = true
end
for t in gmatch(self._data.type, "[^,]+") do
types[t] = true
end
self._types = types
end
return types
end
--[==[Given a list of types as strings, returns true if the language has all of them.]==]
function Language:hasType(...)
Language.hasType = require(language_like_module).hasType
return self:hasType(...)
end
--[==[Returns a table containing <code>WikimediaLanguage</code> objects (see [[Module:wikimedia languages]]), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code <code>sh</code> (Serbo-Croatian) maps to four Wikimedia codes: <code>sh</code> (Serbo-Croatian), <code>bs</code> (Bosnian), <code>hr</code> (Croatian) and <code>sr</code> (Serbian).
The code for the Wikimedia language is retrieved from the <code>wikimedia_codes</code> property in the data modules. If that property is not present, the code of the current language is used. If none of the available codes is actually a valid Wikimedia code, an empty table is returned.]==]
function Language:getWikimediaLanguages()
local wm_langs = self._wikimediaLanguageObjects
if wm_langs == nil then
local codes = self:getWikimediaLanguageCodes()
wm_langs = {}
for i = 1, #codes do
wm_langs[i] = get_wikimedia_lang(codes[i])
end
self._wikimediaLanguageObjects = wm_langs
end
return wm_langs
end
--[==[Returns a table of the Wikimedia language codes for the language. The codes are taken from the <code>wikimedia_codes</code> property in the data modules. If that property is not present, the code of the current language is used if it is a known language tag; otherwise, the codes are inherited from the parent language, or an empty table is returned if there is no parent.]==]
function Language:getWikimediaLanguageCodes()
local wm_langs = self._wikimediaLanguageCodes
if wm_langs == nil then
wm_langs = self._data.wikimedia_codes
if wm_langs then
wm_langs = split(wm_langs, ",", true, true)
else
local code = self._code
if is_known_language_tag(code) then
wm_langs = {code}
else
-- Inherit, but only if no codes are specified in the data *and*
-- the language code isn't a valid Wikimedia language code.
local parent = self:getParent()
wm_langs = parent and parent:getWikimediaLanguageCodes() or {}
end
end
self._wikimediaLanguageCodes = wm_langs
end
return wm_langs
end
--[==[
Returns the name of the Wikipedia article for the language. `project` specifies the language and project to retrieve
the article from, defaulting to {"enwiki"} for the English Wikipedia. Normally if specified it should be the project
code for a specific-language Wikipedia e.g. "zhwiki" for the Chinese Wikipedia, but it can be any project, including
non-Wikipedia ones. If the project is the English Wikipedia and the property {wikipedia_article} is present in the data
module it will be used first. In all other cases, a sitelink will be generated from {:getWikidataItem} (if set). The
resulting value (or lack of value) is cached so that subsequent calls are fast. If no value could be determined, and
`noCategoryFallback` is {false}, {:getCategoryName} is used as fallback; otherwise, {nil} is returned. Note that if
`noCategoryFallback` is {nil} or omitted, it defaults to {false} if the project is the English Wikipedia, otherwise
to {true}. In other words, under normal circumstances, if the English Wikipedia article couldn't be retrieved, the
return value will fall back to a link to the language's category, but this won't normally happen for any other project.
]==]
function Language:getWikipediaArticle(noCategoryFallback, project)
Language.getWikipediaArticle = require(language_like_module).getWikipediaArticle
return self:getWikipediaArticle(noCategoryFallback, project)
end
--[==[Returns a wikilink to the Wikipedia article for the language (see {:getWikipediaArticle}), displayed as the canonical name of the language.]==]
function Language:makeWikipediaLink()
return make_link(self, "w:" .. self:getWikipediaArticle(), self:getCanonicalName())
end
--[==[Returns the name of the Wikimedia Commons category page for the language.]==]
function Language:getCommonsCategory()
Language.getCommonsCategory = require(language_like_module).getCommonsCategory
return self:getCommonsCategory()
end
--[==[Returns the Wikidata item id for the language or <code>nil</code>. This corresponds to the second field in the data modules.]==]
function Language:getWikidataItem()
Language.getWikidataItem = require(language_like_module).getWikidataItem
return self:getWikidataItem()
end
--[==[Returns a table of <code>Script</code> objects for all scripts that the language is written in. See [[Module:scripts]].]==]
function Language:getScripts()
local scripts = self._scriptObjects
if scripts == nil then
local codes = self:getScriptCodes()
if codes[1] == "All" then
scripts = load_data(scripts_data_module)
else
scripts = {}
for i = 1, #codes do
scripts[i] = get_script(codes[i])
end
end
self._scriptObjects = scripts
end
return scripts
end
--[==[Returns the table of script codes in the language's data file.]==]
function Language:getScriptCodes()
local scripts = self._scriptCodes
if scripts == nil then
scripts = self._data[4]
if scripts then
local codes, n = {}, 0
for code in gmatch(scripts, "[^,]+") do
n = n + 1
-- Special handling of "Hants", which represents "Hani", "Hant" and "Hans" collectively.
if code == "Hants" then
codes[n] = "Hani"
codes[n + 1] = "Hant"
codes[n + 2] = "Hans"
n = n + 2
else
codes[n] = code
end
end
scripts = codes
else
scripts = {"None"}
end
self._scriptCodes = scripts
end
return scripts
end
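-- Sketch of the "Hants" expansion performed by getScriptCodes() above. The data value "Hants,Latn" is a
-- hypothetical example, not taken from any particular language's data. The loop mirrors the one in the method and
-- can be run on its own:
--   local codes, n = {}, 0
--   for code in ("Hants,Latn"):gmatch("[^,]+") do
--       n = n + 1
--       if code == "Hants" then
--           codes[n], codes[n + 1], codes[n + 2] = "Hani", "Hant", "Hans"
--           n = n + 2
--       else
--           codes[n] = code
--       end
--   end
--   assert(table.concat(codes, ",") == "Hani,Hant,Hans,Latn")
-- By contrast, findBestScript() below keeps "Hants" as a single code and resolves it against the text.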
--[==[Given some text, this function iterates through the scripts of a given language and tries to find the script that best matches the text. It returns a {{code|lua|Script}} object representing the script. If no match is found at all, it returns the {{code|lua|None}} script object.]==]
function Language:findBestScript(text, forceDetect)
if not text or text == "" or text == "-" then
return get_script("None")
end
-- Differs from table returned by getScriptCodes, as Hants is not normalized into its constituents.
local codes = self._bestScriptCodes
if codes == nil then
codes = self._data[4]
codes = codes and split(codes, ",", true, true) or {"None"}
self._bestScriptCodes = codes
end
local first_sc = codes[1]
if first_sc == "All" then
return find_best_script_without_lang(text)
end
local codes_len = #codes
if not (forceDetect or first_sc == "Hants" or codes_len > 1) then
first_sc = get_script(first_sc)
local charset = first_sc.characters
return charset and umatch(text, "[" .. charset .. "]") and first_sc or get_script("None")
end
-- Remove all formatting characters.
text = get_plaintext(text)
-- Remove all spaces and any ASCII punctuation. Some non-ASCII punctuation is script-specific, so can't be removed.
text = ugsub(text, "[%s!\"#%%&'()*,%-./:;?@[\\%]_{}]+", "")
if #text == 0 then
return get_script("None")
end
-- Try to match every script against the text,
-- and return the one with the most matching characters.
local bestcount, bestscript, length = 0
for i = 1, codes_len do
local sc = codes[i]
-- Special case for "Hants", which is a special code that represents whichever of "Hant" or "Hans" best matches, or "Hani" if they match equally. This avoids having to list all three. In addition, "Hants" will be treated as the best match if there is at least one matching character, under the assumption that a Han script is desirable in terms that contain a mix of Han and other scripts (not counting those which use Jpan or Kore).
if sc == "Hants" then
local Hani = get_script("Hani")
if not Hant_chars then
Hant_chars = load_data("Module:zh/data/ts")
Hans_chars = load_data("Module:zh/data/st")
end
local t, s, found = 0, 0
-- This is faster than using mw.ustring.gmatch directly.
for ch in gmatch((ugsub(text, "[" .. Hani.characters .. "]", "\255%0")), "\255(.[\128-\191]*)") do
found = true
if Hant_chars[ch] then
t = t + 1
if Hans_chars[ch] then
s = s + 1
end
elseif Hans_chars[ch] then
s = s + 1
else
t, s = t + 1, s + 1
end
end
if found then
if t == s then
return Hani
end
return get_script(t > s and "Hant" or "Hans")
end
else
sc = get_script(sc)
if not length then
length = ulen(text)
end
-- Count characters by removing everything in the script's charset and comparing to the original length.
local charset = sc.characters
local count = charset and length - ulen((ugsub(text, "[" .. charset .. "]+", ""))) or 0
if count >= length then
return sc
elseif count > bestcount then
bestcount = count
bestscript = sc
end
end
end
-- Return best matching script, or otherwise None.
return bestscript or get_script("None")
end
--[==[Returns a <code>Family</code> object for the language family that the language belongs to. See [[Module:families]].]==]
function Language:getFamily()
local family = self._familyObject
if family == nil then
family = self:getFamilyCode()
-- If the value is nil, it's cached as false.
family = family and get_family(family) or false
self._familyObject = family
end
return family or nil
end
--[==[Returns the family code in the language's data file.]==]
function Language:getFamilyCode()
local family = self._familyCode
if family == nil then
-- If the value is nil, it's cached as false.
family = self._data[3] or false
self._familyCode = family
end
return family or nil
end
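--[==[Returns the canonical name of the family that the language belongs to, or {nil} if it has no family.]==]
function Language:getFamilyName()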
local family = self._familyName
if family == nil then
family = self:getFamily()
-- If the value is nil, it's cached as false.
family = family and family:getCanonicalName() or false
self._familyName = family
end
return family or nil
end
do
local function check_family(self, family)
if type(family) == "table" then
family = family:getCode()
end
if self:getFamilyCode() == family then
return true
end
local self_family = self:getFamily()
if self_family:inFamily(family) then
return true
-- If the family isn't a real family (e.g. creoles) check any ancestors.
elseif self_family:inFamily("qfa-not") then
local ancestors = self:getAncestors()
for _, ancestor in ipairs(ancestors) do
if ancestor:inFamily(family) then
return true
end
end
end
end
--[==[Check whether the language belongs to `family` (which can be a family code or object). Several families can be given; in that case, return true if the language belongs to any of them. Note that some languages (in particular, certain creoles) are not placed in a real family and can have multiple immediate ancestors potentially belonging to different families; for these, return true if any ancestor belongs to one of the specified families.]==]
function Language:inFamily(...)
if self:getFamilyCode() == nil then
return false
end
return check_inputs(self, check_family, false, ...)
end
end
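--[==[Returns the object for the parent named in the language's data (a language or a family), or {nil} if no parent is set.]==]
function Language:getParent()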
local parent = self._parentObject
if parent == nil then
parent = self:getParentCode()
-- If the value is nil, it's cached as false.
parent = parent and get_by_code(parent, nil, true, true) or false
self._parentObject = parent
end
return parent or nil
end
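--[==[Returns the code of the parent named in the language's data, or {nil} if no parent is set.]==]
function Language:getParentCode()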
local parent = self._parentCode
if parent == nil then
-- If the value is nil, it's cached as false.
parent = self._data.parent or false
self._parentCode = parent
end
return parent or nil
end
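--[==[Returns the canonical name of the parent named in the language's data, or {nil} if no parent is set.]==]
function Language:getParentName()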
local parent = self._parentName
if parent == nil then
parent = self:getParent()
-- If the value is nil, it's cached as false.
parent = parent and parent:getCanonicalName() or false
self._parentName = parent
end
return parent or nil
end
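--[==[Returns a list of the language's parents, starting with the immediate parent and ending with the most distant one. The list is empty if the language has no parent.]==]
function Language:getParentChain()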
local chain = self._parentChain
if chain == nil then
chain = {}
local parent, n = self:getParent(), 0
while parent do
n = n + 1
chain[n] = parent
parent = parent:getParent()
end
self._parentChain = chain
end
return chain
end
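-- Usage sketch for the parent accessors above. Assumptions: this module is loaded via
-- require("Module:languages"), it exports getByCode as on the English Wiktionary, and "VL." is an
-- etymology-only code in the local data; substitute a code that exists on this wiki.
--   local lang = require("Module:languages").getByCode("VL.", nil, true)
--   for i, parent in ipairs(lang:getParentChain()) do
--       mw.log(i, parent:getCode(), parent:getCanonicalName())
--   end
--   -- A language with no parent yields an empty table, so the loop body simply never runs.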
do
local function check_lang(self, lang)
for _, parent in ipairs(self:getParentChain()) do
if (type(lang) == "string" and lang or lang:getCode()) == parent:getCode() then
return true
end
end
end
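--[==[Given one or more language objects or codes, check whether any of them appears in this language's parent chain (see `getParentChain`).]==]
function Language:hasParent(...)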
return check_inputs(self, check_lang, false, ...)
end
end
--[==[
If the language is etymology-only, this iterates through parents until a full language or family is found, and the
corresponding object is returned. If the language is a full language, then it simply returns itself.
]==]
function Language:getFull()
local full = self._fullObject
if full == nil then
full = self:getFullCode()
full = full == self._code and self or get_by_code(full)
self._fullObject = full
end
return full
end
--[==[
If the language is an etymology-only language, this iterates through parents until a full language or family is
found, and the corresponding code is returned. If the language is a full language, then it simply returns the
language code.
]==]
function Language:getFullCode()
return self._fullCode or self._code
end
--[==[
If the language is an etymology-only language, this iterates through parents until a full language or family is
found, and the corresponding canonical name is returned. If the language is a full language, then it simply returns
the canonical name of the language.
]==]
function Language:getFullName()
local full = self._fullName
if full == nil then
full = self:getFull():getCanonicalName()
self._fullName = full
end
return full
end
--[==[Returns a table of <code class="nf">Language</code> objects for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==]
function Language:getAncestors()
local ancestors = self._ancestorObjects
if ancestors == nil then
ancestors = {}
local ancestor_codes = self:getAncestorCodes()
if #ancestor_codes > 0 then
for _, ancestor in ipairs(ancestor_codes) do
insert(ancestors, get_by_code(ancestor, nil, true))
end
else
local fam = self:getFamily()
local protoLang = fam and fam:getProtoLanguage() or nil
-- For the cases where the current language is the proto-language
-- of its family, or an etymology-only language that is ancestral to that
-- proto-language, we need to step up a level higher right from the
-- start.
if protoLang and (
protoLang:getCode() == self._code or
(self:hasType("etymology-only") and protoLang:hasAncestor(self))
) then
fam = fam:getFamily()
protoLang = fam and fam:getProtoLanguage() or nil
end
while not protoLang and not (not fam or fam:getCode() == "qfa-not") do
fam = fam:getFamily()
protoLang = fam and fam:getProtoLanguage() or nil
end
insert(ancestors, protoLang)
end
self._ancestorObjects = ancestors
end
return ancestors
end
do
-- Avoid a language being its own ancestor via class inheritance. We only need to check for this if the language has inherited an ancestor table from its parent, because we never want to drop ancestors that have been explicitly set in the data.
-- Recursively iterate over ancestors until we either find self or run out. If self is found, return true.
local function check_ancestor(self, lang)
local codes = lang:getAncestorCodes()
if not codes then
return nil
end
for i = 1, #codes do
local code = codes[i]
if code == self._code then
return true
end
local anc = get_by_code(code, nil, true)
if check_ancestor(self, anc) then
return true
end
end
end
--[==[Returns a table of <code class="nf">Language</code> codes for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==]
function Language:getAncestorCodes()
if self._ancestorCodes then
return self._ancestorCodes
end
local data = self._data
local codes = data.ancestors
if codes == nil then
codes = {}
self._ancestorCodes = codes
return codes
end
codes = split(codes, ",", true, true)
self._ancestorCodes = codes
-- If there are no codes or the ancestors weren't inherited data, there's nothing left to check.
if #codes == 0 or self:getData(false, "raw").ancestors ~= nil then
return codes
end
local i, code = 1
while i <= #codes do
code = codes[i]
if check_ancestor(self, self) then
remove(codes, i)
else
i = i + 1
end
end
return codes
end
end
--[==[Given a list of language objects or codes, returns true if at least one of them is an ancestor. This includes any etymology-only children of that ancestor. If the language's ancestor(s) are etymology-only languages, it will also return true for those language parent(s) (e.g. if Vulgar Latin is the ancestor, it will also return true for its parent, Latin). However, a parent is excluded from this if the ancestor is also ancestral to that parent (e.g. if Classical Persian is the ancestor, Persian would return false, because Classical Persian is also ancestral to Persian).]==]
function Language:hasAncestor(...)
local function iterateOverAncestorTree(node, func, parent_check)
local ancestors = node:getAncestors()
local ancestorsParents = {}
for _, ancestor in ipairs(ancestors) do
-- When checking the parents of the other language, and the ancestor is also a parent, skip to the next ancestor, so that we exclude any etymology-only children of that parent that are not directly related (see below).
local ret = (parent_check or not node:hasParent(ancestor)) and
func(ancestor) or iterateOverAncestorTree(ancestor, func, parent_check)
if ret then
return ret
end
end
-- Check the parents of any ancestors. We don't do this if checking the parents of the other language, so that we exclude any etymology-only children of those parents that are not directly related (e.g. if the ancestor is Vulgar Latin and we are checking New Latin, we want it to return false because they are on different ancestral branches. As such, if we're already checking the parent of New Latin (Latin) we don't want to compare it to the parent of the ancestor (Latin), as this would be a false positive; it should be one or the other).
if not parent_check then
return nil
end
for _, ancestor in ipairs(ancestors) do
local ancestorParents = ancestor:getParentChain()
for _, ancestorParent in ipairs(ancestorParents) do
if ancestorParent:getCode() == self._code or ancestorParent:hasAncestor(ancestor) then
break
else
insert(ancestorsParents, ancestorParent)
end
end
end
for _, ancestorParent in ipairs(ancestorsParents) do
local ret = func(ancestorParent)
if ret then
return ret
end
end
end
local function do_iteration(otherlang, parent_check)
-- otherlang can't be self
if (type(otherlang) == "string" and otherlang or otherlang:getCode()) == self._code then
return false
end
repeat
if iterateOverAncestorTree(
self,
function(ancestor)
return ancestor:getCode() == (type(otherlang) == "string" and otherlang or otherlang:getCode())
end,
parent_check
) then
return true
elseif type(otherlang) == "string" then
otherlang = get_by_code(otherlang, nil, true)
end
otherlang = otherlang:getParent()
parent_check = false
until not otherlang
end
local parent_check = true
for _, otherlang in ipairs{...} do
local ret = do_iteration(otherlang, parent_check)
if ret then
return true
end
end
return false
end
do
local function construct_node(lang, memo)
local branch, ancestors = {lang = lang:getCode()}
memo[lang:getCode()] = branch
for _, ancestor in ipairs(lang:getAncestors()) do
if ancestors == nil then
ancestors = {}
end
insert(ancestors, memo[ancestor:getCode()] or construct_node(ancestor, memo))
end
branch.ancestors = ancestors
return branch
end
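--[==[Returns the ancestry of the language as a nested table. Each node has a `lang` field holding a language code and, if that language has ancestors, an `ancestors` list of nodes of the same form. Nodes are shared when the same ancestor is reached by more than one route.]==]
function Language:getAncestorChain()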
local chain = self._ancestorChain
if chain == nil then
chain = construct_node(self, {})
self._ancestorChain = chain
end
return chain
end
end
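-- Older, linear variant: follows the ancestors for as long as there is exactly one, and returns them as a flat list of
-- objects. NOTE: this shares the `_ancestorChain` cache field with getAncestorChain(), so whichever of the two is called
-- first on an object determines the shape of the cached value.
function Language:getAncestorChainOld()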
local chain = self._ancestorChain
if chain == nil then
chain = {}
local step = self
while true do
local ancestors = step:getAncestors()
step = #ancestors == 1 and ancestors[1] or nil
if not step then
break
end
insert(chain, step)
end
self._ancestorChain = chain
end
return chain
end
local function fetch_descendants(self, fmt)
local descendants, family = {}, self:getFamily()
-- Iterate over all three datasets.
for _, data in ipairs{
require("Module:languages/code to canonical name"),
require("Module:etymology languages/code to canonical name"),
require("Module:families/code to canonical name"),
} do
for code in pairs(data) do
local lang = get_by_code(code, nil, true, true)
-- Test for a descendant. Earlier tests weed out most candidates, while the more intensive tests are only used sparingly.
if (
code ~= self._code and -- Not self.
lang:inFamily(family) and -- In the same family.
(
family:getProtoLanguageCode() == self._code or -- Self is the protolanguage.
self:hasDescendant(lang) or -- Full hasDescendant check.
(lang:getFullCode() == self._code and not self:hasAncestor(lang)) -- Etymology-only child which isn't an ancestor.
)
) then
if fmt == "object" then
insert(descendants, lang)
elseif fmt == "code" then
insert(descendants, code)
elseif fmt == "name" then
insert(descendants, lang:getCanonicalName())
end
end
end
end
return descendants
end
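--[==[Returns a table of objects for everything that counts as a descendant of this language (see `fetch_descendants`). The result is cached, but computing it scans every language, etymology-only language and family code, so the first call is expensive.]==]
function Language:getDescendants()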
local descendants = self._descendantObjects
if descendants == nil then
descendants = fetch_descendants(self, "object")
self._descendantObjects = descendants
end
return descendants
end
function Language:getDescendantCodes()
local descendants = self._descendantCodes
if descendants == nil then
descendants = fetch_descendants(self, "code")
self._descendantCodes = descendants
end
return descendants
end
function Language:getDescendantNames()
local descendants = self._descendantNames
if descendants == nil then
descendants = fetch_descendants(self, "name")
self._descendantNames = descendants
end
return descendants
end
do
local function check_lang(self, lang)
if type(lang) == "string" then
lang = get_by_code(lang, nil, true)
end
if lang:hasAncestor(self) then
return true
end
end
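--[==[Given one or more language objects or codes, check whether any of them has this language as an ancestor (see `hasAncestor`).]==]
function Language:hasDescendant(...)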
return check_inputs(self, check_lang, false, ...)
end
end
local function fetch_children(self, fmt)
local m_etym_data = require(etymology_languages_data_module)
local self_code, children = self._code, {}
for code, lang in pairs(m_etym_data) do
local _lang = lang
repeat
local parent = _lang.parent
if parent == self_code then
if fmt == "object" then
insert(children, get_by_code(code, nil, true))
elseif fmt == "code" then
insert(children, code)
elseif fmt == "name" then
insert(children, lang[1])
end
break
end
_lang = m_etym_data[parent]
until not _lang
end
return children
end
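--[==[Returns a table of <code class="nf">Language</code> objects for all etymology-only languages that have this language anywhere in their parent chain, i.e. direct and indirect children.]==]
function Language:getChildren()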
local children = self._childObjects
if children == nil then
children = fetch_children(self, "object")
self._childObjects = children
end
return children
end
function Language:getChildrenCodes()
local children = self._childCodes
if children == nil then
children = fetch_children(self, "code")
self._childCodes = children
end
return children
end
function Language:getChildrenNames()
local children = self._childNames
if children == nil then
children = fetch_children(self, "name")
self._childNames = children
end
return children
end
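--[==[Given one or more language objects or codes, returns true if at least one of them has this language in its parent chain.]==]
function Language:hasChild(...)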
local lang = ...
if not lang then
return false
elseif type(lang) == "string" then
lang = get_by_code(lang, nil, true)
end
if lang:hasParent(self) then
return true
end
return self:hasChild(select(2, ...))
end
--[==[Returns the name of the main category of that language. Example: {{code|lua|"French language"}} for French, whose category is at [[:Category:French language]]. Unless optional argument <code>nocap</code> is given, the language name at the beginning of the returned value will be capitalized. This capitalization is correct for category names, but not if the language name is lowercase and the returned value of this function is used in the middle of a sentence.]==]
function Language:getCategoryName(nocap)
local name = self._categoryName
if name == nil then
name = self:getCanonicalName()
-- If a substrate, omit any leading article.
if self:getFamilyCode() == "qfa-sub" then
name = name:gsub("^the ", ""):gsub("^a ", "")
end
-- Only add "ภาษา" ("language") if a full language.
if self:hasType("full") then
-- Unless the canonical name already begins with "ภาษา", prepend it. (The English original, kept commented out below, instead appends " language" unless the name ends with "language" or "lect".)
--if not (match(name, "[Ll]anguage$") or match(name, "[Ll]ect$")) then
if not (match(name, "^ภาษา") or match(name, "^ภาษณ์")) then
name = "ภาษา" .. name
end
end
self._categoryName = name
end
if nocap then
return name
end
return mw.getContentLanguage():ucfirst(name)
end
--[==[Creates a link to the category; the link text is the canonical name.]==]
function Language:makeCategoryLink()
return make_link(self, ":Category:" .. self:getCategoryName(), self:getDisplayForm())
end
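--[==[Returns the standard characters of the language (the `standard_chars` value in its data), or {nil} if none are defined. If the data is keyed by script, `sc` (a script code or object) selects the characters for that script, with any script-independent characters (stored at index 1) appended; if `sc` is omitted or is `None`, the characters for all scripts are concatenated.]==]
function Language:getStandardCharacters(sc)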
local standard_chars = self._data.standard_chars
if type(standard_chars) ~= "table" then
return standard_chars
elseif sc and type(sc) ~= "string" then
check_object("script", nil, sc)
sc = sc:getCode()
end
if (not sc) or sc == "None" then
local scripts = {}
for _, script in pairs(standard_chars) do
insert(scripts, script)
end
return concat(scripts)
end
if standard_chars[sc] then
return standard_chars[sc] .. (standard_chars[1] or "")
end
end
--[==[
Strip diacritics from display text `text` (in a language-specific fashion), which is in the script `sc`. If `sc` is
omitted or {nil}, the script is autodetected. This also strips certain punctuation characters from the end and (in the
case of Spanish upside-down question mark and exclamation points) from the beginning; strips any whitespace at the
end of the text or between the text and final stripped punctuation characters; and applies some language-specific
Unicode normalizations to replace discouraged characters with their prescribed alternatives. Return the stripped text.
]==]
function Language:stripDiacritics(text, sc)
if (not text) or text == "" then
return text
end
sc = checkScript(text, self, sc)
text = normalize(text, sc)
-- FIXME, rename makeEntryName to stripDiacritics and get rid of second and third return values
-- everywhere
text, _, _ = iterateSectionSubstitutions(self, text, sc, nil, nil,
self._data.strip_diacritics or self._data.entry_name, "strip_diacritics", "stripDiacritics")
text = umatch(text, "^[¿¡]?(.-[^%s%p].-)%s*[؟?!;՛՜ ՞ ՟?!︖︕।॥။၊་།]?$") or text
return text
end
--[==[
Convert a ''logical'' pagename (the pagename as it appears to the user, after diacritics and punctuation have been
stripped) to a ''physical'' pagename (the pagename as it appears in the MediaWiki database). Reasons for a difference
between the two are (a) unsupported titles such as `[ ]` (with square brackets in them), `#` (pound/hash sign) and
`¯\_(ツ)_/¯` (with underscores), as well as overly long titles of various sorts; (b) "mammoth" pages that are split into
parts (e.g. `a`, which is split into physical pagenames `a/languages A to L` and `a/languages M to Z`). For almost all
purposes, you should work with logical and not physical pagenames. But there are certain use cases that require physical
pagenames, such as checking the existence of a page or retrieving a page's contents.
`pagename` is the logical pagename to be converted. `is_reconstructed_or_appendix` indicates whether the page is in the
`Reconstruction` or `Appendix` namespaces. If it is omitted or has the value {nil}, the pagename is checked for an
initial asterisk, and if found, the page is assumed to be a `Reconstruction` page. Setting a value of `false` or `true`
to `is_reconstructed_or_appendix` disables this check and allows for mainspace pagenames that begin with an asterisk.
]==]
function Language:logicalToPhysical(pagename, is_reconstructed_or_appendix)
-- FIXME: This probably shouldn't happen but it happens when makeEntryName() receives nil.
if pagename == nil then
track("nil-passed-to-logicalToPhysical")
return nil
end
local initial_asterisk
if is_reconstructed_or_appendix == nil then
local pagename_minus_initial_asterisk
initial_asterisk, pagename_minus_initial_asterisk = pagename:match("^(%*)(.*)$")
if pagename_minus_initial_asterisk then
is_reconstructed_or_appendix = true
pagename = pagename_minus_initial_asterisk
elseif self:hasType("appendix-constructed") then
is_reconstructed_or_appendix = true
end
end
if not is_reconstructed_or_appendix then
-- Check if the pagename is a listed unsupported title.
local unsupportedTitles = load_data(links_data_module).unsupported_titles
if unsupportedTitles[pagename] then
return "Unsupported titles/" .. unsupportedTitles[pagename]
end
end
-- Set `unsupported` as true if certain conditions are met.
local unsupported
-- Check if there's an unsupported character. \239\191\189 is the replacement character U+FFFD, which can't be typed
-- directly here due to an abuse filter. Unix-style dot-slash notation is also unsupported, as it is used for
-- relative paths in links, as are 3 or more consecutive tildes. Note: match is faster with magic
-- characters/charsets; find is faster with plaintext.
if (
match(pagename, "[#<>%[%]_{|}]") or
find(pagename, "\239\191\189") or
match(pagename, "%f[^%z/]%.%.?%f[%z/]") or
find(pagename, "~~~")
) then
unsupported = true
-- If it looks like an interwiki link.
elseif find(pagename, ":") then
local prefix = gsub(pagename, "^:*(.-):.*", ulower)
if (
load_data("Module:data/namespaces")[prefix] or
load_data("Module:data/interwikis")[prefix]
) then
unsupported = true
end
end
-- Escape unsupported characters so they can be used in titles. ` is used as a delimiter for this, so a raw use of
-- it in an unsupported title is also escaped here to prevent interference; this is only done with unsupported
-- titles, though, so inclusion won't in itself mean a title is treated as unsupported (which is why it's excluded
-- from the earlier test).
if unsupported then
-- FIXME: This conversion needs to be different for reconstructed pages with unsupported characters. There
-- aren't any currently, but if there ever are, we need to fix this e.g. to put them in something like
-- Reconstruction:Proto-Indo-European/Unsupported titles/`lowbar``num`.
local unsupported_characters = load_data(links_data_module).unsupported_characters
pagename = pagename:gsub("[#<>%[%]_`{|}\239]\191?\189?", unsupported_characters)
:gsub("%f[^%z/]%.%.?%f[%z/]", function(m)
return (gsub(m, "%.", "`period`"))
end)
:gsub("~~~+", function(m)
return (gsub(m, "~", "`tilde`"))
end)
pagename = "ชื่อไม่รองรับ/" .. pagename
elseif not is_reconstructed_or_appendix then
-- Check if this is a mammoth page. If so, which subpage should we link to?
local m_links_data = load_data(links_data_module)
local mammoth_page_type = m_links_data.mammoth_pages[pagename]
if mammoth_page_type then
local canonical_name = self:getFullName()
if canonical_name ~= "ร่วม" and canonical_name ~= "อังกฤษ" then
local this_subpage
local L2_sort_key = get_L2_sort_key(canonical_name)
for _, subpage_spec in ipairs(m_links_data.mammoth_page_subpage_types[mammoth_page_type]) do
-- unpack() fails utterly on data loaded using mw.loadData() even if offsets are given
local subpage, pattern = subpage_spec[1], subpage_spec[2]
if pattern == true or L2_sort_key:match(pattern) then
this_subpage = subpage
break
end
end
if not this_subpage then
error(("Internal error: Bad data in mammoth_page_subpage_types in [[Module:links/data]] for mammoth page %s, type %s; last entry didn't have 'true' in it"):format(
pagename, mammoth_page_type))
end
pagename = pagename .. "/" .. this_subpage
end
end
end
return (initial_asterisk or "") .. pagename
end
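-- Usage sketch for logicalToPhysical. `lang` stands for any Language object; the pagenames are
-- made-up examples and no particular return values are claimed here.
--   lang:logicalToPhysical("*foo")         -- leading asterisk detected: treated as a Reconstruction page
--   lang:logicalToPhysical("*foo", false)  -- asterisk check disabled: "*foo" is handled as a mainspace title
--   lang:logicalToPhysical("a_b")          -- contains "_": takes the unsupported-title escaping path,
--                                          -- unless the title is listed in unsupported_titles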
--[==[
Strip the diacritics from a display pagename and convert the resulting logical pagename into a physical pagename.
This allows you, for example, to retrieve the contents of the page or check its existence. WARNING: This is deprecated
and will be going away. It is a simple composition of `self:stripDiacritics` and `self:logicalToPhysical`; most callers
only want the former, and if you need both, call them both yourself.
`text` and `sc` are as in `self:stripDiacritics`, and `is_reconstructed_or_appendix` is as in `self:logicalToPhysical`.
]==]
function Language:makeEntryName(text, sc, is_reconstructed_or_appendix)
return self:logicalToPhysical(self:stripDiacritics(text, sc), is_reconstructed_or_appendix)
end
--[==[Generates alternative forms using a specified method, and returns them as a table. If no method is specified, returns a table containing only the input term.]==]
function Language:generateForms(text, sc)
local generate_forms = self._data.generate_forms
if generate_forms == nil then
return {text}
end
sc = checkScript(text, self, sc)
return require("Module:" .. self._data.generate_forms).generateForms(text, self, sc)
end
--[==[Creates a sort key for the given stripped text, following the rules appropriate for the language. This removes
diacritical marks from the stripped text if they are not considered significant for sorting, and may perform some other
changes. Any initial hyphen is also removed, and anything in parentheses is removed as well.
The <code>sort_key</code> setting for each language in the data modules defines the replacements made by this function, or it gives the name of the module that takes the stripped text and returns a sortkey.]==]
function Language:makeSortKey(text, sc)
if (not text) or text == "" then
return text
end
if match(text, "<[^<>]+>") then
track("track HTML tag")
end
-- Remove directional characters, bold, italics, soft hyphens, strip markers and HTML tags.
-- FIXME: Partly duplicated with remove_formatting() in [[Module:links]].
text = ugsub(text, "[\194\173\226\128\170-\226\128\174\226\129\166-\226\129\169]", "")
text = text:gsub("('*)'''(.-'*)'''", "%1%2"):gsub("('*)''(.-'*)''", "%1%2")
text = gsub(unstrip(text), "<[^<>]+>", "")
text = decode_uri(text, "PATH")
text = checkNoEntities(self, text)
-- Remove initial hyphens and * unless the term only consists of spacing + punctuation characters.
text = ugsub(text, "^([-]*)[-־ـ᠊*]+([-]*)(.*[^%s%p].*)", "%1%2%3")
sc = checkScript(text, self, sc)
text = normalize(text, sc)
text = removeCarets(text, sc)
-- For languages with dotted dotless i, ensure that "İ" is sorted as "i", and "I" is sorted as "ı".
if self:hasDottedDotlessI() then
text = gsub(text, "I\204\135", "i") -- decomposed "İ"
:gsub("I", "ı")
text = sc:toFixedNFD(text)
end
-- Convert to lowercase, make the sortkey, then convert to uppercase. Where the language has dotted dotless i, it is
-- usually not necessary to convert "i" to "İ" and "ı" to "I" first, because "I" will always be interpreted as
-- conventional "I" (not dotless "İ") by any sorting algorithms, which will have been taken into account by the
-- sortkey substitutions themselves. However, if no sortkey substitutions have been specified, then conversion is
-- necessary so as to prevent "i" and "ı" both being sorted as "I".
--
-- An exception is made for scripts that (sometimes) sort by scraping page content, as that means they are sensitive
-- to changes in capitalization (as it changes the target page).
if not sc:sortByScraping() then
text = ulower(text)
end
local actual_substitution_data
-- Don't trim whitespace here because it's significant at the beginning of a sort key or sort base.
text, _, actual_substitution_data = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.sort_key,
"sort_key", "makeSortKey", "notrim")
if not sc:sortByScraping() then
if self:hasDottedDotlessI() and not actual_substitution_data then
text = text:gsub("ı", "I"):gsub("i", "İ")
text = sc:toFixedNFC(text)
end
text = uupper(text)
end
-- Remove parentheses, as long as they are either preceded or followed by something.
text = gsub(text, "(.)[()]+", "%1"):gsub("[()]+(.)", "%1")
text = escape_risky_characters(text)
return text
end
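-- Illustration of the parenthesis-removal step near the end of makeSortKey, using plain string
-- methods on a made-up input:
--   local s = ("a(b)c"):gsub("(.)[()]+", "%1"):gsub("[()]+(.)", "%1")
--   -- s == "abc"
-- Each pattern requires a neighbouring character, so a string made up only of parentheses is not
-- emptied by this step.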
--[==[Create the form used as a basis for display text and transliteration. FIXME: Rename to correctInputText().]==]
local function processDisplayText(text, self, sc, keepCarets, keepPrefixes)
local subbedChars = {}
text, subbedChars = doTempSubstitutions(text, subbedChars, keepCarets)
text = decode_uri(text, "PATH")
text = checkNoEntities(self, text)
sc = checkScript(text, self, sc)
text = normalize(text, sc)
text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, self._data.display_text,
"display_text", "makeDisplayText")
text = removeCarets(text, sc)
-- Remove any interwiki link prefixes (unless they have been escaped or this has been disabled).
if find(text, ":") and not keepPrefixes then
local rep
repeat
text, rep = gsub(text, "\\\\(\\*:)", "\3%1")
until rep == 0
text = gsub(text, "\\:", "\4")
while true do
local prefix = gsub(text, "^(.-):.+", function(m1)
return (gsub(m1, "\244[\128-\191]*", ""))
end)
-- Check if the prefix is an interwiki, though ignore capitalised Wiktionary:, which is a namespace.
if not prefix or prefix == text or prefix == "Wiktionary"
or not (load_data("Module:data/interwikis")[ulower(prefix)] or prefix == "") then
break
end
text = gsub(text, "^(.-):(.*)", function(m1, m2)
local ret = {}
for subbedChar in gmatch(m1, "\244[\128-\191]*") do
insert(ret, subbedChar)
end
return concat(ret) .. m2
end)
end
text = gsub(text, "\3", "\\"):gsub("\4", ":")
end
return text, subbedChars
end
--[==[Make the display text (i.e. what is displayed on the page).]==]
function Language:makeDisplayText(text, sc, keepPrefixes)
if not text or text == "" then
return text
end
local subbedChars
text, subbedChars = processDisplayText(text, self, sc, nil, keepPrefixes)
text = escape_risky_characters(text)
return undoTempSubstitutions(text, subbedChars)
end
--[==[Transliterates the text from the given script into the Latin script (see
[[Wiktionary:Transliteration and romanization]]). The language must have the <code>translit</code> property for this to
work; if it is not present, {{code|lua|nil}} is returned.
The <code>sc</code> parameter is handled by the transliteration module, and how it is handled is specific to that
module. Some transliteration modules may tolerate {{code|lua|nil}} as the script, others require it to be one of the
possible scripts that the module can transliterate, and will throw an error if it's not one of them. For this reason,
the <code>sc</code> parameter should always be provided when writing non-language-specific code.
The <code>module_override</code> parameter is used to override the default module that is used to provide the
transliteration. This is useful in cases where you need to demonstrate a particular module in use, but there is no
default module yet, or you want to demonstrate an alternative version of a transliteration module before making it
official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked
by [[Wiktionary:Tracking/languages/module_override]].
'''Known bugs''':
* This function assumes {tr(s1) .. tr(s2) == tr(s1 .. s2)}. When this assumption fails, wikitext markup like <nowiki>'''</nowiki> can cause wrong transliterations.
* HTML entities like <code>&apos;</code>, often used to escape wikitext markup, do not work.
]==]
function Language:transliterate(text, sc, module_override)
-- If there is no text (nil, empty, or a lone "-"), return it unchanged.
if not text or text == "" or text == "-" then
return text
end
-- If the script is not transliterable (and no override is given), return nil.
sc = checkScript(text, self, sc)
if not (sc:isTransliterated() or module_override) then
-- temporary tracking to see if/when this gets triggered
track("non-transliterable")
track("non-transliterable/" .. self._code)
track("non-transliterable/" .. sc:getCode())
track("non-transliterable/" .. sc:getCode() .. "/" .. self._code)
return nil
end
-- Remove any strip markers.
text = unstrip(text)
-- Do not process the formatting into PUA characters for certain languages.
local processed = load_data(languages_data_module).substitution[self._code] ~= "none"
-- Get the display text with the keepCarets flag set.
local subbedChars
if processed then
text, subbedChars = processDisplayText(text, self, sc, true)
end
-- Transliterate (using the module override if applicable).
text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, true, module_override or
self._data.translit, "translit", "tr")
if not text then
return nil
end
-- Incomplete transliterations return nil.
local charset = sc.characters
if charset and umatch(text, "[" .. charset .. "]") then
-- Remove any characters in Latin, which includes Latin characters also included in other scripts (as these are
-- false positives), as well as any PUA substitutions. Anything remaining should only be script code "None"
-- (e.g. numerals).
local check_text = ugsub(text, "[" .. get_script("Latn").characters .. "-]+", "")
-- Set none_is_last_resort_only flag, so that any non-None chars will cause a script other than "None" to be
-- returned.
if find_best_script_without_lang(check_text, true):getCode() ~= "None" then
return nil
end
end
if processed then
text = escape_risky_characters(text)
text = undoTempSubstitutions(text, subbedChars)
end
-- If the script does not use capitalization, then capitalize any letters of the transliteration which are
-- immediately preceded by a caret (and remove the caret).
if text and not sc:hasCapitalization() and text:find("^", 1, true) then
text = processCarets(text, "%^([\128-\191\244]*%*?)([^\128-\191\244][\128-\191]*)", function(m1, m2)
return m1 .. uupper(m2)
end)
end
-- Track module overrides.
if module_override ~= nil then
track("module_override")
end
return text
end
do
local function handle_language_spec(self, spec, sc)
local ret = self["_" .. spec]
if ret == nil then
ret = self._data[spec]
if type(ret) == "string" then
ret = list_to_set(split(ret, ",", true, true))
end
self["_" .. spec] = ret
end
if type(ret) == "table" then
ret = ret[sc:getCode()]
end
return not not ret
end
function Language:overrideManualTranslit(sc)
return handle_language_spec(self, "override_translit", sc)
end
function Language:link_tr(sc)
return handle_language_spec(self, "link_tr", sc)
end
end
--[==[Returns {{code|lua|true}} if the language has a transliteration module, or {{code|lua|false}} if it doesn't.]==]
function Language:hasTranslit()
return not not self._data.translit
end
--[==[Returns {{code|lua|true}} if the language uses the letters I/ı and İ/i, or {{code|lua|false}} if it doesn't.]==]
function Language:hasDottedDotlessI()
return not not self._data.dotted_dotless_i
end
function Language:toJSON(opts)
local strip_diacritics, strip_diacritics_patterns, strip_diacritics_remove_diacritics = self._data.strip_diacritics
if strip_diacritics then
if strip_diacritics.from then
strip_diacritics_patterns = {}
for i, from in ipairs(strip_diacritics.from) do
insert(strip_diacritics_patterns, {from = from, to = strip_diacritics.to[i] or ""})
end
end
strip_diacritics_remove_diacritics = strip_diacritics.remove_diacritics
end
-- mainCode should only end up non-nil if dontCanonicalizeAliases is passed to make_object().
-- props should either contain zero-argument functions to compute the value, or the value itself.
local props = {
ancestors = function() return self:getAncestorCodes() end,
canonicalName = function() return self:getCanonicalName() end,
categoryName = function() return self:getCategoryName("nocap") end,
code = self._code,
mainCode = self._mainCode,
parent = function() return self:getParentCode() end,
full = function() return self:getFullCode() end,
stripDiacriticsPatterns = strip_diacritics_patterns,
stripDiacriticsRemoveDiacritics = strip_diacritics_remove_diacritics,
family = function() return self:getFamilyCode() end,
aliases = function() return self:getAliases() end,
varieties = function() return self:getVarieties() end,
otherNames = function() return self:getOtherNames() end,
scripts = function() return self:getScriptCodes() end,
type = function() return keys_to_list(self:getTypes()) end,
wikimediaLanguages = function() return self:getWikimediaLanguageCodes() end,
wikidataItem = function() return self:getWikidataItem() end,
wikipediaArticle = function() return self:getWikipediaArticle(true) end,
}
local ret = {}
for prop, val in pairs(props) do
-- `opts` may be nil (the return statement guards for this), so check it before indexing.
if not (opts and opts.skip_fields and opts.skip_fields[prop]) then
if type(val) == "function" then
ret[prop] = val()
else
ret[prop] = val
end
end
end
-- Use `deep_copy` when returning a table, so that there are no editing restrictions imposed by `mw.loadData`.
return opts and opts.lua_table and deep_copy(ret) or to_json(ret, opts)
end
function export.getDataModuleName(code)
local letter = match(code, "^(%l)%l%l?$")
return "Module:" .. (
letter == nil and "languages/data/exceptional" or
#code == 2 and "languages/data/2" or
"languages/data/3/" .. letter
)
end
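-- Illustrative sketch (not part of the original module): assuming `match` is the standard string matcher, the
-- pattern above sends two-letter codes to the "2" module, three-letter codes to a per-letter "3" module, and
-- everything else to the "exceptional" module. The codes below are only examples.
--   export.getDataModuleName("en")      --> "Module:languages/data/2"
--   export.getDataModuleName("ang")     --> "Module:languages/data/3/a"
--   export.getDataModuleName("gem-pro") --> "Module:languages/data/exceptional"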
get_data_module_name = export.getDataModuleName
function export.getExtraDataModuleName(code)
return get_data_module_name(code) .. "/extra"
end
get_extra_data_module_name = export.getExtraDataModuleName
do
local function make_stack(data)
local key_types = {
[2] = "unique",
aliases = "unique",
otherNames = "unique",
type = "append",
varieties = "unique",
wikipedia_article = "unique",
wikimedia_codes = "unique"
}
local function __index(self, k)
local stack, key_type = getmetatable(self), key_types[k]
-- Data that isn't inherited from the parent.
if key_type == "unique" then
local v = stack[stack[make_stack]][k]
if v == nil then
local layer = stack[0]
if layer then -- Could be false if there's no extra data.
v = layer[k]
end
end
return v
-- Data that is appended by each generation.
elseif key_type == "append" then
local parts, offset, n = {}, 0, stack[make_stack]
for i = 1, n do
local part = stack[i][k]
if part == nil then
offset = offset + 1
else
parts[i - offset] = part
end
end
return offset ~= n and concat(parts, ",") or nil
end
local n = stack[make_stack]
while true do
local layer = stack[n]
if not layer then -- Could be false if there's no extra data.
return nil
end
local v = layer[k]
if v ~= nil then
return v
end
n = n - 1
end
end
local function __newindex()
error("table is read-only")
end
local function __pairs(self)
-- Iterate down the stack, caching keys to avoid duplicate returns.
local stack, seen = getmetatable(self), {}
local n = stack[make_stack]
local iter, state, k, v = pairs(stack[n])
return function()
repeat
repeat
k = iter(state, k)
if k == nil then
n = n - 1
local layer = stack[n]
if not layer then -- Could be false if there's no extra data.
return nil
end
iter, state, k = pairs(layer)
end
until not (k == nil or seen[k])
-- Get the value via a lookup, as the one returned by the
-- iterator will be the raw value from the current layer,
-- which may not be the one __index will return for that
-- key. Also memoize the key in `seen` (even if the lookup
-- returns nil) so that it doesn't get looked up again.
-- TODO: store values in `self`, avoiding the need to create
-- the `seen` table. The iterator will need to iterate over
-- `self` with `next` first to find these on future loops.
v, seen[k] = self[k], true
until v ~= nil
return k, v
end
end
local __ipairs = require(table_module).indexIpairs
function make_stack(data)
local stack = {
data,
[make_stack] = 1, -- stores the length and acts as a sentinel to confirm a given metatable is a stack.
__index = __index,
__newindex = __newindex,
__pairs = __pairs,
__ipairs = __ipairs,
}
stack.__metatable = stack
return setmetatable({}, stack), stack
end
return make_stack(data)
end
local function get_stack(data)
local stack = getmetatable(data)
return stack and type(stack) == "table" and stack[make_stack] and stack or nil
end
--[==[
<span style="color: var(--wikt-palette-red,#BA0000)">This function is not for use in entries or other content pages.</span>
Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes. If `extra` is set, any extra data in the relevant `/extra` module will be included. (Note that it will be included anyway if it has already been loaded into the language object.) If `raw` is set, then the returned data will not contain any data inherited from parent objects.
-- Do NOT use these methods!
-- All uses should be pre-approved on the talk page!
]==]
function Language:getData(extra, raw)
if extra then
self:loadInExtraData()
end
local data = self._data
-- If raw is not set, just return the data.
if not raw then
return data
end
local stack = get_stack(data)
-- If there isn't a stack or its length is 1, return the data. Extra data (if any) will be included, as it's stored at key 0 and doesn't affect the reported length.
if stack == nil then
return data
end
local n = stack[make_stack]
if n == 1 then
return data
end
local extra = stack[0]
-- If there isn't any extra data, return the top layer of the stack.
if extra == nil then
return stack[n]
end
-- If there is, return a new stack which has the top layer at key 1 and the extra data at key 0.
data, stack = make_stack(stack[n])
stack[0] = extra
return data
end
function Language:loadInExtraData()
-- Only full languages have extra data.
if not self:hasType("language", "full") then
return
end
local data = self._data
-- If there's no stack, create one.
local stack = get_stack(self._data)
if stack == nil then
data, stack = make_stack(data)
-- If already loaded, return.
elseif stack[0] ~= nil then
return
end
self._data = data
-- Load extra data from the relevant module and add it to the stack at key 0, so that the __index and __pairs metamethods will pick it up, since they iterate down the stack until they run out of layers.
local code = self._code
local modulename = get_extra_data_module_name(code)
-- No data cached as false.
stack[0] = modulename and load_data(modulename)[code] or false
end
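--[==[Returns the name of the module containing the language's data. For etymology-only languages this is the etymology-language data module; otherwise it is the module returned by {{code|lua|export.getDataModuleName}} for the language's code (e.g. [[Module:languages/data/2]]).]==]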
function Language:getDataModuleName()
local name = self._dataModuleName
if name == nil then
name = self:hasType("etymology-only") and etymology_languages_data_module or
get_data_module_name(self._mainCode or self._code)
self._dataModuleName = name
end
return name
end
--[==[Returns the name of the module containing the language's extra data (the relevant `/extra` data module), or {{code|lua|nil}} for etymology-only languages, which have no extra data.]==]
function Language:getExtraDataModuleName()
local name = self._extraDataModuleName
if name == nil then
name = not self:hasType("etymology-only") and get_extra_data_module_name(self._mainCode or self._code) or false
self._extraDataModuleName = name
end
return name or nil
end
function export.makeObject(code, data, dontCanonicalizeAliases)
local data_type = type(data)
if data_type ~= "table" then
error(("bad argument #2 to 'makeObject' (table expected, got %s)"):format(data_type))
end
-- Convert any aliases.
local input_code = code
code = normalize_code(code)
input_code = dontCanonicalizeAliases and input_code or code
local parent
if data.parent then
parent = get_by_code(data.parent, nil, true, true)
else
parent = Language
end
parent.__index = parent
local lang = {_code = input_code}
-- This can only happen if dontCanonicalizeAliases is passed to make_object().
if code ~= input_code then
lang._mainCode = code
end
local parent_data = parent._data
if parent_data == nil then
-- Full code is the same as the code.
lang._fullCode = parent._code or code
else
-- Copy full code.
lang._fullCode = parent._fullCode
local stack = get_stack(parent_data)
if stack == nil then
parent_data, stack = make_stack(parent_data)
end
-- Insert the input data as the new top layer of the stack.
local n = stack[make_stack] + 1
data, stack[n], stack[make_stack] = parent_data, data, n
end
lang._data = data
return setmetatable(lang, parent)
end
make_object = export.makeObject
end
--[==[Finds the language whose code matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">paramForError</code> is {{code|lua|true}}, a generic error message mentioning the bad code is generated; otherwise <code class="n">paramForError</code> should be a string or number specifying the parameter that the code came from, and this parameter will be mentioned in the error message along with the bad code. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.]==]
function export.getByCode(code, paramForError, allowEtymLang, allowFamily)
-- Track uses of paramForError, ultimately so it can be removed, as error-handling should be done by [[Module:parameters]], not here.
if paramForError ~= nil then
track("paramForError")
end
if type(code) ~= "string" then
local typ
if not code then
typ = "nil"
elseif check_object("language", true, code) then
typ = "a language object"
elseif check_object("family", true, code) then
typ = "a family object"
else
typ = "a " .. type(code)
end
error("The function getByCode expects a string as its first argument, but received " .. typ .. ".")
end
local m_data = load_data(languages_data_module)
if m_data.aliases[code] or m_data.track[code] then
track(code)
end
local norm_code = normalize_code(code)
-- Get the data, checking for etymology-only languages if allowEtymLang is set.
local data = load_data(get_data_module_name(norm_code))[norm_code] or
allowEtymLang and load_data(etymology_languages_data_module)[norm_code]
-- If no data was found and allowFamily is set, check the family data. If the main family data was found, make the object with [[Module:families]] instead, as family objects have different methods. However, if it's an etymology-only family, use make_object in this module (which handles object inheritance), and the family-specific methods will be inherited from the parent object.
if data == nil and allowFamily then
data = load_data("Module:families/data")[norm_code]
if data ~= nil then
if data.parent == nil then
return make_family_object(norm_code, data)
elseif not allowEtymLang then
data = nil
end
end
end
local retval = code and data and make_object(code, data)
if not retval and paramForError then
require("Module:languages/errorGetBy").code(code, paramForError, allowEtymLang, allowFamily)
end
return retval
end
get_by_code = export.getByCode
--[==[Finds the language whose canonical name (the name used to represent that language on Wiktionary) or other name matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.
The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result.
This function is powered by [[Module:languages/canonical names]], which contains a pre-generated mapping of full-language canonical names to codes. It is generated by going through the [[:Category:Language data modules]] for full languages. When <code class="n">allowEtymLang</code> is specified for the above function, [[Module:etymology languages/canonical names]] may also be used, and when <code class="n">allowFamily</code> is specified for the above function, [[Module:families/canonical names]] may also be used.]==]
function export.getByCanonicalName(name, errorIfInvalid, allowEtymLang, allowFamily)
local byName = load_data("Module:languages/canonical names")
local code = byName and byName[name]
if not code and allowEtymLang then
byName = load_data("Module:etymology languages/canonical names")
code = byName and byName[name] or
byName[gsub(name, " [Ss]ubstrate$", "")] or
byName[gsub(name, "^a ", "")] or
byName[gsub(name, "^a ", ""):gsub(" [Ss]ubstrate$", "")] or
-- For etymology families like "ira-pro".
-- FIXME: This is not ideal, as it allows the "กลุ่มภาษา" prefix to be added to any etymology-only language, too.
byName[match(name, "^กลุ่มภาษา(.*)$")]
end
if not code and allowFamily then
byName = load_data("Module:families/canonical names")
code = byName[name] or byName[match(name, "^กลุ่มภาษา(.*)$")]
end
local retval = code and get_by_code(code, errorIfInvalid, allowEtymLang, allowFamily)
if not retval and errorIfInvalid then
require("Module:languages/errorGetBy").canonicalName(name, allowEtymLang, allowFamily)
end
return retval
end
--[==[Used by [[Module:languages/data/2]] (et al.) and [[Module:etymology languages/data]], [[Module:families/data]], [[Module:scripts/data]] and [[Module:writing systems/data]] to finalize the data into the format that is actually returned.]==]
function export.finalizeData(data, main_type, variety)
local fields = {"type"}
if main_type == "language" then
insert(fields, 4) -- script codes
insert(fields, "ancestors")
insert(fields, "link_tr")
insert(fields, "override_translit")
insert(fields, "wikimedia_codes")
elseif main_type == "script" then
insert(fields, 3) -- writing system codes
end -- Families and writing systems have no extra fields to process.
local fields_len = #fields
for _, entity in next, data do
if variety then
-- Move parent from 3 to "parent" and family from "family" to 3. These are different for the sake of convenience, since very few varieties have the family specified, whereas all of them have a parent.
entity.parent, entity[3], entity.family = entity[3], entity.family
-- Give the type "regular" iff not a variety and no other types are assigned.
elseif not (entity.type or entity.parent) then
entity.type = "regular"
end
for i = 1, fields_len do
local key = fields[i]
local field = entity[key]
if field and type(field) == "string" then
entity[key] = gsub(field, "%s*,%s*", ",")
end
end
end
return data
end
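-- Illustrative sketch (not part of the original module): the gsub in the loop above removes whitespace around
-- commas in the listed string fields. A hypothetical language entry with `ancestors = "ang , enm"` would
-- therefore be finalized as `ancestors = "ang,enm"`; fields that are not strings are left untouched.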
--[==[For backwards compatibility only; modules should require the error themselves.]==]
function export.err(lang_code, param, code_desc, template_tag, not_real_lang)
return require("Module:languages/error")(lang_code, param, code_desc, template_tag, not_real_lang)
end
return export
s12b02rmox9ex4qh204ad7z74n6xdk3
ท่อง
0
36466
5720722
1895413
2026-04-21T04:21:18Z
Apisite
10648
5720722
wikitext
text/x-wiki
{{also/auto}}
== ภาษาไทย ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|lo|ທ່ອງ}}
=== การออกเสียง ===
{{th-pron|ท็่อง}}
=== คำกริยา ===
{{th-verb}}
# [[เดิน]][[ก้าว]][[ไป]][[ใน]][[น้ำ]]
#: {{ux|th|ท่องน้ำ}}
# [[ว่า]][[ซ้ำ]] ๆ [[ให้]][[จำ]][[ได้]]
#: {{ux|th|ท่องหนังสือ}}
{{topics|th|การศึกษา}}
4c8wahj1eyr4tfqxwm5u7lgspssc77j
rhino
0
43557
5720711
2018970
2026-04-21T02:09:34Z
OctraBot
3198
เก็บกวาด
5720711
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
=== การออกเสียง ===
* {{IPA|en|/ˈɹaɪ.nəʊ/|a=UK}}
* {{enPR|rīʹnō|a=US}}, {{IPA|en|/ˈɹaɪ.noʊ/}}
* {{audio|en|En-au-rhino.ogg|a=AU}}
* {{rhymes|en|aɪnəʊ|s=2}}
* {{homophones|en|RINO}}
=== รากศัพท์ 1 ===
{{unk|en}}
==== รูปแบบอื่น ====
* {{alt|en|rino}}
==== คำนาม ====
{{en-noun|-}}
# {{lb|en|slang|now|rare}} [[เงิน]] {{defdate|from 17th c.}}
#* {{quote-text|en|year=1792|author=w:Thomas Holcroft|title=Anne St. Ives|section=vol. III.52
|passage=When so be as a man has no money, why then, a savin and exceptin your onnur's reverence, a's but a poor dog. But when so be as a man as got the '''rhino''', why then a may begin to hold up his head.}}
#* {{quote-text|en|year=1835|author=w:Frederick Marryat|title=The Pacha of Many Tales
|passage=There I fell in with Betsy, and as she proved a regular out and outer, I spliced her, and a famous wedding we had of it, as long as the '''rhino''' lasted.}}
#* {{RQ:Joyce Ulysses|chapter=Episode 12: The Cyclops|passage=—Here you are, says Alf, chucking out the '''rhino'''. Talking about hanging, I'll show you something you never saw}}
=== รากศัพท์ 2 ===
{{clipping|en|rhinoceros}}
==== คำนาม ====
{{en-noun}}
# {{lb|en|colloquial}} [[แรด]] {{defdate|from 19th c.}}
#* {{quote-book|en|year=1932|year_published=1965|author=w:Delos W. Lovelace|title=[[w:King Kong (1933 film)|King Kong]]|page=24|passage=‘We were getting a grand shot of a charging '''rhino''' when the cameraman got scared and bolted. The fathead!’}}
#* {{quote-journal|en|year=1961|month=October|title=Talking of Trains: B.R. exile at work?|journal=Trains Illustrated|page=586|text=This cutting from an East African newspaper caught our eye last month: "The up mail train from Mombasa was held up for an hour at Kibwezi by an angry '''rhino''' on Monday night."}}
===== ลูกคำ =====
{{col2|en|black rhino|Indian rhino|Javan rhino|Sumatran rhino|white rhino|woolly rhino|Merck's rhino|narrow-nosed rhino|rhinolike|rhinoless|rhino beetle|rhino ferry}}
== ภาษาฝรั่งเศส ==
=== รากศัพท์ ===
{{clipping|fr|rhinocéros}}
=== การออกเสียง ===
* {{fr-IPA}}
* {{audio|fr|LL-Q150 (fra)-Bananax47-rhino.wav|a=<<France>> (<<Agen>>)}}
=== คำนาม ===
{{fr-noun|m}}
# {{lb|fr|informal}} [[แรด]]
{{C|fr|แรด}}
4lnkp9hkr6o3zdzxgclu33607m5mgam
5720712
5720711
2026-04-21T02:09:43Z
OctraBot
3198
เรียงลำดับหัวเรื่องภาษา
5720712
wikitext
text/x-wiki
== ภาษาฝรั่งเศส ==
=== รากศัพท์ ===
{{clipping|fr|rhinocéros}}
=== การออกเสียง ===
* {{fr-IPA}}
* {{audio|fr|LL-Q150 (fra)-Bananax47-rhino.wav|a=<<France>> (<<Agen>>)}}
=== คำนาม ===
{{fr-noun|m}}
# {{lb|fr|informal}} [[แรด]]
{{C|fr|แรด}}
== ภาษาอังกฤษ ==
=== การออกเสียง ===
* {{IPA|en|/ˈɹaɪ.nəʊ/|a=UK}}
* {{enPR|rīʹnō|a=US}}, {{IPA|en|/ˈɹaɪ.noʊ/}}
* {{audio|en|En-au-rhino.ogg|a=AU}}
* {{rhymes|en|aɪnəʊ|s=2}}
* {{homophones|en|RINO}}
=== รากศัพท์ 1 ===
{{unk|en}}
==== รูปแบบอื่น ====
* {{alt|en|rino}}
==== คำนาม ====
{{en-noun|-}}
# {{lb|en|slang|now|rare}} [[เงิน]] {{defdate|from 17th c.}}
#* {{quote-text|en|year=1792|author=w:Thomas Holcroft|title=Anne St. Ives|section=vol. III.52
|passage=When so be as a man has no money, why then, a savin and exceptin your onnur's reverence, a's but a poor dog. But when so be as a man as got the '''rhino''', why then a may begin to hold up his head.}}
#* {{quote-text|en|year=1835|author=w:Frederick Marryat|title=The Pacha of Many Tales
|passage=There I fell in with Betsy, and as she proved a regular out and outer, I spliced her, and a famous wedding we had of it, as long as the '''rhino''' lasted.}}
#* {{RQ:Joyce Ulysses|chapter=Episode 12: The Cyclops|passage=—Here you are, says Alf, chucking out the '''rhino'''. Talking about hanging, I'll show you something you never saw}}
=== รากศัพท์ 2 ===
{{clipping|en|rhinoceros}}
==== คำนาม ====
{{en-noun}}
# {{lb|en|colloquial}} [[แรด]] {{defdate|from 19th c.}}
#* {{quote-book|en|year=1932|year_published=1965|author=w:Delos W. Lovelace|title=[[w:King Kong (1933 film)|King Kong]]|page=24|passage=‘We were getting a grand shot of a charging '''rhino''' when the cameraman got scared and bolted. The fathead!’}}
#* {{quote-journal|en|year=1961|month=October|title=Talking of Trains: B.R. exile at work?|journal=Trains Illustrated|page=586|text=This cutting from an East African newspaper caught our eye last month: "The up mail train from Mombasa was held up for an hour at Kibwezi by an angry '''rhino''' on Monday night."}}
===== ลูกคำ =====
{{col2|en|black rhino|Indian rhino|Javan rhino|Sumatran rhino|white rhino|woolly rhino|Merck's rhino|narrow-nosed rhino|rhinolike|rhinoless|rhino beetle|rhino ferry}}
ehgbnyi3w2j91nbrcyodp4zt3lchj6y
ท่องเที่ยว
0
43721
5720720
1510353
2026-04-21T03:51:45Z
Ai Ku Karng
17824
/* ภาษาไทย */
5720720
wikitext
text/x-wiki
== ภาษาไทย ==
{{wp}}
=== รากศัพท์ ===
{{com|th|ท่อง|เที่ยว}}; ร่วมเชื้อสายกับ{{cog|lo|ທ່ອງທ່ຽວ}}
=== การออกเสียง ===
{{th-pron|ท็่อง-เที่ยว}}
=== คำกริยา ===
{{th-verb}}
# เที่ยว[[ไป]]
# {{lb|th|กฎ}} [[เดินทาง]][[จาก]][[ท้องที่]][[อัน]][[เป็น]][[ถิ่น]][[ที่อยู่]][[โดย]][[ปรกติ]][[ของ]][[ตน]]ไป[[ยัง]]ท้องที่[[อื่น]]เป็น[[การ]][[ชั่วคราว]][[ด้วย]][[ความ]][[สมัครใจ]] [[และ]]ด้วย[[วัตถุประสงค์]]อัน[[มิ]][[ใช่]][[เพื่อ]]ไป[[ประกอบ]][[อาชีพ]][[หรือ]][[หา]][[รายได้]]
lva0rodmnzzi4ibbttycivya3bxyiux
มอดูล:category tree/หัวข้อ/สถานที่
828
44394
5720685
5688585
2026-04-21T01:19:42Z
OctraBot
3198
5720685
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_table = require("Module:table")
local en_utilities_module = "Module:en-utilities"
local string_utilities_module = "Module:string utilities"
local m_locations = require("Module:place/locations")
local m_placetypes = require("Module:place/placetypes")
local placetype_data = m_placetypes.placetype_data
local internal_error = m_locations.internal_error
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local is_callable = require("Module:fun").is_callable
--[==[ intro:
This module is part of the category tree code and contains code to generate the descriptions of place-related categories
(such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]],
[[Category:pt:Municipalities of Tocantins, Brazil]], etc.). Note that this module doesn't actually create the
categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This
process should automatically happen periodically for non-empty categories, because they will appear in
[[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.)
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table,
keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the
description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list,
which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that
label on-the-fly.
See [[Module:place]] for an introduction to the terminology associated with places, a list of all the relevant
modules, and more specific information on types of toponyms and placetypes and how their categorization
works.
]==]
local function lcfirst(label)
return mw.getContentLanguage():lcfirst(label)
end
local function gsub_literally(str, from, to)
local m_strutils = require(string_utilities_module)
return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to)))
end
-- Do not translate the class names (the keys).
local class_to_bare_category_parent = {
["polity"] = "องค์การทางการเมือง",
["subpolity"] = "political divisions",
["settlement"] = "การตั้งถิ่นฐาน",
["non-admin settlement"] = "การตั้งถิ่นฐาน",
["capital"] = "capital cities",
["natural feature"] = "natural features",
["man-made structure"] = "man-made structures",
["geographic region"] = "geographic and cultural areas",
}
-- Do not translate the class names (the keys).
local class_is_political_division = {
["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity
["subpolity"] = true,
["settlement"] = true,
["non-admin settlement"] = false,
["capital"] = true,
["natural feature"] = false,
["man-made structure"] = false,
["geographic region"] = false,
["generic place"] = false,
}
local capital_cat_to_placetype = {}
for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do
capital_cat_to_placetype[capital_cat] = placetype
end
-- Handler for bare categories for all types of capitals. This needs to precede the handler for bare placetype
-- categories as some of the types of capitals exist as placetypes as well.
insert(handlers, function(label)
label = lcfirst(label)
local capital_placetype = capital_cat_to_placetype[label]
if capital_placetype then
local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype)
local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level")
if linkdesc == nil then
internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label)
end
if linkdesc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
return {
type = "name",
topic = label,
description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".",
parents = {"capital cities"},
}
end
end)
-- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various
-- so-called "generic" placetypes, but sometimes the categories were wrong.
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full")
if ptdesc then
local from_category_props = {
from_category = true,
no_split_qualifiers = true,
}
local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent")
if bare_category_parent then
return bare_category_parent
end
local class = m_placetypes.get_placetype_prop(pt, "class")
if class then
if class_to_bare_category_parent[class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
class, canon_label)
end
return class_to_bare_category_parent[class]
end
end, from_category_props)
if not bare_category_parent then
internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " ..
"directly or through a fallback", canon_label)
end
local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents")
end, from_category_props)
local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb")
end, from_category_props)
if type(bare_category_parent) == "string" and bare_category_breadcrumb then
bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb}
end
local parents = {bare_category_parent}
if addl_bare_category_parents then
m_table.extend(parents, addl_bare_category_parents)
end
return {
type = "name",
topic = canon_label,
description = "{{{langname}}} " .. ptdesc .. ".",
breadcrumb = bare_category_breadcrumb,
parents = parents,
}
elseif ptdesc == false then
mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label)))
end
end
end)
local function fetch_primary_placetype(key, spec)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if not placetype then
internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec)
end
return placetype
end
--[==[
Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "` if
appropriate. Specifically:
Fetch the full and elliptical placenames. If they are the same, just link to the placename directly. Otherwise, check if
the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as
the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full
placename. That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries,
but if only one exists, we link to that one rather than have a red link.
]==]
local function construct_linked_location(group, key, spec)
local full_placename, elliptical_placename = m_locations.key_to_placename(group, key)
local linked_placename
if elliptical_placename ~= full_placename then
local full_placename_title = mw.title.new(full_placename)
if full_placename_title and full_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, full_placename)
else
local elliptical_placename_title = mw.title.new(elliptical_placename)
if elliptical_placename_title and elliptical_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename)
end
end
end
return linked_placename or m_locations.construct_linked_placename(spec, full_placename)
end
--[==[
Construct the description of a location, including its container trail either to the end or until we encounter a
`no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read
`"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a
[[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should
adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the
[[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in
[[Europe]]"`.
]==]
local function construct_location_desc(group, key, spec)
local parts = {}
local function ins(txt)
insert(parts, txt)
end
ins(construct_linked_location(group, key, spec))
local iteration = 0
local need_closing_paren = false
local containers = {{group = group, key = key, spec = spec}}
local container_iterator = m_locations.iterate_containers(group, key, spec)
while true do
iteration = iteration + 1
local include_container_in_desc = false
for _, container in ipairs(containers) do
if not container.spec.no_include_container_in_desc then
include_container_in_desc = true
break
end
end
if not include_container_in_desc then
break
end
local next_containers = container_iterator()
if not next_containers then
break
end
local is_former = nil
for _, container in ipairs(containers) do
local this_is_former = container.spec.is_former_place
if is_former == nil then
is_former = this_is_former
elseif is_former ~= this_is_former then
internal_error("When processing container trail of key %s, found a mixture of former and non-former " ..
"containers: %s", key, containers)
end
end
if #containers > 1 then
local placetypes = {}
local prepositions = {}
for _, container in ipairs(containers) do
local container_type = fetch_primary_placetype(container.key, container.spec)
m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type))
m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type))
end
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which are ")
need_closing_paren = true
else
ins(", which are ")
end
if is_former then
ins("former ")
end
ins(m_table.serialCommaJoin(placetypes))
ins(" ")
ins(concat(prepositions, "/"))
else
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which is ")
need_closing_paren = true
else
ins(", which is ")
end
local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec)
if is_former then
ins("a former ")
else
ins(m_placetypes.get_placetype_article(container_type))
ins(" ")
end
ins(container_type)
ins(" ")
ins(m_placetypes.get_placetype_entry_preposition(container_type))
end
ins(" ")
containers = next_containers
local container_locations = {}
for _, container in ipairs(containers) do
insert(container_locations, construct_linked_location(container.group, container.key,
container.spec))
end
ins(m_table.serialCommaJoin(container_locations))
end
if need_closing_paren then
ins(")")
end
return concat(parts)
end
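--[=[
Example (illustrative only). The shape of the generated container trail can be sketched in plain Lua as below. This
is a simplified sketch, assuming exactly one container per level and ignoring links, former places and the
`no_include_container_in_desc` cutoff; `describe` and the hard-coded `chain` table are hypothetical.

```lua
-- Build "X, a T in Y (which is a T of Z, which is ...)" from a chain of places.
-- Each entry describes a place; the last entry only needs a name.
local function describe(chain)
	local parts = { chain[1].name }
	local need_closing_paren = false
	for i = 1, #chain - 1 do
		local place = chain[i]
		if i == 1 then
			parts[#parts + 1] = ", "
		elseif i == 2 then
			-- The second level opens the parenthetical; later levels continue it.
			parts[#parts + 1] = " (which is "
			need_closing_paren = true
		else
			parts[#parts + 1] = ", which is "
		end
		parts[#parts + 1] = place.article .. " " .. place.placetype .. " " .. place.prep ..
			" " .. chain[i + 1].name
	end
	if need_closing_paren then
		parts[#parts + 1] = ")"
	end
	return table.concat(parts)
end

local chain = {
	{ name = "Birmingham", article = "a", placetype = "city", prep = "in" },
	{ name = "the West Midlands", article = "a", placetype = "county", prep = "of" },
	{ name = "England", article = "a", placetype = "constituent country", prep = "of" },
	{ name = "the United Kingdom", article = "a", placetype = "country", prep = "in" },
	{ name = "Europe" },
}
print(describe(chain))
```
]=]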
-- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified,
-- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which
-- mentions the placename corresponding to the key, its placetype and container, and repeats the description up
-- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc`
-- setting is found (which is set on all continents and continent-level regions).
local function fetch_or_construct_location_desc(group, key, spec)
local val = spec.keydesc
if is_callable(val) then
val = val(group, key, spec)
spec.keydesc = val
end
val = val or "+++"
if val:find("%+%+%+") then
val = gsub_literally(val, "+++", construct_location_desc(group, key, spec))
end
return val
end
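--[=[
Example (illustrative only). The `keydesc` resolution described above can be sketched in plain Lua as follows. This
is a sketch under assumptions: `resolve_desc` is hypothetical, the auto-constructed description is passed in
precomputed (the real function only builds it when `+++` is present), and only plain functions are treated as
callable.

```lua
-- Resolve a description: use spec.keydesc if present (calling and caching it if it is a
-- function), default to "+++", then substitute the auto-constructed description for "+++".
local function resolve_desc(spec, auto_desc)
	local val = spec.keydesc
	if type(val) == "function" then
		val = val(spec)
		spec.keydesc = val -- cache the computed string for later calls
	end
	val = val or "+++"
	-- A function replacement is inserted literally, so "%" in auto_desc needs no escaping.
	return (val:gsub("%+%+%+", function()
		return auto_desc
	end))
end

print(resolve_desc({}, "[[Birmingham]], a city in England"))
print(resolve_desc({ keydesc = "the former +++" }, "[[Prussia]]"))
```
]=]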
local function normalize_cat_as(cat_as, div)
if type(cat_as) ~= "table" or cat_as.type then
cat_as = {cat_as}
end
local ret_cat_as = {}
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
end
return ret_cat_as
end
-- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where
-- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when
-- categorizing and the preposition to follow.
local function find_placetype_cat_as(divs, pl_placetype)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
if div.type == pl_placetype then
local cat_as = div.cat_as or div.type
return normalize_cat_as(cat_as, div)
end
end
end
return nil
end
-- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]].
insert(handlers, function(label)
for _, canon_label in ipairs { label, lcfirst(label) } do
local group, spec = m_locations.find_canonical_key(canon_label)
if group then
-- wp= defaults to true (Wikipedia article matches location's full placename)
local wp = spec.wp
if wp == nil then
wp = true
end
-- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category
-- generally follow)
local wpcat = spec.wpcat
if wpcat == nil then
wpcat = wp
end
-- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally
-- follows)
local commonscat = spec.commonscat
if commonscat == nil then
commonscat = wpcat
end
local parents = {}
local bare_label_parents = spec.overriding_bare_label_parents
local container_iterator = m_locations.iterate_containers(group, canon_label, spec)
local containers = container_iterator()
if not bare_label_parents then
bare_label_parents = {"+++"}
end
local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label)
local full_container_placename
if containers then
full_container_placename, _ = m_locations.key_to_placename(containers[1].group, containers[1].key)
end
local inserted_containers = false
for _, parent in ipairs(bare_label_parents) do
if parent == "+++" then
parent = "PL_PLACETYPEPREPCONTAINER" -- th: Thai category names are written without spaces between words
end
if parent:find("CONTAINER") then
if not containers then
internal_error("Parent category %s needs the container of %s but no containers specified: %s",
parent, canon_label, spec)
end
local location_type = fetch_primary_placetype(canon_label, spec)
local pl_location_type = m_placetypes.pluralize_placetype(location_type)
for _, container in ipairs(containers) do
local per_container_parent = parent
local cat_as_list
if per_container_parent:find("PL_PLACETYPE") then
if spec.bare_category_parent_type then
cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec)
else
cat_as_list = find_placetype_cat_as(container.spec.divs, pl_location_type) or
find_placetype_cat_as(container.spec.addl_divs, pl_location_type)
end
end
if not cat_as_list then
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category")
if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
internal_error("Unable to locate plural location type %s among the divs or addl_divs " ..
"for container key %s spec %s, and the location type is either not in placetype_data or " ..
"not identified as a generic placetype", pl_location_type, container.key, container.spec)
end
cat_as_list = {{type = pl_location_type, prep =
m_placetypes.get_placetype_entry_preposition(location_type)}}
end
local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec)
per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key)
for _, cat_as in ipairs(cat_as_list) do
local per_container_per_placetype_parent = per_container_parent
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE",
cat_as.type)
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP",
cat_as.prep)
m_table.insertIfNot(parents, per_container_per_placetype_parent)
end
end
inserted_containers = true
else
m_table.insertIfNot(parents, parent)
end
end
if not inserted_containers and containers then
-- If we didn't insert the containers above in some form, insert them now as bare categories. Note that
-- these may be different categories from the container categories inserted above.
for _, container in ipairs(containers) do
m_table.insertIfNot(parents, container.key)
end
end
if spec.addl_parents then
for _, parent in ipairs(spec.addl_parents) do
m_table.insertIfNot(parents, parent)
end
end
local function format_boxval(val, specname)
if val == true then
val = "%l"
end
if type(val) == "string" then
val = gsub_literally(val, "%l", full_location_placename)
val = gsub_literally(val, "%e", elliptical_location_placename)
if val:find("%%c") then
if not full_container_placename then
internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " ..
"containers: %s", specname, val, canon_label, spec)
end
val = gsub_literally(val, "%c", full_container_placename)
end
end
return val
end
local description = spec.fulldesc or (
"{{{langname}}} terms related to the people, culture, or territory of " ..
fetch_or_construct_location_desc(group, canon_label, spec) .. ".")
local full_placename, _ = m_locations.key_to_placename(group, canon_label)
return {
type = "topic",
description = description,
breadcrumb = full_placename,
parents = parents,
wp = format_boxval(wp, "wp"),
wpcat = format_boxval(wpcat, "wpcat"),
commonscat = format_boxval(commonscat, "commonscat"),
}
end
end
end)
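--[=[
Example (illustrative only). The defaulting of `wp`, `wpcat` and `commonscat` and the expansion of the `%l`, `%e` and
`%c` placeholders in the handler above can be sketched in plain Lua as follows. This is a sketch under assumptions:
the helper names and sample placenames are hypothetical, and the placeholders are expanded in a single pass rather
than one after another as the handler does.

```lua
-- Return `val`, or `fallback` when `val` is nil (false is a legitimate value).
local function default_to(val, fallback)
	if val == nil then
		return fallback
	end
	return val
end

-- wp defaults to true, wpcat defaults to wp, commonscat defaults to wpcat.
local function resolve_box_settings(spec)
	local wp = default_to(spec.wp, true)
	local wpcat = default_to(spec.wpcat, wp)
	local commonscat = default_to(spec.commonscat, wpcat)
	return wp, wpcat, commonscat
end

-- Expand %l (full placename), %e (elliptical placename) and %c (container placename).
local function expand(val, full, elliptical, container)
	if val == true then
		val = "%l"
	end
	if type(val) ~= "string" then
		return val
	end
	local map = { l = full, e = elliptical, c = container }
	return (val:gsub("%%([lec])", function(code)
		local replacement = map[code]
		if not replacement then
			error("placeholder %" .. code .. " has no value")
		end
		return replacement
	end))
end

local wp, wpcat, commonscat = resolve_box_settings({ wp = "%e" })
print(expand(wp, "Hokkaido Prefecture", "Hokkaido", "Japan"))
```
]=]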
local function find_canonical_key_from_place(place, canon_label)
local has_the = false
local key
if place:find("^the ") then
key = place:gsub("^the ", "")
has_the = true
else
key = place
end
local group, spec = m_locations.find_canonical_key(key)
if group then
local requires_the = spec.the or false
if has_the ~= requires_the then
if has_the then
mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format(
canon_label))
else
mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"):
format(canon_label))
end
return nil
end
return group, key, spec
end
return nil
end
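--[=[
Example (illustrative only). The article check performed above can be sketched in plain Lua as follows. This is a
sketch under assumptions: `known_places` is a hypothetical stand-in for the lookup done by
`m_locations.find_canonical_key`, and its `the` settings are made up for the example.

```lua
-- Hypothetical location data: `the = true` means the category name must contain "the".
local known_places = {
	["Bahamas"] = { the = true },
	["France"] = {},
}

-- Strip a leading "the ", look up the key, and reject the place if the presence of
-- "the" does not agree with the location's `the` setting.
local function parse_place(place)
	local has_the = false
	local key = place
	if place:find("^the ") then
		key = place:sub(5)
		has_the = true
	end
	local spec = known_places[key]
	if not spec then
		return nil
	end
	local requires_the = spec.the or false
	if has_the ~= requires_the then
		return nil
	end
	return key, spec
end

print(parse_place("the Bahamas"))
print(parse_place("Bahamas"))
```
]=]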
-- Handler for generic placetypes (those whose categories are added through category generation handlers or through
-- explicit category specs in the placetype data) for known locations in [[Module:place/locations]]. All such
-- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or
-- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are
-- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations)
-- "neighbourhoods of Hong Kong" or "places in Melbourne".
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th
end
if placetype then
local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category")
if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
-- Check whether the location uses British spelling, but also check all containers, because
-- it's too hard to keep in sync the `british_spelling` setting for locations at all different
-- levels (e.g. cities of various countries, first and second level administrative division, etc.),
-- so we just set it at top level on the country.
local uses_british_spelling = spec.british_spelling
if uses_british_spelling == nil then
for containers in m_locations.iterate_containers(group, key, spec) do
local must_outer_break = false
for _, container in ipairs(containers) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
local allow_cat = true
if placetype == "neighborhoods" and uses_british_spelling or
placetype == "neighbourhoods" and not uses_british_spelling then
mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format(
placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods"))
allow_cat = false
end
if spec.is_former_place and placetype ~= "สถานที่" then
allow_cat = false
end
local expected_prep
if spec.is_city then
expected_prep = ptdata.generic_before_cities
else
expected_prep = ptdata.generic_before_non_cities
end
if not expected_prep then
allow_cat = false
end
if allow_cat then
if expected_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, expected_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype,
spec.is_city and "city" or "noncity", "return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec)
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = normalized_placetype, sort = key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then
local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype,
function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, {
from_category = true,
no_split_qualifiers = true,
})
if not category_class then
internal_error("Saw placetype %s that is either unknown or has no `class` " ..
"setting in `placetype_data`", normalized_placetype)
end
if class_is_political_division[category_class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
category_class, normalized_placetype)
end
if class_is_political_division[category_class] then
insert(parents, "political divisions of specific countries")
end
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
local container_prep
if container.spec.is_city then
container_prep = ptdata.generic_before_cities
else
container_prep = ptdata.generic_before_non_cities
end
if not container_prep then
internal_error("For container key %s spec %s defines is_city = %s but " ..
"there is no corresponding `generic_before_*` setting in the " ..
"placedata for placetype %s", container.key, container.spec,
container.spec.is_city, placetype)
end
insert(parents, {
name = placetype .. container_prep .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = normalized_placetype, sort = key})
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
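--[=[
Example (illustrative only). The lookup of `british_spelling` on the location itself and then on each level of
containers, as done in the handler above, can be sketched in plain Lua as follows. This is a sketch under
assumptions: container levels are given as a plain list of lists rather than an iterator, each container is a flat
table rather than carrying a `spec` field, and the sample data is hypothetical.

```lua
-- Return the first non-nil value of `name`, checking the location first and then each
-- container level from nearest to farthest. An explicit `false` stops the search.
local function inherited_setting(spec, container_levels, name)
	if spec[name] ~= nil then
		return spec[name]
	end
	for _, containers in ipairs(container_levels) do
		for _, container in ipairs(containers) do
			if container[name] ~= nil then
				return container[name]
			end
		end
	end
	return nil
end

-- Pick the spelling that the category name is expected to use.
local function expected_neighbourhood_spelling(uses_british_spelling)
	return uses_british_spelling and "neighbourhoods" or "neighborhoods"
end

-- Hypothetical container trail: the setting is only present on the country.
local levels = {
	{ { key = "Victoria, Australia" } },
	{ { key = "Australia", british_spelling = true } },
}
local uses_british = inherited_setting({ is_city = true }, levels, "british_spelling")
print(expected_neighbourhood_spelling(uses_british))
```

Returning from inside the nested loops replaces the `must_outer_break` flag; the search order is the same.
]=]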
-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next
-- handler for specific political and misc (non-political) divisions of polities and subpolities, such as
-- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so
-- will trigger an error if that handler runs before this one.
insert(handlers, function(label)
label = lcfirst(label)
local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
-- Make sure we recognize the type of capital.
if place and capital_cat_to_placetype[capital_cat] then
local placetype = capital_cat_to_placetype[capital_cat]
local pl_placetype = m_placetypes.pluralize_placetype(placetype)
-- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the
-- type of capital is among the list.
local group, key, spec = find_canonical_key_from_place(place, label)
if group and (spec.divs or spec.addl_divs) then
local saw_match = false
local variant_matches = {}
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
-- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region'
-- -> 'regional capitals' and 'union territory' -> 'territorial capitals'. When encountering a
-- political division like 'autonomous region' or 'union territory', chop off everything up
-- through a space to make things match. To make this clearer, we record all such
-- "variant match" cases, and down below we insert a note into the category text indicating that
-- such "variant matches" are included in the category.
if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then
saw_match = true
if pl_placetype ~= div.type then
insert(variant_matches, div.type)
end
end
end
end
if saw_match then
-- Everything checks out, construct the category description.
local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype,
placetype.is_city and "city" or "noncity")
if placetype_desc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
if not placetype_desc then
internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " ..
"was found as the placetype of capital placetype %s in label %s", pl_placetype,
placetype, capital_cat, label)
end
local variant_match_text = ""
if variant_matches[1] then
local real_variant_match_descs = {}
for i, variant_match in ipairs(variant_matches) do
local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match,
placetype.is_city and "city" or "noncity")
if variant_match_desc == nil then
internal_error("Unrecognized variant match plural placetype %s, coming from " ..
"place key %s, data %s in label %s", variant_match, key, spec, label)
end
if variant_match_desc then
-- skip those for which the description is `false`, like `ABBREVIATION_OF states`
-- in the United States divs.
insert(real_variant_match_descs, variant_match_desc)
end
end
if real_variant_match_descs[1] then
variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs)
.. ")"
end
end
local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. variant_match_text ..
" of " .. fetch_or_construct_location_desc(group, key, spec) .. "."
local full_placename, _ = m_locations.key_to_placename(group, key)
local parents = {}
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = capital_cat, sort = key})
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = capital_cat, sort = key})
end
end
insert(parents, key)
return {
type = "name",
topic = label,
description = desc,
breadcrumb = full_placename,
parents = parents,
}
end
end
end
end)
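--[=[
Example (illustrative only). The "variant match" test used in the handler above (exact match, or a match after
chopping off everything up through the last space) can be sketched in plain Lua as follows. `matches_division` is a
hypothetical helper, not part of the module.

```lua
-- Return two booleans: whether the plural placetype matches the division type, and
-- whether the match was only a "variant match" on the last word.
local function matches_division(pl_placetype, div_type)
	if pl_placetype == div_type then
		return true, false
	end
	-- "^.* " is greedy, so this removes everything up to and including the last space.
	local last_word = div_type:gsub("^.* ", "")
	if pl_placetype == last_word then
		return true, true
	end
	return false, false
end

print(matches_division("regions", "autonomous regions"))
print(matches_division("territories", "union territories"))
print(matches_division("states", "states"))
```
]=]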
local overriding_category_descriptions = {
["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]",
["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]",
["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]",
["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s",
}
-- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.),
-- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil",
-- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of
-- locations, which are handled by different handlers above.
insert(handlers, function(label)
-- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial
-- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]].
for _, canon_label in ipairs { label, lcfirst(label) } do
for _, minimal_placetype in ipairs { true, false } do
local match_quantifier = minimal_placetype and "-" or "+"
-- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy
-- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`)
-- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype
-- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the
-- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently
-- only `abbreviations of states` occurs with a following location).
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$")
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ใน)(.*)$")
end
if placetype then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
local function find_placetype(divs)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
if placetype == pt_cat_as.type then
local div_parent = pt_cat_as.container_parent_type
if div_parent == nil then -- allow false
div_parent = div.container_parent_type
end
if div_parent == nil then
div_parent = placetype
end
return div_parent, pt_cat_as.prep or div.prep or "ของ"
end
end
end
end
return nil
end
local div_parent, div_prep = find_placetype(spec.divs)
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs)
end
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization)
end
if div_parent ~= nil then
if div_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, div_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity",
"return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = overriding_category_descriptions[canon_label]
if not desc then
desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th
end
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if div_parent then -- div_parent may be `false`
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = placetype, sort = " " .. key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th
insert(parents, "political divisions of specific countries")
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = placetype, sort = " " .. key})
end
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
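--[=[
Example (illustrative only). The effect of trying both a non-greedy and a greedy match on labels containing two
"of"s, as described in the handler above, can be sketched in plain Lua as follows. This is a sketch under
assumptions: it uses English labels with " of " and plain `string.match`, whereas the handler matches the Thai
prepositions with `mw.ustring.match`; `split_label` is a hypothetical helper.

```lua
-- Split a label into placetype and place around " of ", using either the shortest
-- (minimal = true) or the longest (minimal = false) possible placetype.
local function split_label(label, minimal)
	local quantifier = minimal and "-" or "+"
	return label:match("^([A-Za-z%- ]" .. quantifier .. ") of (.*)$")
end

local congo = "provinces of the Democratic Republic of the Congo"
local abbrevs = "abbreviations of states of the United States"

print(split_label(congo, true))    -- non-greedy gives the intended split here
print(split_label(abbrevs, false)) -- greedy gives the intended split here
```

Neither quantifier is right for both labels, which is why the handler tries each in turn and keeps the first split whose place is a known location.
]=]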
labels["exonyms"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} [[exonym]]s.",
parents = {"สถานที่"},
}
labels["political divisions of specific countries"] = {
type = "grouping",
description = "{{{langname}}} categories for political divisions of specific countries.",
parents = {"สถานที่"},
}
-- Misc. FIXME: Remove the need for this.
labels["nomes of Ancient Egypt"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].",
breadcrumb = "nomes",
parents = {"อียิปต์โบราณ"},
}
-- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed.
labels["มหาสมุทรแอตแลนติก"] = {
type = "related-to",
description = "default with the",
parents = {"โลก"},
}
labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"]
labels["British Isles"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands",
parents = {"ยุโรป", "เกาะ"},
}
labels["สหภาพยุโรป"] = {
type = "related-to",
description = "default with the",
parents = {"ยุโรป"},
}
labels["European Union"] = labels["สหภาพยุโรป"]
labels["Gascony"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Indian subcontinent"] = {
type = "related-to",
description = "default with the",
parents = {"เอเชียใต้"},
}
labels["Bengal"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir, India"] = {
type = "related-to",
description = "{{{langname}}} names of places in {{w|Kashmir, India}}.",
parents = {"อินเดีย", "Kashmir"},
}
labels["เกาหลี"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Korea]]",
parents = {"เอเชีย"},
}
labels["Korea"] = labels["เกาหลี"]
labels["Languedoc"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Lapland"] = {
type = "related-to",
description = "=[[Lapland]], a region in northernmost Europe",
parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"},
}
labels["ตะวันออกกลาง"] = {
type = "related-to",
description = "default with the",
parents = {"แอฟริกา", "เอเชีย"},
}
labels["Middle East"] = labels["ตะวันออกกลาง"]
labels["Netherlands Antilles"] = {
type = "related-to",
description = "=the people, culture, or territory of the [[Netherlands Antilles]]",
parents = {"เนเธอร์แลนด์", "อเมริกาเหนือ"},
}
labels["Provence"] = {
type = "related-to",
description = "default",
parents = {"Provence-Alpes-Côte d'Azur, France"},
}
labels["เอเชียใต้"] = {
type = "related-to",
description = "default",
parents = {"ยูเรเชีย", "เอเชีย"},
}
labels["South Asia"] = labels["เอเชียใต้"]
return {LABELS = labels, HANDLERS = handlers}
9fu8gl0wabsmxyswonlqjr6p0dt4ghr
5720687
5720685
2026-04-21T01:23:06Z
OctraBot
3198
5720687
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_table = require("Module:table")
local en_utilities_module = "Module:en-utilities"
local string_utilities_module = "Module:string utilities"
local m_locations = require("Module:place/locations")
local m_placetypes = require("Module:place/placetypes")
local placetype_data = m_placetypes.placetype_data
local internal_error = m_locations.internal_error
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local is_callable = require("Module:fun").is_callable
--[==[ intro:
This module is part of the category tree code and contains code to generate the descriptions of place-related categories
such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]],
[[Category:pt:Municipalities of Tocantins, Brazil]], etc. Note that this module doesn't actually create the
categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This
process should automatically happen periodically for non-empty categories, because they will appear in
[[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.)
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table,
keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the
description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list,
which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that
label on-the-fly.
See [[Module:place]] for an introduction to the terminology associated with places, a list of all the relevant
modules, and more specific information on types of toponyms and placetypes and how their categorization
works.
]==]
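--[=[
Example (illustrative only). The two ways of specifying a category description mentioned above (a manual entry in
`labels`, or a handler that recognizes a family of labels) can be sketched in plain Lua as follows. This is a sketch
under assumptions: `find_spec` is hypothetical, and checking `labels` before the handlers is an assumption about how
the calling category tree code consults this module. What the sketch does reflect is that handlers are tried in
insertion order, which is why some handlers in this module must precede others.

```lua
local labels = {}
local handlers = {}

-- (1) A manually specified label.
labels["exonyms"] = {
	type = "name",
	description = "{{{langname}}} [[exonym]]s.",
}

-- (2) A handler that recognizes a whole family of labels and builds the spec on the fly.
table.insert(handlers, function(label)
	local placetype, place = label:match("^(%a+) in (.+)$")
	if placetype then
		return {
			type = "name",
			description = "{{{langname}}} names of " .. placetype .. " in [[" .. place .. "]].",
		}
	end
end)

-- Look up a label: manual entries first (assumed), then each handler in order.
local function find_spec(label)
	local spec = labels[label]
	if spec then
		return spec
	end
	for _, handler in ipairs(handlers) do
		spec = handler(label)
		if spec then
			return spec
		end
	end
	return nil
end

print(find_spec("cities in France").description)
```
]=]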
local function lcfirst(label)
return mw.getContentLanguage():lcfirst(label)
end
local function gsub_literally(str, from, to)
local m_strutils = require(string_utilities_module)
return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to)))
end
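--[=[
Example (illustrative only). `gsub_literally` relies on two escaping helpers from [[Module:string utilities]]. What
such helpers need to do can be sketched in plain Lua as follows. This is a sketch under assumptions: the bodies of
`pattern_escape` and `replacement_escape` below are stand-ins written for this example, not copies of the real
module's implementations.

```lua
-- Escape every Lua pattern magic character so the string matches itself literally.
local function pattern_escape(s)
	return (s:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Escape "%" so the string is inserted literally when used as a gsub replacement.
local function replacement_escape(s)
	return (s:gsub("%%", "%%%%"))
end

-- Replace every literal occurrence of `from` in `str` with the literal string `to`.
local function gsub_literally(str, from, to)
	return (str:gsub(pattern_escape(from), replacement_escape(to)))
end

print(gsub_literally("terms related to +++.", "+++", "[[Korea]]"))
print(gsub_literally("%l (%c)", "%c", "Japan"))
```

Without the escaping, `+++` would be read as pattern quantifiers and a `%` in the replacement as a capture reference.
]=]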
-- Do not translate the class names (the keys of this table).
local class_to_bare_category_parent = {
["polity"] = "องค์การทางการเมือง",
["subpolity"] = "political divisions",
["settlement"] = "การตั้งถิ่นฐาน",
["non-admin settlement"] = "การตั้งถิ่นฐาน",
["capital"] = "เมืองหลวง",
["natural feature"] = "natural features",
["man-made structure"] = "man-made structures",
["geographic region"] = "geographic and cultural areas",
}
-- Do not translate the class names (the keys of this table).
local class_is_political_division = {
["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity
["subpolity"] = true,
["settlement"] = true,
["non-admin settlement"] = false,
["capital"] = true,
["natural feature"] = false,
["man-made structure"] = false,
["geographic region"] = false,
["generic place"] = false,
}
local capital_cat_to_placetype = {}
for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do
capital_cat_to_placetype[capital_cat] = placetype
end
-- Handler for bare categories for all types of capitals. This needs to precede the handler for bare placetype
-- categories as some of the types of capitals exist as placetypes as well.
insert(handlers, function(label)
label = lcfirst(label)
local capital_placetype = capital_cat_to_placetype[label]
if capital_placetype then
local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype)
local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level")
if linkdesc == nil then
internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label)
end
if linkdesc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
return {
type = "name",
topic = label,
description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".",
parents = {"เมืองหลวง"},
}
end
end)
-- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various
-- so-called "generic" placetypes, but sometimes the categories were wrong.
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full")
if ptdesc then
local from_category_props = {
from_category = true,
no_split_qualifiers = true,
}
local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent")
if bare_category_parent then
return bare_category_parent
end
local class = m_placetypes.get_placetype_prop(pt, "class")
if class then
if class_to_bare_category_parent[class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
class, canon_label)
end
return class_to_bare_category_parent[class]
end
end, from_category_props)
if not bare_category_parent then
internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " ..
"directly or through a fallback", canon_label)
end
local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents")
end, from_category_props)
local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb")
end, from_category_props)
if type(bare_category_parent) == "string" and bare_category_breadcrumb then
bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb}
end
local parents = {bare_category_parent}
if addl_bare_category_parents then
m_table.extend(parents, addl_bare_category_parents)
end
return {
type = "name",
topic = canon_label,
description = "{{{langname}}} " .. ptdesc .. ".",
breadcrumb = bare_category_breadcrumb,
parents = parents,
}
elseif ptdesc == false then
mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label)))
end
end
end)
local function fetch_primary_placetype(key, spec)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if not placetype then
internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec)
end
return placetype
end
--[==[
Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "` if
appropriate. Specifically:
Fetch the full and elliptical placenames. If they are the same, just link to the placename directly. Otherwise, check if
the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as
the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full
placename. That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries,
but if only one exists, we link to that one rather than have a red link.
]==]
local function construct_linked_location(group, key, spec)
local full_placename, elliptical_placename = m_locations.key_to_placename(group, key)
local linked_placename
if elliptical_placename ~= full_placename then
local full_placename_title = mw.title.new(full_placename)
if full_placename_title and full_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, full_placename)
else
local elliptical_placename_title = mw.title.new(elliptical_placename)
if elliptical_placename_title and elliptical_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename)
end
end
end
return linked_placename or m_locations.construct_linked_placename(spec, full_placename)
end
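-- Illustration only (comments, not executed): the link-selection rule of `construct_linked_location()` restated as
-- a pure function. The name `pick_link_target` and the boolean parameters are hypothetical; the booleans stand in
-- for the `mw.title.new(...).exists` checks made above.
--
--   local function pick_link_target(full, elliptical, full_exists, elliptical_exists)
--       if elliptical ~= full and not full_exists and elliptical_exists then
--           return elliptical, full -- link target, display text
--       end
--       return full -- same names, full entry exists, or neither entry exists
--   end
--
--   pick_link_target("Full Name", "Full Name", false, false) --> "Full Name"
--   pick_link_target("Full Name", "Short", true, true)       --> "Full Name"
--   pick_link_target("Full Name", "Short", false, true)      --> "Short", "Full Name"
--   pick_link_target("Full Name", "Short", false, false)     --> "Full Name"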
--[==[
Construct the description of a location, including its container trail either to the end or until we encounter a
`no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read
`"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a
[[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should
adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the
[[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in
[[Europe]]"`.
]==]
local function construct_location_desc(group, key, spec)
local parts = {}
local function ins(txt)
insert(parts, txt)
end
ins(construct_linked_location(group, key, spec))
local iteration = 0
local need_closing_paren = false
local containers = {{group = group, key = key, spec = spec}}
local container_iterator = m_locations.iterate_containers(group, key, spec)
while true do
iteration = iteration + 1
local include_container_in_desc = false
for _, container in ipairs(containers) do
if not container.spec.no_include_container_in_desc then
include_container_in_desc = true
break
end
end
if not include_container_in_desc then
break
end
local next_containers = container_iterator()
if not next_containers then
break
end
local is_former = nil
for _, container in ipairs(containers) do
local this_is_former = container.spec.is_former_place
if is_former == nil then
is_former = this_is_former
elseif is_former ~= this_is_former then
internal_error("When processing container trail of key %s, found a mixture of former and non-former " ..
"containers: %s", key, containers)
end
end
if #containers > 1 then
local placetypes = {}
local prepositions = {}
for _, container in ipairs(containers) do
local container_type = fetch_primary_placetype(container.key, container.spec)
m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type))
m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type))
end
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which are ")
need_closing_paren = true
else
ins(", which are ")
end
if is_former then
ins("former ")
end
ins(m_table.serialCommaJoin(placetypes))
ins(" ")
ins(concat(prepositions, "/"))
else
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which is ")
need_closing_paren = true
else
ins(", which is ")
end
local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec)
if is_former then
ins("a former ")
else
ins(m_placetypes.get_placetype_article(container_type))
ins(" ")
end
ins(container_type)
ins(" ")
ins(m_placetypes.get_placetype_entry_preposition(container_type))
end
ins(" ")
containers = next_containers
local container_locations = {}
for _, container in ipairs(containers) do
insert(container_locations, construct_linked_location(container.group, container.key,
container.spec))
end
ins(m_table.serialCommaJoin(container_locations))
end
if need_closing_paren then
ins(")")
end
return concat(parts)
end
-- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified,
-- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which
-- mentions the placename corresponding to the key, its placetype and container, and repeats the description up
-- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc`
-- setting is found (which is set on all continents and continent-level regions).
local function fetch_or_construct_location_desc(group, key, spec)
local val = spec.keydesc
if is_callable(val) then
val = val(group, key, spec)
spec.keydesc = val
end
val = val or "+++"
if val:find("%+%+%+") then
val = gsub_literally(val, "+++", construct_location_desc(group, key, spec))
end
return val
end
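-- Illustration only (comments, not executed): how `fetch_or_construct_location_desc()` treats `keydesc`. `AUTO`
-- stands for whatever `construct_location_desc(group, key, spec)` returns; the `keydesc` strings are hypothetical.
--
--   spec.keydesc = nil                    --> AUTO
--   spec.keydesc = "the island of +++"    --> "the island of AUTO"
--   spec.keydesc = "a fixed description"  --> "a fixed description" (no `+++`, so returned unchanged)
--   spec.keydesc = function(group, key, spec) return "the island of +++" end
--                                         --> the function is called, its result is stored back into
--                                             `spec.keydesc`, and the stored string is expanded as above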
local function normalize_cat_as(cat_as, div)
if type(cat_as) ~= "table" or cat_as.type then
cat_as = {cat_as}
end
local ret_cat_as = {}
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
end
return ret_cat_as
end
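-- Illustration only (comments, not executed): inputs accepted by `normalize_cat_as()` and the list each one is
-- normalized to. The placetype strings are placeholders rather than entries taken from the location data.
--
--   normalize_cat_as("provinces", {})
--   --> {{type = "provinces", prep = "ของ"}}
--
--   normalize_cat_as({type = "counties", prep = "ใน"}, {prep = "ของ"})
--   --> {{type = "counties", prep = "ใน"}}
--
--   normalize_cat_as({"districts", {type = "municipalities"}}, {prep = "ใน"})
--   --> {{type = "districts", prep = "ใน"}, {type = "municipalities", prep = "ใน"}}
--
-- The preposition is taken from the individual item first, then from `div.prep`, and finally defaults to "ของ".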
-- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where
-- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when
-- categorizing and the preposition to follow.
local function find_placetype_cat_as(divs, pl_placetype)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
if div.type == pl_placetype then
local cat_as = div.cat_as or div.type
return normalize_cat_as(cat_as, div)
end
end
end
return nil
end
-- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]].
insert(handlers, function(label)
for _, canon_label in ipairs { label, lcfirst(label) } do
local group, spec = m_locations.find_canonical_key(canon_label)
if group then
-- wp= defaults to true (Wikipedia article matches location's full placename)
local wp = spec.wp
if wp == nil then
wp = true
end
-- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category
-- generally follow)
local wpcat = spec.wpcat
if wpcat == nil then
wpcat = wp
end
-- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally
-- follows)
local commonscat = spec.commonscat
if commonscat == nil then
commonscat = wpcat
end
local parents = {}
local bare_label_parents = spec.overriding_bare_label_parents
local container_iterator = m_locations.iterate_containers(group, canon_label, spec)
local containers = container_iterator()
if not bare_label_parents then
bare_label_parents = {"+++"}
end
local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label)
local full_container_placename
if containers then
full_container_placename = m_locations.key_to_placename(containers[1].group, containers[1].key)
end
local inserted_containers = false
for _, parent in ipairs(bare_label_parents) do
if parent == "+++" then
parent = "PL_PLACETYPEPREPCONTAINER" -- Thai category names are written without spaces between the parts
end
if parent:find("CONTAINER") then
if not containers then
internal_error("Parent category %s needs the container of %s but no containers specified: %s",
parent, canon_label, spec)
end
local location_type = fetch_primary_placetype(canon_label, spec)
local pl_location_type = m_placetypes.pluralize_placetype(location_type)
for _, container in ipairs(containers) do
local per_container_parent = parent
local cat_as_list
if per_container_parent:find("PL_PLACETYPE") then
if spec.bare_category_parent_type then
cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec)
else
cat_as_list = find_placetype_cat_as(container.spec.divs, pl_location_type) or
find_placetype_cat_as(container.spec.addl_divs, pl_location_type)
end
end
if not cat_as_list then
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category")
if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
internal_error("Unable to locate plural location type %s among the divs or addl_divs " ..
"for container key %s spec %s, and the location type is either not in placetype_data or " ..
"not identified as a generic placetype", pl_location_type, container.key, container.spec)
end
cat_as_list = {{type = pl_location_type, prep =
m_placetypes.get_placetype_entry_preposition(location_type)}}
end
local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec)
per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key)
for _, cat_as in ipairs(cat_as_list) do
local per_container_per_placetype_parent = per_container_parent
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE",
cat_as.type)
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP",
cat_as.prep)
m_table.insertIfNot(parents, per_container_per_placetype_parent)
end
end
inserted_containers = true
else
m_table.insertIfNot(parents, parent)
end
end
if not inserted_containers and containers then
-- If we didn't insert the containers above in some form, insert them now as bare categories. Note that
-- these may be different categories from the container categories inserted above.
for _, container in ipairs(containers) do
m_table.insertIfNot(parents, container.key)
end
end
if spec.addl_parents then
for _, parent in ipairs(spec.addl_parents) do
m_table.insertIfNot(parents, parent)
end
end
local function format_boxval(val, specname)
if val == true then
val = "%l"
end
if type(val) == "string" then
val = gsub_literally(val, "%l", full_location_placename)
val = gsub_literally(val, "%e", elliptical_location_placename)
if val:find("%%c") then
if not full_container_placename then
internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " ..
"containers: %s", specname, val, canon_label, spec)
end
val = gsub_literally(val, "%c", full_container_placename)
end
end
return val
end
local description = spec.fulldesc or (
"{{{langname}}} terms related to the people, culture, or territory of " ..
fetch_or_construct_location_desc(group, canon_label, spec) .. ".")
local full_placename, _ = m_locations.key_to_placename(group, canon_label)
return {
type = "topic",
description = description,
breadcrumb = full_placename,
parents = parents,
wp = format_boxval(wp, "wp"),
wpcat = format_boxval(wpcat, "wpcat"),
commonscat = format_boxval(commonscat, "commonscat"),
}
end
end
end)
local function find_canonical_key_from_place(place, canon_label)
local has_the = false
local key
if place:find("^the ") then
key = place:gsub("^the ", "")
has_the = true
else
key = place
end
local group, spec = m_locations.find_canonical_key(key)
if group then
local requires_the = spec.the or false
if has_the ~= requires_the then
if has_the then
mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format(
canon_label))
else
mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"):
format(canon_label))
end
return nil
end
return group, key, spec
end
return nil
end
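-- Illustration only (comments, not executed): expected results of `find_canonical_key_from_place()`, assuming that
-- `m_locations.find_canonical_key()` knows a key `Bahamas` whose spec sets `the = true` and a key `France` whose
-- spec does not. Both keys and their settings are assumptions about the data in [[Module:place/locations]].
--
--   find_canonical_key_from_place("the Bahamas", label) --> group, "Bahamas", spec
--   find_canonical_key_from_place("Bahamas", label)     --> nil (logs that 'the' is missing)
--   find_canonical_key_from_place("France", label)      --> group, "France", spec
--   find_canonical_key_from_place("the France", label)  --> nil (logs that 'the' should not be there)
--   find_canonical_key_from_place("Nowhere", label)     --> nil (unknown key; nothing is logged)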
-- Handler for generic placetypes (those whose categories are added through category generation handlers or through
-- explicit category specs in the placetype data) for known locations in [[Module:place/locations]]. All such
-- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or
-- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are
-- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations)
-- "neighbourhoods of Hong Kong" or "places in Melbourne".
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th
end
if placetype then
local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category")
if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
-- Check whether the location uses British spelling, but also check all containers, because
-- it's too hard to keep in sync the `british_spelling` setting for locations at all different
-- levels (e.g. cities of various countries, first and second level administrative division, etc.),
-- so we just set it at top level on the country.
local uses_british_spelling = spec.british_spelling
if uses_british_spelling == nil then
for containers in m_locations.iterate_containers(group, key, spec) do
local must_outer_break = false
for _, container in ipairs(containers) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
local allow_cat = true
if placetype == "neighborhoods" and uses_british_spelling or
placetype == "neighbourhoods" and not uses_british_spelling then
mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format(
placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods"))
allow_cat = false
end
if spec.is_former_place and placetype ~= "สถานที่" then
allow_cat = false
end
local expected_prep
if spec.is_city then
expected_prep = ptdata.generic_before_cities
else
expected_prep = ptdata.generic_before_non_cities
end
if not expected_prep then
allow_cat = false
end
if allow_cat then
if expected_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, expected_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype,
spec.is_city and "city" or "noncity", "return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec)
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = normalized_placetype, sort = key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then
local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype,
function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, {
from_category = true,
no_split_qualifiers = true,
})
if not category_class then
internal_error("Saw placetype %s that is either unknown or has no `class` " ..
"setting in `placetype_data`", normalized_placetype)
end
if class_is_political_division[category_class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
category_class, normalized_placetype)
end
if class_is_political_division[category_class] then
insert(parents, "political divisions of specific countries")
end
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
local container_prep
if container.spec.is_city then
container_prep = ptdata.generic_before_cities
else
container_prep = ptdata.generic_before_non_cities
end
if not container_prep then
internal_error("For container key %s spec %s defines is_city = %s but " ..
"there is no corresponding `generic_before_*` setting in the " ..
"placedata for placetype %s", container.key, container.spec,
container.spec.is_city, placetype)
end
insert(parents, {
name = placetype .. container_prep .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = normalized_placetype, sort = key})
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
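-- Illustration only (comments, not executed): how the handler above is expected to split a label with its
-- non-greedy pattern. The label is a made-up example and need not correspond to an existing category.
--
--   mw.ustring.match("เมืองในประเทศไทย", "^([A-Za-zก-๛%- ]-)(ใน)(.*)$")
--   --> "เมือง", "ใน", "ประเทศไทย"
--
-- Because the first capture is non-greedy, the split happens at the first "ใน" in the label. The "ของ" pattern is
-- tried only when the "ใน" pattern does not match at all.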
-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next
-- handler for specific political and misc (non-political) divisions of polities and subpolities, such as
-- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so
-- will trigger an error if that handler runs before this one.
insert(handlers, function(label)
label = lcfirst(label)
local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
-- Make sure we recognize the type of capital.
if place and capital_cat_to_placetype[capital_cat] then
local placetype = capital_cat_to_placetype[capital_cat]
local pl_placetype = m_placetypes.pluralize_placetype(placetype)
-- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the
-- type of capital is among the list.
local group, key, spec = find_canonical_key_from_place(place, label)
if group and (spec.divs or spec.addl_divs) then
local saw_match = false
local variant_matches = {}
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
-- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region'
-- -> 'regional capitals' and 'union territory' -> 'territorial capitals'. When encountering a
-- political division like 'autonomous region' or 'union territory', chop off everything up
-- through a space to make things match. To make this clearer, we record all such
-- "variant match" cases, and down below we insert a note into the category text indicating that
-- such "variant matches" are included in the category.
if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then
saw_match = true
if pl_placetype ~= div.type then
insert(variant_matches, div.type)
end
end
end
end
if saw_match then
-- Everything checks out, construct the category description.
local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype,
spec.is_city and "city" or "noncity")
if placetype_desc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
if not placetype_desc then
internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " ..
"was found as the placetype of capital placetype %s in label %s", pl_placetype,
placetype, capital_cat, label)
end
local variant_match_text = ""
if variant_matches[1] then
local real_variant_match_descs = {}
for i, variant_match in ipairs(variant_matches) do
local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match,
spec.is_city and "city" or "noncity")
if variant_match_desc == nil then
internal_error("Unrecognized variant match plural placetype %s, coming from " ..
"place key %s, data %s in label %s", variant_match, key, spec, label)
end
if variant_match_desc then
-- skip those for which the description is `false`, like `ABBREVIATION_OF states`
-- in the United States divs.
insert(real_variant_match_descs, variant_match_desc)
end
end
if real_variant_match_descs[1] then
variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs)
.. ")"
end
end
local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. variant_match_text ..
" of " .. fetch_or_construct_location_desc(group, key, spec) .. "."
local full_placename, _ = m_locations.key_to_placename(group, key)
local parents = {}
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = capital_cat, sort = key})
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = capital_cat, sort = key})
end
end
insert(parents, key)
return {
type = "name",
topic = label,
description = desc,
breadcrumb = full_placename,
parents = parents,
}
end
end
end
end)
local overriding_category_descriptions = {
["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]",
["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]",
["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]",
["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s",
}
-- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.),
-- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil",
-- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of
-- locations, which are handled by different handlers above.
insert(handlers, function(label)
-- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial
-- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]].
for _, canon_label in ipairs { label, lcfirst(label) } do
for _, minimal_placetype in ipairs { true, false } do
local match_quantifier = minimal_placetype and "-" or "+"
-- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy
-- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`)
-- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype
-- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the
-- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently
-- only `abbreviations of states` occurs with a following location).
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$")
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ใน)(.*)$")
end
if placetype then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
local function find_placetype(divs)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
if placetype == pt_cat_as.type then
local div_parent = pt_cat_as.container_parent_type
if div_parent == nil then -- allow false
div_parent = div.container_parent_type
end
if div_parent == nil then
div_parent = placetype
end
return div_parent, pt_cat_as.prep or div.prep or "ของ"
end
end
end
end
return nil
end
local div_parent, div_prep = find_placetype(spec.divs)
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs)
end
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization)
end
if div_parent ~= nil then
if div_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, div_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity",
"return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = overriding_category_descriptions[canon_label]
if not desc then
desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th
end
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if div_parent then -- div_parent may be `false`
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = placetype, sort = " " .. key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th
insert(parents, "political divisions of specific countries")
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = placetype, sort = " " .. key})
end
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
labels["exonyms"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} [[exonym]]s.",
parents = {"สถานที่"},
}
labels["political divisions of specific countries"] = {
type = "grouping",
description = "{{{langname}}} categories for political divisions of specific countries.",
parents = {"สถานที่"},
}
-- Misc. FIXME: Remove the need for this.
labels["nomes of Ancient Egypt"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].",
breadcrumb = "nomes",
parents = {"อียิปต์โบราณ"},
}
-- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed.
labels["มหาสมุทรแอตแลนติก"] = {
type = "related-to",
description = "default with the",
parents = {"โลก"},
}
labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"]
labels["British Isles"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands",
parents = {"ยุโรป", "เกาะ"},
}
labels["สหภาพยุโรป"] = {
type = "related-to",
description = "default with the",
parents = {"ยุโรป"},
}
labels["European Union"] = labels["สหภาพยุโรป"]
labels["Gascony"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Indian subcontinent"] = {
type = "related-to",
description = "default with the",
parents = {"เอเชียใต้"},
}
labels["Bengal"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir, India"] = {
type = "related-to",
description = "{{{langname}}} names of places in {{w|Kashmir, India}}.",
parents = {"อินเดีย", "Kashmir"},
}
labels["เกาหลี"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Korea]]",
parents = {"เอเชีย"},
}
labels["Korea"] = labels["เกาหลี"]
labels["Languedoc"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Lapland"] = {
type = "related-to",
description = "=[[Lapland]], a region in northernmost Europe",
parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"},
}
labels["ตะวันออกกลาง"] = {
type = "related-to",
description = "default with the",
parents = {"แอฟริกา", "เอเชีย"},
}
labels["Middle East"] = labels["ตะวันออกกลาง"]
labels["Netherlands Antilles"] = {
type = "related-to",
description = "=the people, culture, or territory of the [[Netherlands Antilles]]",
parents = {"เนเธอร์แลนด์", "อเมริกาเหนือ"},
}
labels["Provence"] = {
type = "related-to",
description = "default",
parents = {"Provence-Alpes-Côte d'Azur, France"},
}
labels["เอเชียใต้"] = {
type = "related-to",
description = "default",
parents = {"ยูเรเชีย", "เอเชีย"},
}
labels["South Asia"] = labels["เอเชียใต้"]
return {LABELS = labels, HANDLERS = handlers}
e5bhvs95rhqhsqvpd8l0ytl1hzaz8zt
5720702
5720687
2026-04-21T01:53:18Z
OctraBot
3198
5720702
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_table = require("Module:table")
local en_utilities_module = "Module:en-utilities"
local string_utilities_module = "Module:string utilities"
local m_locations = require("Module:place/locations")
local m_placetypes = require("Module:place/placetypes")
local placetype_data = m_placetypes.placetype_data
local internal_error = m_locations.internal_error
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local is_callable = require("Module:fun").is_callable
--[==[ intro:
This module is part of the category tree code and contains code to generate the descriptions of place-related categories
(such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]],
[[Category:pt:Municipalities of Tocantins, Brazil]], etc.). Note that this module doesn't actually create the
categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This
process should automatically happen periodically for non-empty categories, because they will appear in
[[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.)
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table,
keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the
description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list,
which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that
label on-the-fly.
See [[Module:place]] for an introduction to the terminology associated with places, a list of all the relevant
modules, and more specific information on types of toponyms and placetypes and how their categorization
works.
]==]
local function lcfirst(label)
return mw.getContentLanguage():lcfirst(label)
end
local function gsub_literally(str, from, to)
local m_strutils = require(string_utilities_module)
return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to)))
end
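--[=[
Example (illustrative sketch, not part of the module): why `gsub_literally` escapes both arguments. The two escape
helpers below are simplified stand-ins written for this example; the real ones live in [[Module:string utilities]] and
may differ in detail.

```lua
-- Hypothetical stand-in: prefix every magic pattern character with "%".
local function pattern_escape(s)
	return (s:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Hypothetical stand-in: only "%" is special in a gsub replacement string.
local function replacement_escape(s)
	return (s:gsub("%%", "%%%%"))
end

local function gsub_literally(str, from, to)
	-- The outer parentheses drop gsub's second return value (the match count).
	return (str:gsub(pattern_escape(from), replacement_escape(to)))
end

-- Without escaping, "%l" would be read as the "lowercase letter" class.
print(gsub_literally("%l, %c", "%l", "Hokkaido"))
```

With the escaping in place, only the literal two-character sequence `%l` is replaced and `%c` is left alone.
]=]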
-- Do not translate the class names (the keys below).
local class_to_bare_category_parent = {
["polity"] = "องค์การทางการเมือง",
["subpolity"] = "political divisions",
["settlement"] = "การตั้งถิ่นฐาน",
["non-admin settlement"] = "การตั้งถิ่นฐาน",
["capital"] = "เมืองหลวง",
["natural feature"] = "natural features",
["man-made structure"] = "man-made structures",
["geographic region"] = "geographic and cultural areas",
}
-- Do not translate the class names (the keys below).
local class_is_political_division = {
["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity
["subpolity"] = true,
["settlement"] = true,
["non-admin settlement"] = false,
["capital"] = true,
["natural feature"] = false,
["man-made structure"] = false,
["geographic region"] = false,
["generic place"] = false,
}
local capital_cat_to_placetype = {}
for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do
capital_cat_to_placetype[capital_cat] = placetype
end
-- Handler for bare categories for all types of capitals. This needs to precede the handler for bare placetype
-- categories as some of the types of capitals exist as placetypes as well.
insert(handlers, function(label)
label = lcfirst(label)
local capital_placetype = capital_cat_to_placetype[label]
if capital_placetype then
local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype)
local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level")
if linkdesc == nil then
internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label)
end
if linkdesc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
return {
type = "name",
topic = label,
description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".",
parents = {"เมืองหลวง"},
}
end
end)
-- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various
-- so-called "generic" placetypes, but sometimes the categories were wrong.
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full")
if ptdesc then
local from_category_props = {
from_category = true,
no_split_qualifiers = true,
}
local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent")
if bare_category_parent then
return bare_category_parent
end
local class = m_placetypes.get_placetype_prop(pt, "class")
if class then
if class_to_bare_category_parent[class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
class, canon_label)
end
return class_to_bare_category_parent[class]
end
end, from_category_props)
if not bare_category_parent then
internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " ..
"directly or through a fallback", canon_label)
end
local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents")
end, from_category_props)
local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb")
end, from_category_props)
if type(bare_category_parent) == "string" and bare_category_breadcrumb then
bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb}
end
local parents = {bare_category_parent}
if addl_bare_category_parents then
m_table.extend(parents, addl_bare_category_parents)
end
return {
type = "name",
topic = canon_label,
description = "{{{langname}}} " .. ptdesc .. ".",
breadcrumb = bare_category_breadcrumb,
parents = parents,
}
elseif ptdesc == false then
mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label)))
end
end
end)
local function fetch_primary_placetype(key, spec)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if not placetype then
internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec)
end
return placetype
end
--[==[
Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "` if
appropriate. Specifically:
Fetch the full and elliptical placenames. If they are the same, just link to the placename directly. Otherwise, check if
the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as
the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full
placename. That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries,
but if only one exists, we link to that one rather than have a red link.
]==]
local function construct_linked_location(group, key, spec)
local full_placename, elliptical_placename = m_locations.key_to_placename(group, key)
local linked_placename
if elliptical_placename ~= full_placename then
local full_placename_title = mw.title.new(full_placename)
if full_placename_title and full_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, full_placename)
else
local elliptical_placename_title = mw.title.new(elliptical_placename)
if elliptical_placename_title and elliptical_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename)
end
end
end
return linked_placename or m_locations.construct_linked_placename(spec, full_placename)
end
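--[=[
Example (illustrative sketch, not part of the module): the link preference order used by `construct_linked_location`.
The `exists` callback is a hypothetical stand-in for `mw.title.new(name).exists`, the wikilink format is an assumption
(the real formatting is done by `construct_linked_placename`), and the placename pair is only an assumed example.

```lua
local function choose_link(full, elliptical, exists)
	if elliptical ~= full then
		if exists(full) then
			return "[[" .. full .. "]]"
		elseif exists(elliptical) then
			-- Link to the entry that exists, but display the full placename.
			return "[[" .. elliptical .. "|" .. full .. "]]"
		end
	end
	-- Identical names, or neither entry exists: link to the full placename.
	return "[[" .. full .. "]]"
end

local entries = {["Hokkaido"] = true}
local function exists(name)
	return entries[name] == true
end

print(choose_link("Hokkaido Prefecture", "Hokkaido", exists))
```

The full placename wins whenever both or neither of the entries exist, so a red link appears only when no entry exists
under either name.
]=]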
--[==[
Construct the description of a location, including its container trail either to the end or until we encounter a
`no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read
`"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a
[[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should
adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the
[[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in
[[Europe]]"`.
]==]
local function construct_location_desc(group, key, spec)
local parts = {}
local function ins(txt)
insert(parts, txt)
end
ins(construct_linked_location(group, key, spec))
local iteration = 0
local need_closing_paren = false
local containers = {{group = group, key = key, spec = spec}}
local container_iterator = m_locations.iterate_containers(group, key, spec)
while true do
iteration = iteration + 1
local include_container_in_desc = false
for _, container in ipairs(containers) do
if not container.spec.no_include_container_in_desc then
include_container_in_desc = true
break
end
end
if not include_container_in_desc then
break
end
local next_containers = container_iterator()
if not next_containers then
break
end
local is_former = nil
for _, container in ipairs(containers) do
local this_is_former = container.spec.is_former_place
if is_former == nil then
is_former = this_is_former
elseif is_former ~= this_is_former then
internal_error("When processing container trail of key %s, found a mixture of former and non-former " ..
"containers: %s", key, containers)
end
end
if #containers > 1 then
local placetypes = {}
local prepositions = {}
for _, container in ipairs(containers) do
local container_type = fetch_primary_placetype(container.key, container.spec)
m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type))
m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type))
end
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which are ")
need_closing_paren = true
else
ins(", which are ")
end
if is_former then
ins("former ")
end
ins(m_table.serialCommaJoin(placetypes))
ins(" ")
ins(concat(prepositions, "/"))
else
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which is ")
need_closing_paren = true
else
ins(", which is ")
end
local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec)
if is_former then
ins("a former ")
else
ins(m_placetypes.get_placetype_article(container_type))
ins(" ")
end
ins(container_type)
ins(" ")
ins(m_placetypes.get_placetype_entry_preposition(container_type))
end
ins(" ")
containers = next_containers
local container_locations = {}
for _, container in ipairs(containers) do
insert(container_locations, construct_linked_location(container.group, container.key,
container.spec))
end
ins(m_table.serialCommaJoin(container_locations))
end
if need_closing_paren then
ins(")")
end
return concat(parts)
end
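--[=[
Example (illustrative sketch, not part of the module): how the container trail is punctuated. This simplified version
assumes exactly one container per level and takes the article and preposition from a hand-written table; the real
function gets them from [[Module:place/placetypes]] and also handles multiple containers and former places.

```lua
-- `chain` lists the location first, then each successive container (hypothetical data).
local function describe(chain)
	local parts = {chain[1].name}
	local need_closing_paren = false
	for i = 1, #chain - 1 do
		local place = chain[i]
		if i == 1 then
			parts[#parts + 1] = ", "
		elseif i == 2 then
			parts[#parts + 1] = " (which is "
			need_closing_paren = true
		else
			parts[#parts + 1] = ", which is "
		end
		parts[#parts + 1] = place.article .. " " .. place.placetype .. " " .. place.prep ..
			" " .. chain[i + 1].name
	end
	if need_closing_paren then
		parts[#parts + 1] = ")"
	end
	return table.concat(parts)
end

print(describe {
	{name = "Birmingham", article = "a", placetype = "city", prep = "in"},
	{name = "the West Midlands", article = "a", placetype = "county", prep = "of"},
	{name = "England", article = "a", placetype = "constituent country", prep = "of"},
	{name = "the United Kingdom", article = "a", placetype = "country", prep = "in"},
	{name = "Europe"},
})
```

The parenthesis opens at the second level and is closed once at the end, which is why the real code tracks
`need_closing_paren`.
]=]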
-- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified,
-- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which
-- mentions the placename corresponding to the key, its placetype and container, and repeats the description up
-- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc`
-- setting is found (which is set on all continents and continent-level regions).
local function fetch_or_construct_location_desc(group, key, spec)
local val = spec.keydesc
if is_callable(val) then
val = val(group, key, spec)
spec.keydesc = val
end
val = val or "+++"
if val:find("%+%+%+") then
val = gsub_literally(val, "+++", construct_location_desc(group, key, spec))
end
return val
end
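--[=[
Example (illustrative sketch, not part of the module): resolving `keydesc` and substituting `+++`. This version is
simplified: it only accepts plain functions (the module uses `is_callable`), passes just `spec` to the callback, and
takes the auto-constructed description as a ready-made string argument.

```lua
local function resolve_keydesc(spec, auto_desc)
	local val = spec.keydesc
	if type(val) == "function" then
		val = val(spec)
		spec.keydesc = val -- cache the computed string for later calls
	end
	val = val or "+++"
	-- A function replacement is inserted verbatim, so "%" in auto_desc is safe.
	return (val:gsub("%+%+%+", function() return auto_desc end))
end

print(resolve_keydesc({keydesc = "the island of +++"}, "[[Java]]"))
print(resolve_keydesc({}, "[[Java]]"))
```

When no `keydesc` is given, the default `"+++"` makes the result equal to the auto-constructed description.
]=]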
local function normalize_cat_as(cat_as, div)
if type(cat_as) ~= "table" or cat_as.type then
cat_as = {cat_as}
end
local ret_cat_as = {}
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
end
return ret_cat_as
end
-- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where
-- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when
-- categorizing and the preposition to follow.
local function find_placetype_cat_as(divs, pl_placetype)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
if div.type == pl_placetype then
local cat_as = div.cat_as or div.type
return normalize_cat_as(cat_as, div)
end
end
end
return nil
end
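--[=[
Example (illustrative sketch, not part of the module): the shapes of `cat_as` that `normalize_cat_as` accepts. The
function body is copied from above with `table.insert` spelled out so it runs standalone; the placetype names and the
`div` tables are made-up sample data.

```lua
local function normalize_cat_as(cat_as, div)
	-- A bare string or a single {type = ...} table is wrapped into a list.
	if type(cat_as) ~= "table" or cat_as.type then
		cat_as = {cat_as}
	end
	local ret_cat_as = {}
	for _, pt_cat_as in ipairs(cat_as) do
		if type(pt_cat_as) == "string" then
			pt_cat_as = {type = pt_cat_as}
		end
		table.insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
	end
	return ret_cat_as
end

local a = normalize_cat_as("provinces", {})
local b = normalize_cat_as({"cities", {type = "towns", prep = "ใน"}}, {prep = "ของ"})
print(a[1].type, a[1].prep)
print(b[1].type, b[1].prep, b[2].type, b[2].prep)
```

The preposition falls back from the individual `cat_as` entry to the enclosing `div` and finally to `"ของ"`.
]=]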
-- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]].
insert(handlers, function(label)
for _, canon_label in ipairs { label, lcfirst(label) } do
local group, spec = m_locations.find_canonical_key(canon_label)
if group then
-- wp= defaults to true (Wikipedia article matches location's full placename)
local wp = spec.wp
if wp == nil then
wp = true
end
-- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category
-- generally follow)
local wpcat = spec.wpcat
if wpcat == nil then
wpcat = wp
end
-- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally
-- follows)
local commonscat = spec.commonscat
if commonscat == nil then
commonscat = wpcat
end
local parents = {}
local bare_label_parents = spec.overriding_bare_label_parents
local container_iterator = m_locations.iterate_containers(group, canon_label, spec)
local containers = container_iterator()
if not bare_label_parents then
bare_label_parents = {"+++"}
end
local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label)
local full_container_placename
if containers then
full_container_placename = m_locations.key_to_placename(containers[1].group, containers[1].key)
end
local inserted_containers = false
for _, parent in ipairs(bare_label_parents) do
if parent == "+++" then
parent = "PL_PLACETYPEPREPCONTAINER" -- th: no spaces, because Thai category names do not separate words with spaces
end
if parent:find("CONTAINER") then
if not containers then
internal_error("Parent category %s needs the container of %s but no containers specified: %s",
parent, canon_label, spec)
end
local location_type = fetch_primary_placetype(canon_label, spec)
local pl_location_type = m_placetypes.pluralize_placetype(location_type)
for _, container in ipairs(containers) do
local per_container_parent = parent
local cat_as_list
if per_container_parent:find("PL_PLACETYPE") then
if spec.bare_category_parent_type then
cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec)
else
cat_as_list = find_placetype_cat_as(container.spec.divs, pl_location_type) or
find_placetype_cat_as(container.spec.addl_divs, pl_location_type)
end
end
if not cat_as_list then
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category")
if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
internal_error("Unable to locate plural location type %s among the divs or addl_divs " ..
"for container key %s spec %s, and the location type is either not in placetype_data or " ..
"not identified as a generic placetype", pl_location_type, container.key, container.spec)
end
cat_as_list = {{type = pl_location_type, prep =
m_placetypes.get_placetype_entry_preposition(location_type)}}
end
local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec)
per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key)
for _, cat_as in ipairs(cat_as_list) do
local per_container_per_placetype_parent = per_container_parent
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE",
cat_as.type)
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP",
cat_as.prep)
m_table.insertIfNot(parents, per_container_per_placetype_parent)
end
end
inserted_containers = true
else
m_table.insertIfNot(parents, parent)
end
end
if not inserted_containers and containers then
-- If we didn't insert the containers above in some form, insert them now as bare categories. Note that
-- these may be different categories from the container categories inserted above.
for _, container in ipairs(containers) do
m_table.insertIfNot(parents, container.key)
end
end
if spec.addl_parents then
for _, parent in ipairs(spec.addl_parents) do
m_table.insertIfNot(parents, parent)
end
end
local function format_boxval(val, specname)
if val == true then
val = "%l"
end
if type(val) == "string" then
val = gsub_literally(val, "%l", full_location_placename)
val = gsub_literally(val, "%e", elliptical_location_placename)
if val:find("%%c") then
if not full_container_placename then
internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " ..
"containers: %s", specname, val, canon_label, spec)
end
val = gsub_literally(val, "%c", full_container_placename)
end
end
return val
end
local description = spec.fulldesc or (
"{{{langname}}} terms related to the people, culture, or territory of " ..
fetch_or_construct_location_desc(group, canon_label, spec) .. ".")
local full_placename, _ = m_locations.key_to_placename(group, canon_label)
return {
type = "topic",
description = description,
breadcrumb = full_placename,
parents = parents,
wp = format_boxval(wp, "wp"),
wpcat = format_boxval(wpcat, "wpcat"),
commonscat = format_boxval(commonscat, "commonscat"),
}
end
end
end)
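--[=[
Example (illustrative sketch, not part of the module): filling in the `PL_PLACETYPEPREPCONTAINER` parent template used
by the handler above. The sample values are hypothetical; in the module the container text comes from
`get_prefixed_key` and the other two from the `cat_as` spec. Function replacements are used here instead of
`gsub_literally` only to keep the example standalone.

```lua
local function fill_parent(template, values)
	-- Same substitution order as the handler: container first, then placetype, then preposition.
	local out = template:gsub("CONTAINER", function() return values.container end)
	out = out:gsub("PL_PLACETYPE", function() return values.pl_placetype end)
	out = out:gsub("PREP", function() return values.prep end)
	return out
end

print(fill_parent("PL_PLACETYPEPREPCONTAINER", {
	pl_placetype = "จังหวัด",
	prep = "ของ",
	container = "ประเทศไทย",
}))
```

Because the Thai template has no spaces between the three placeholders, the filled-in category name is a single
unbroken string.
]=]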
local function find_canonical_key_from_place(place, canon_label)
local has_the = false
local key
if place:find("^the ") then
key = place:gsub("^the ", "")
has_the = true
else
key = place
end
local group, spec = m_locations.find_canonical_key(key)
if group then
local requires_the = spec.the or false
if has_the ~= requires_the then
if has_the then
mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format(
canon_label))
else
mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"):
format(canon_label))
end
return nil
end
return group, key, spec
end
return nil
end
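--[=[
Example (illustrative sketch, not part of the module): the article check performed by
`find_canonical_key_from_place`. The `known` table is a hypothetical stand-in for `m_locations.find_canonical_key`,
and logging is omitted.

```lua
local known = {
	["Bahamas"] = {the = true},
	["France"] = {},
}

local function check_article(place)
	local has_the = place:find("^the ") ~= nil
	local key = has_the and place:sub(5) or place
	local spec = known[key]
	if not spec then
		return nil -- unknown location
	end
	if has_the ~= (spec.the or false) then
		return nil -- article mismatch: reject the category name
	end
	return key, spec
end

print(check_article("the Bahamas"))
print(check_article("Bahamas"))
```

A category name is accepted only when the presence of "the" agrees with the location's `the` setting.
]=]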
-- Handler for generic placetypes (those whose categories are added through category generation handlers or through
-- explicit category specs in the placetype data) for known locations in [[Module:place/locations]]. All such
-- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or
-- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are
-- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations)
-- "neighbourhoods of Hong Kong" or "places in Melbourne".
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th
end
if placetype then
local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category")
if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
-- Check whether the location uses British spelling, but also check all containers, because
-- it's too hard to keep in sync the `british_spelling` setting for locations at all different
-- levels (e.g. cities of various countries, first and second level administrative division, etc.),
-- so we just set it at top level on the country.
local uses_british_spelling = spec.british_spelling
if uses_british_spelling == nil then
for containers in m_locations.iterate_containers(group, key, spec) do
local must_outer_break = false
for _, container in ipairs(containers) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
local allow_cat = true
if placetype == "neighborhoods" and uses_british_spelling or
placetype == "neighbourhoods" and not uses_british_spelling then
mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format(
placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods"))
allow_cat = false
end
if spec.is_former_place and placetype ~= "สถานที่" then
allow_cat = false
end
local expected_prep
if spec.is_city then
expected_prep = ptdata.generic_before_cities
else
expected_prep = ptdata.generic_before_non_cities
end
if not expected_prep then
allow_cat = false
end
if allow_cat then
if expected_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, expected_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype,
spec.is_city and "city" or "noncity", "return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec)
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = normalized_placetype, sort = key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then
local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype,
function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, {
from_category = true,
no_split_qualifiers = true,
})
if not category_class then
internal_error("Saw placetype %s that is either unknown or has no `class` " ..
"setting in `placetype_data`", normalized_placetype)
end
if class_is_political_division[category_class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
category_class, normalized_placetype)
end
if class_is_political_division[category_class] then
insert(parents, "political divisions of specific countries")
end
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
local container_prep
if container.spec.is_city then
container_prep = ptdata.generic_before_cities
else
container_prep = ptdata.generic_before_non_cities
end
if not container_prep then
internal_error("For container key %s spec %s defines is_city = %s but " ..
"there is no corresponding `generic_before_*` setting in the " ..
"placedata for placetype %s", container.key, container.spec,
container.spec.is_city, placetype)
end
insert(parents, {
name = placetype .. container_prep .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = normalized_placetype, sort = key})
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
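--[=[
Example (illustrative sketch, not part of the module): inheriting `british_spelling` from the container trail. This
simplified version assumes one container per level and takes the whole trail as a list (the location first, then its
containers); the sample specs are hypothetical.

```lua
local function uses_british_spelling(trail)
	for _, spec in ipairs(trail) do
		-- The first explicit setting wins, even if it is `false`.
		if spec.british_spelling ~= nil then
			return spec.british_spelling
		end
	end
	return nil
end

local function expected_spelling(trail)
	return uses_british_spelling(trail) and "neighbourhoods" or "neighborhoods"
end

-- City, then state, then country; only the country carries the setting.
print(expected_spelling {{}, {}, {british_spelling = true}})
```

Setting the flag once on the country is therefore enough for every location beneath it.
]=]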
-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next
-- handler for specific political and misc (non-political) divisions of polities and subpolities, such as
-- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so
-- will trigger an error if that handler runs before this one.
insert(handlers, function(label)
label = lcfirst(label)
--local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
local capital_cat, place = mw.ustring.match(label, "^(เมืองหลวงของ[a-zก-๛%- ]-)ของ(.*)$")
-- Make sure we recognize the type of capital.
if place and capital_cat_to_placetype[capital_cat] then
local placetype = capital_cat_to_placetype[capital_cat]
local pl_placetype = m_placetypes.pluralize_placetype(placetype)
-- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the
-- type of capital is among the list.
local group, key, spec = find_canonical_key_from_place(place, label)
if group and (spec.divs or spec.addl_divs) then
local saw_match = false
local variant_matches = {}
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
-- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region'
-- -> 'regional capitals' and 'union territory' -> 'territorial capitals'. When encountering a
-- political division like 'autonomous region' or 'union territory', chop off everything up
-- through a space to make things match. To make this clearer, we record all such
-- "variant match" cases, and down below we insert a note into the category text indicating that
-- such "variant matches" are included in the category.
if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then
saw_match = true
if pl_placetype ~= div.type then
insert(variant_matches, div.type)
end
end
end
end
if saw_match then
-- Everything checks out, construct the category description.
local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype,
spec.is_city and "city" or "noncity")
if placetype_desc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
if not placetype_desc then
internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " ..
"was found as the placetype of capital placetype %s in label %s", pl_placetype,
placetype, capital_cat, label)
end
local variant_match_text = ""
if variant_matches[1] then
local real_variant_match_descs = {}
for i, variant_match in ipairs(variant_matches) do
local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match,
spec.is_city and "city" or "noncity")
if variant_match_desc == nil then
internal_error("Unrecognized variant match plural placetype %s, coming from " ..
"place key %s, data %s in label %s", variant_match, key, spec, label)
end
if variant_match_desc then
-- skip those for which the description is `false`, like `ABBREVIATION_OF states`
-- in the United States divs.
insert(real_variant_match_descs, variant_match_desc)
end
end
if real_variant_match_descs[1] then
variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs)
.. ")"
end
end
local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. variant_match_text ..
" of " .. fetch_or_construct_location_desc(group, key, spec) .. "."
local full_placename, _ = m_locations.key_to_placename(group, key)
local parents = {}
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = capital_cat, sort = key})
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = capital_cat, sort = key})
end
end
insert(parents, key)
return {
type = "name",
topic = label,
description = desc,
breadcrumb = full_placename,
parents = parents,
}
end
end
end
end)
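--[=[
Example (illustrative sketch, not part of the module): the "variant match" test used by the capitals handler above.
The placetype names are sample values taken from the comment in the handler.

```lua
-- Returns (matched, is_variant).
local function matches_division(pl_placetype, div_type)
	if pl_placetype == div_type then
		return true, false
	end
	-- Drop everything up to and including the last space.
	local last_word = div_type:gsub("^.* ", "")
	if pl_placetype == last_word then
		return true, true
	end
	return false, false
end

print(matches_division("regions", "autonomous regions"))
print(matches_division("regions", "regions"))
```

Variant matches are recorded separately so that the category description can mention them in an "(including ...)"
note.
]=]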
local overriding_category_descriptions = {
["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]",
["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]",
["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]",
["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s",
}
-- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.),
-- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil",
-- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of
-- locations, which are handled by different handlers above.
insert(handlers, function(label)
-- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial
-- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]].
for _, canon_label in ipairs { label, lcfirst(label) } do
for _, minimal_placetype in ipairs { true, false } do
local match_quantifier = minimal_placetype and "-" or "+"
-- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy
-- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`)
-- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype
-- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the
-- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently
-- only `abbreviations of states` occurs with a following location).
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$")
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ใน)(.*)$")
end
if placetype then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
local function find_placetype(divs)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
if placetype == pt_cat_as.type then
local div_parent = pt_cat_as.container_parent_type
if div_parent == nil then -- allow false
div_parent = div.container_parent_type
end
if div_parent == nil then
div_parent = placetype
end
return div_parent, pt_cat_as.prep or div.prep or "ของ"
end
end
end
end
return nil
end
local div_parent, div_prep = find_placetype(spec.divs)
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs)
end
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization)
end
if div_parent ~= nil then
if div_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, div_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity",
"return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = overriding_category_descriptions[canon_label]
if not desc then
desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th
end
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if div_parent then -- div_parent may be `false`
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = placetype, sort = " " .. key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th
insert(parents, "political divisions of specific countries")
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = placetype, sort = " " .. key})
end
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
labels["exonyms"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} [[exonym]]s.",
parents = {"สถานที่"},
}
labels["political divisions of specific countries"] = {
type = "grouping",
description = "{{{langname}}} categories for political divisions of specific countries.",
parents = {"สถานที่"},
}
-- Misc. FIXME: Remove the need for this.
labels["nomes of Ancient Egypt"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].",
breadcrumb = "nomes",
parents = {"อียิปต์โบราณ"},
}
-- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed.
labels["มหาสมุทรแอตแลนติก"] = {
type = "related-to",
description = "default with the",
parents = {"โลก"},
}
labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"]
labels["British Isles"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands",
parents = {"ยุโรป", "เกาะ"},
}
labels["สหภาพยุโรป"] = {
type = "related-to",
description = "default with the",
parents = {"ยุโรป"},
}
labels["European Union"] = labels["สหภาพยุโรป"]
labels["Gascony"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Indian subcontinent"] = {
type = "related-to",
description = "default with the",
parents = {"เอเชียใต้"},
}
labels["Bengal"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir, India"] = {
type = "related-to",
description = "{{{langname}}} names of places in {{w|Kashmir, India}}.",
parents = {"อินเดีย", "Kashmir"},
}
labels["เกาหลี"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Korea]]",
parents = {"เอเชีย"},
}
labels["Korea"] = labels["เกาหลี"]
labels["Languedoc"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Lapland"] = {
type = "related-to",
description = "=[[Lapland]], a region in northernmost Europe",
parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"},
}
labels["ตะวันออกกลาง"] = {
type = "related-to",
description = "default with the",
parents = {"แอฟริกา", "เอเชีย"},
}
labels["Middle East"] = labels["ตะวันออกกลาง"]
labels["Netherlands Antilles"] = {
type = "related-to",
description = "=the people, culture, or territory of the [[Netherlands Antilles]]",
parents = {"เนเธอร์แลนด์", "อเมริกาเหนือ"},
}
labels["Provence"] = {
type = "related-to",
description = "default",
parents = {"Provence-Alpes-Côte d'Azur, France"},
}
labels["เอเชียใต้"] = {
type = "related-to",
description = "default",
parents = {"ยูเรเชีย", "เอเชีย"},
}
labels["South Asia"] = labels["เอเชียใต้"]
return {LABELS = labels, HANDLERS = handlers}
2linlys6mypmnvk6ldw4bjnug2h7ne0
5720703
5720702
2026-04-21T01:54:09Z
OctraBot
3198
5720703
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_table = require("Module:table")
local en_utilities_module = "Module:en-utilities"
local string_utilities_module = "Module:string utilities"
local m_locations = require("Module:place/locations")
local m_placetypes = require("Module:place/placetypes")
local placetype_data = m_placetypes.placetype_data
local internal_error = m_locations.internal_error
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local is_callable = require("Module:fun").is_callable
--[==[ intro:
This module is part of the category tree code and contains code to generate the descriptions of place-related categories
such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]],
[[Category:pt:Municipalities of Tocantins, Brazil]], etc. Note that this module doesn't actually create the
categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This
process should automatically happen periodically for non-empty categories, because they will appear in
[[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.)
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table,
keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the
description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list,
which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that
label on-the-fly.
See [[Module:place]] for an introduction to the terminology associated with places, a list of all the relevant
modules, and more specific information on types of toponyms and placetypes and how their categorization
works.
]==]
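-- A minimal sketch of the two mechanisms described in the intro above. The label name and description text are
-- hypothetical; they only illustrate the shape of the data and are not entries defined in this module.
--
-- (1) A manual entry in the `labels` table:
--
--   labels["example region"] = {
--       type = "related-to",
--       description = "{{{langname}}} terms related to an example region.",
--       parents = {"สถานที่"},
--   }
--
-- (2) A handler, which returns a spec of the same shape for labels it recognizes and nothing (nil) otherwise:
--
--   insert(handlers, function(label)
--       if label == "example region" then
--           return {
--               type = "related-to",
--               description = "{{{langname}}} terms related to an example region.",
--               parents = {"สถานที่"},
--           }
--       end
--   end)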
local function lcfirst(label)
return mw.getContentLanguage():lcfirst(label)
end
local function gsub_literally(str, from, to)
local m_strutils = require(string_utilities_module)
return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to)))
end
-- Do not translate the class names (the keys of this table).
local class_to_bare_category_parent = {
["polity"] = "องค์การทางการเมือง",
["subpolity"] = "political divisions",
["settlement"] = "การตั้งถิ่นฐาน",
["non-admin settlement"] = "การตั้งถิ่นฐาน",
["capital"] = "เมืองหลวง",
["natural feature"] = "natural features",
["man-made structure"] = "man-made structures",
["geographic region"] = "geographic and cultural areas",
}
-- Do not translate the class names (the keys of this table).
local class_is_political_division = {
["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity
["subpolity"] = true,
["settlement"] = true,
["non-admin settlement"] = false,
["capital"] = true,
["natural feature"] = false,
["man-made structure"] = false,
["geographic region"] = false,
["generic place"] = false,
}
local capital_cat_to_placetype = {}
for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do
capital_cat_to_placetype[capital_cat] = placetype
end
-- Handler for bare categories for all types of capitals. This needs to precede the handler for bare placetype
-- categories as some of the types of capitals exist as placetypes as well.
insert(handlers, function(label)
label = lcfirst(label)
local capital_placetype = capital_cat_to_placetype[label]
if capital_placetype then
local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype)
local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level")
if linkdesc == nil then
internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label)
end
if linkdesc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
return {
type = "name",
topic = label,
description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".",
parents = {"เมืองหลวง"},
}
end
end)
-- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various
-- so-called "generic" placetypes, but sometimes the categories were wrong.
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full")
if ptdesc then
local from_category_props = {
from_category = true,
no_split_qualifiers = true,
}
local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent")
if bare_category_parent then
return bare_category_parent
end
local class = m_placetypes.get_placetype_prop(pt, "class")
if class then
if class_to_bare_category_parent[class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
class, canon_label)
end
return class_to_bare_category_parent[class]
end
end, from_category_props)
if not bare_category_parent then
internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " ..
"directly or through a fallback", canon_label)
end
local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents")
end, from_category_props)
local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt)
return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb")
end, from_category_props)
if type(bare_category_parent) == "string" and bare_category_breadcrumb then
bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb}
end
local parents = {bare_category_parent}
if addl_bare_category_parents then
m_table.extend(parents, addl_bare_category_parents)
end
return {
type = "name",
topic = canon_label,
description = "{{{langname}}} " .. ptdesc .. ".",
breadcrumb = bare_category_breadcrumb,
parents = parents,
}
elseif ptdesc == false then
mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label)))
end
end
end)
local function fetch_primary_placetype(key, spec)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if not placetype then
internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec)
end
return placetype
end
--[==[
Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "` if
appropriate. Specifically:
Fetch the full and elliptical placenames. If they are the same, just link to the placename directly. Otherwise, check if
the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as
the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full
placename. That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries,
but if only one exists, we link to that one rather than have a red link.
]==]
local function construct_linked_location(group, key, spec)
local full_placename, elliptical_placename = m_locations.key_to_placename(group, key)
local linked_placename
if elliptical_placename ~= full_placename then
local full_placename_title = mw.title.new(full_placename)
if full_placename_title and full_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, full_placename)
else
local elliptical_placename_title = mw.title.new(elliptical_placename)
if elliptical_placename_title and elliptical_placename_title.exists then
linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename)
end
end
end
return linked_placename or m_locations.construct_linked_placename(spec, full_placename)
end
--[==[
Construct the description of a location, including its container trail either to the end or until we encounter a
`no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read
`"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a
[[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should
adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the
[[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in
[[Europe]]"`.
]==]
local function construct_location_desc(group, key, spec)
local parts = {}
local function ins(txt)
insert(parts, txt)
end
ins(construct_linked_location(group, key, spec))
local iteration = 0
local need_closing_paren = false
local containers = {{group = group, key = key, spec = spec}}
local container_iterator = m_locations.iterate_containers(group, key, spec)
while true do
iteration = iteration + 1
local include_container_in_desc = false
for _, container in ipairs(containers) do
if not container.spec.no_include_container_in_desc then
include_container_in_desc = true
break
end
end
if not include_container_in_desc then
break
end
local next_containers = container_iterator()
if not next_containers then
break
end
local is_former = nil
for _, container in ipairs(containers) do
local this_is_former = container.spec.is_former_place
if is_former == nil then
is_former = this_is_former
elseif is_former ~= this_is_former then
internal_error("When processing container trail of key %s, found a mixture of former and non-former " ..
"containers: %s", key, containers)
end
end
if #containers > 1 then
local placetypes = {}
local prepositions = {}
for _, container in ipairs(containers) do
local container_type = fetch_primary_placetype(container.key, container.spec)
m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type))
m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type))
end
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which are ")
need_closing_paren = true
else
ins(", which are ")
end
if is_former then
ins("former ")
end
ins(m_table.serialCommaJoin(placetypes))
ins(" ")
ins(concat(prepositions, "/"))
else
if iteration == 1 then
ins(", ")
elseif iteration == 2 then
ins(" (which is ")
need_closing_paren = true
else
ins(", which is ")
end
local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec)
if is_former then
ins("a former ")
else
ins(m_placetypes.get_placetype_article(container_type))
ins(" ")
end
ins(container_type)
ins(" ")
ins(m_placetypes.get_placetype_entry_preposition(container_type))
end
ins(" ")
containers = next_containers
local container_locations = {}
for _, container in ipairs(containers) do
insert(container_locations, construct_linked_location(container.group, container.key,
container.spec))
end
ins(m_table.serialCommaJoin(container_locations))
end
if need_closing_paren then
ins(")")
end
return concat(parts)
end
-- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified,
-- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which
-- mentions the placename corresponding to the key, its placetype and container, and repeats the description up
-- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc`
-- setting is found (which is set on all continents and continent-level regions).
local function fetch_or_construct_location_desc(group, key, spec)
local val = spec.keydesc
if is_callable(val) then
val = val(group, key, spec)
spec.keydesc = val
end
val = val or "+++"
if val:find("%+%+%+") then
val = gsub_literally(val, "+++", construct_location_desc(group, key, spec))
end
return val
end
local function normalize_cat_as(cat_as, div)
if type(cat_as) ~= "table" or cat_as.type then
cat_as = {cat_as}
end
local ret_cat_as = {}
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
end
return ret_cat_as
end
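-- Illustrative calls to `normalize_cat_as`, traced by hand from the code above. The placetype names are
-- hypothetical inputs, not values taken from [[Module:place/locations]]:
--
--   normalize_cat_as("districts", {})
--     --> {{type = "districts", prep = "ของ"}}
--   normalize_cat_as("districts", {prep = "ใน"})
--     --> {{type = "districts", prep = "ใน"}}
--   normalize_cat_as({"districts", {type = "towns", prep = "ใน"}}, {})
--     --> {{type = "districts", prep = "ของ"}, {type = "towns", prep = "ใน"}}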
-- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where
-- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when
-- categorizing and the preposition to follow.
local function find_placetype_cat_as(divs, pl_placetype)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
if div.type == pl_placetype then
local cat_as = div.cat_as or div.type
return normalize_cat_as(cat_as, div)
end
end
end
return nil
end
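-- Illustrative calls to `find_placetype_cat_as`, traced by hand from the code above. The `divs` list is
-- hypothetical and only shows the accepted shapes (a bare string or a table with `type`, `cat_as` and `prep`):
--
--   local divs = {"provinces", {type = "districts", cat_as = "districts and towns", prep = "ใน"}}
--   find_placetype_cat_as(divs, "provinces")  --> {{type = "provinces", prep = "ของ"}}
--   find_placetype_cat_as(divs, "districts")  --> {{type = "districts and towns", prep = "ใน"}}
--   find_placetype_cat_as(divs, "cities")     --> nil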
-- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]].
insert(handlers, function(label)
for _, canon_label in ipairs { label, lcfirst(label) } do
local group, spec = m_locations.find_canonical_key(canon_label)
if group then
-- wp= defaults to true (Wikipedia article matches location's full placename)
local wp = spec.wp
if wp == nil then
wp = true
end
-- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category
-- generally follow)
local wpcat = spec.wpcat
if wpcat == nil then
wpcat = wp
end
-- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally
-- follows)
local commonscat = spec.commonscat
if commonscat == nil then
commonscat = wpcat
end
local parents = {}
local bare_label_parents = spec.overriding_bare_label_parents
local container_iterator = m_locations.iterate_containers(group, canon_label, spec)
local containers = container_iterator()
if not bare_label_parents then
bare_label_parents = {"+++"}
end
local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label)
local full_container_placename
if containers then
full_container_placename, _ = m_locations.key_to_placename(containers[1].group, containers[1].key)
end
local inserted_containers = false
for _, parent in ipairs(bare_label_parents) do
if parent == "+++" then
parent = "PL_PLACETYPEPREPCONTAINER" -- th: the parts are joined without spaces
end
if parent:find("CONTAINER") then
if not containers then
internal_error("Parent category %s needs the container of %s but no containers specified: %s",
parent, canon_label, spec)
end
local location_type = fetch_primary_placetype(canon_label, spec)
local pl_location_type = m_placetypes.pluralize_placetype(location_type)
for _, container in ipairs(containers) do
local per_container_parent = parent
local cat_as_list
if per_container_parent:find("PL_PLACETYPE") then
if spec.bare_category_parent_type then
cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec)
else
cat_as_list = find_placetype_cat_as(container.spec.divs, pl_location_type) or
find_placetype_cat_as(container.spec.addl_divs, pl_location_type)
end
end
if not cat_as_list then
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category")
if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
internal_error("Unable to locate plural location type %s among the divs or addl_divs " ..
"for container key %s spec %s, and the location type is either not in placetype_data or " ..
"not identified as a generic placetype", pl_location_type, container.key, container.spec)
end
cat_as_list = {{type = pl_location_type, prep =
m_placetypes.get_placetype_entry_preposition(location_type)}}
end
local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec)
per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key)
for _, cat_as in ipairs(cat_as_list) do
local per_container_per_placetype_parent = per_container_parent
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE",
cat_as.type)
per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP",
cat_as.prep)
m_table.insertIfNot(parents, per_container_per_placetype_parent)
end
end
inserted_containers = true
else
m_table.insertIfNot(parents, parent)
end
end
if not inserted_containers and containers then
-- If we didn't insert the containers above in some form, insert them now as bare categories. Note that
-- this may be different categories from the container categories inserted above.
for _, container in ipairs(containers) do
m_table.insertIfNot(parents, container.key)
end
end
if spec.addl_parents then
for _, parent in ipairs(spec.addl_parents) do
m_table.insertIfNot(parents, parent)
end
end
local function format_boxval(val, specname)
if val == true then
val = "%l"
end
if type(val) == "string" then
val = gsub_literally(val, "%l", full_location_placename)
val = gsub_literally(val, "%e", elliptical_location_placename)
if val:find("%%c") then
if not full_container_placename then
internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " ..
"containers: %s", specname, val, canon_label, spec)
end
val = gsub_literally(val, "%c", full_container_placename)
end
end
return val
end
local description = spec.fulldesc or (
"{{{langname}}} terms related to the people, culture, or territory of " ..
fetch_or_construct_location_desc(group, canon_label, spec) .. ".")
local full_placename, _ = m_locations.key_to_placename(group, canon_label)
return {
type = "topic",
description = description,
breadcrumb = full_placename,
parents = parents,
wp = format_boxval(wp, "wp"),
wpcat = format_boxval(wpcat, "wpcat"),
commonscat = format_boxval(commonscat, "commonscat"),
}
end
end
end)
local function find_canonical_key_from_place(place, canon_label)
local has_the = false
local key
if place:find("^the ") then
key = place:gsub("^the ", "")
has_the = true
else
key = place
end
local group, spec = m_locations.find_canonical_key(key)
if group then
local requires_the = spec.the or false
if has_the ~= requires_the then
if has_the then
mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format(
canon_label))
else
mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"):
format(canon_label))
end
return nil
end
return group, key, spec
end
return nil
end
-- Handler for generic placetypes (those whose categories are added through category generation handlers or through
-- explicit category specs in the placetype data) for known locations in [[Module:place/locations]]. All such
-- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or
-- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are
-- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations)
-- "neighbourhoods of Hong Kong" or "places in Melbourne".
insert(handlers, function(label)
for _, canon_label in ipairs { lcfirst(label), label } do
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th
end
if placetype then
local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype
local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category")
if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
-- Check whether the location uses British spelling, but also check all containers, because
-- it's too hard to keep in sync the `british_spelling` setting for locations at all different
-- levels (e.g. cities of various countries, first and second level administrative division, etc.),
-- so we just set it at top level on the country.
local uses_british_spelling = spec.british_spelling
if uses_british_spelling == nil then
for containers in m_locations.iterate_containers(group, key, spec) do
local must_outer_break = false
for _, container in ipairs(containers) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
local allow_cat = true
if placetype == "neighborhoods" and uses_british_spelling or
placetype == "neighbourhoods" and not uses_british_spelling then
mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format(
placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods"))
allow_cat = false
end
if spec.is_former_place and placetype ~= "สถานที่" then
allow_cat = false
end
local expected_prep
if spec.is_city then
expected_prep = ptdata.generic_before_cities
else
expected_prep = ptdata.generic_before_non_cities
end
if not expected_prep then
allow_cat = false
end
if allow_cat then
if expected_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, expected_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype,
spec.is_city and "city" or "noncity", "return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
-- Declare `desc` as a local; otherwise it leaks into the global environment.
local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec)
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = normalized_placetype, sort = key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then
local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype,
function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, {
from_category = true,
no_split_qualifiers = true,
})
if not category_class then
internal_error("Saw placetype %s that is either unknown or has no `class` " ..
"setting in `placetype_data`", normalized_placetype)
end
if class_is_political_division[category_class] == nil then
internal_error("Saw unknown category class %s derived from placetype %s",
category_class, normalized_placetype)
end
if class_is_political_division[category_class] then
insert(parents, "political divisions of specific countries")
end
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
local container_prep
if container.spec.is_city then
container_prep = ptdata.generic_before_cities
else
container_prep = ptdata.generic_before_non_cities
end
if not container_prep then
internal_error("For container key %s spec %s defines is_city = %s but " ..
"there is no corresponding `generic_before_*` setting in the " ..
"placedata for placetype %s", container.key, container.spec,
container.spec.is_city, placetype)
end
insert(parents, {
name = placetype .. container_prep .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = normalized_placetype, sort = key})
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next
-- handler for specific political and misc (non-political) divisions of polities and subpolities, such as
-- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so
-- will trigger an error if that handler runs before this one.
insert(handlers, function(label)
label = lcfirst(label)
--local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
local capital_cat, place = mw.ustring.match(label, "^(เมืองหลวงของ[a-zก-๛%- ]-)ของ(.*)$")
-- Make sure we recognize the type of capital.
if place and capital_cat_to_placetype[capital_cat] then
local placetype = capital_cat_to_placetype[capital_cat]
local pl_placetype = m_placetypes.pluralize_placetype(placetype)
-- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the
-- type of capital is among the list.
-- Pass `label` for use in log messages; there is no `canon_label` variable in this handler.
local group, key, spec = find_canonical_key_from_place(place, label)
if group and (spec.divs or spec.addl_divs) then
local saw_match = false
local variant_matches = {}
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
-- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region'
-- -> 'regional capitals' and 'union territory' -> 'territorial capitals'. When encountering a
-- political division like 'autonomous region' or 'union territory', chop off everything up
-- through a space to make things match. To make this clearer, we record all such
-- "variant match" cases, and down below we insert a note into the category text indicating that
-- such "variant matches" are included among the category.
if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then
saw_match = true
if pl_placetype ~= div.type then
insert(variant_matches, div.type)
end
end
end
end
if saw_match then
-- Everything checks out, construct the category description.
-- `placetype` is a string here, so the city/non-city distinction has to come from the location spec.
local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype,
spec.is_city and "city" or "noncity")
if placetype_desc == false then
mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype)))
return nil
end
if not placetype_desc then
internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " ..
"was found as the placetype of capital placetype %s in label %s", pl_placetype,
placetype, capital_cat, label)
end
local variant_match_text = ""
if variant_matches[1] then
local real_variant_match_descs = {}
for i, variant_match in ipairs(variant_matches) do
local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match,
spec.is_city and "city" or "noncity")
if variant_match_desc == nil then
internal_error("Unrecognized variant match plural placetype %s, coming from " ..
"place key %s, data %s in label %s", variant_match, key, spec, label)
end
if variant_match_desc then
-- skip those for which the description is `false`, like `ABBREVIATION_OF states`
-- in the United States divs.
insert(real_variant_match_descs, variant_match_desc)
end
end
if real_variant_match_descs[1] then
variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs)
.. ")"
end
end
local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. variant_match_text ..
" of " .. fetch_or_construct_location_desc(group, key, spec) .. "."
local full_placename, _ = m_locations.key_to_placename(group, key)
local parents = {}
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = capital_cat, sort = key})
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = capital_cat, sort = key})
end
end
insert(parents, key)
return {
type = "name",
topic = label,
description = desc,
breadcrumb = full_placename,
parents = parents,
}
end
end
end
end)
local overriding_category_descriptions = {
["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]",
["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]",
["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]",
["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s",
}
-- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.),
-- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil",
-- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of
-- locations, which are handled by different handlers above.
insert(handlers, function(label)
-- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial
-- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]].
for _, canon_label in ipairs { label, lcfirst(label) } do
for _, minimal_placetype in ipairs { true, false } do
local match_quantifier = minimal_placetype and "-" or "+"
-- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy
-- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`)
-- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype
-- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the
-- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently
-- only `abbreviations of states` occurs with a following location).
local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$")
if not placetype then
placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ใน)(.*)$")
end
if placetype then
local group, key, spec = find_canonical_key_from_place(place, canon_label)
if group then
local function find_placetype(divs)
if divs then
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) == "string" then
div = {type = div}
end
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
for _, pt_cat_as in ipairs(cat_as) do
if type(pt_cat_as) == "string" then
pt_cat_as = {type = pt_cat_as}
end
if placetype == pt_cat_as.type then
local div_parent = pt_cat_as.container_parent_type
if div_parent == nil then -- allow false
div_parent = div.container_parent_type
end
if div_parent == nil then
div_parent = placetype
end
return div_parent, pt_cat_as.prep or div.prep or "ของ"
end
end
end
end
return nil
end
local div_parent, div_prep = find_placetype(spec.divs)
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs)
end
if div_parent == nil then -- allow false
div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization)
end
if div_parent ~= nil then
if div_prep ~= in_of then
mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
canon_label, in_of, div_prep))
return nil
end
local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity",
"return full")
if linkdesc == false then
mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
return nil
end
if not linkdesc then
internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
placetype, key, spec, canon_label)
end
local desc = overriding_category_descriptions[canon_label]
if not desc then
desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th
end
desc = "{{{langname}}} " .. desc .. "."
local parents = {}
insert(parents, key)
if div_parent then -- div_parent may be `false`
if spec.no_container_parent then
-- top-level country, constituent country, continent or the like
insert(parents, {name = placetype, sort = " " .. key})
if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th
insert(parents, "political divisions of specific countries")
end
else
local container_iterator = m_locations.iterate_containers(group, key, spec)
local next_containers = container_iterator()
if next_containers then
for _, container in ipairs(next_containers) do
insert(parents, {
name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th
sort = key
})
end
else
-- unrecognized countries or the like
insert(parents, {name = placetype, sort = " " .. key})
end
end
end
return {
type = "name",
topic = canon_label,
description = desc,
breadcrumb = placetype,
parents = parents,
}
end
end
end
end
end
end)
labels["exonyms"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} [[exonym]]s.",
parents = {"สถานที่"},
}
labels["political divisions of specific countries"] = {
type = "grouping",
description = "{{{langname}}} categories for political divisions of specific countries.",
parents = {"สถานที่"},
}
-- Misc. FIXME: Remove the need for this.
labels["nomes of Ancient Egypt"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].",
breadcrumb = "nomes",
parents = {"อียิปต์โบราณ"},
}
-- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed.
labels["มหาสมุทรแอตแลนติก"] = {
type = "related-to",
description = "default with the",
parents = {"โลก"},
}
labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"]
labels["British Isles"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands",
parents = {"ยุโรป", "เกาะ"},
}
labels["สหภาพยุโรป"] = {
type = "related-to",
description = "default with the",
parents = {"ยุโรป"},
}
labels["European Union"] = labels["สหภาพยุโรป"]
labels["Gascony"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Indian subcontinent"] = {
type = "related-to",
description = "default with the",
parents = {"เอเชียใต้"},
}
labels["Bengal"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir"] = {
type = "related-to",
description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].",
parents = {"Indian subcontinent"},
}
labels["Kashmir, India"] = {
type = "related-to",
description = "{{{langname}}} names of places in {{w|Kashmir, India}}.",
parents = {"อินเดีย", "Kashmir"},
}
labels["เกาหลี"] = {
type = "related-to",
description = "=the people, culture, or territory of [[Korea]]",
parents = {"เอเชีย"},
}
labels["Korea"] = labels["เกาหลี"]
labels["Languedoc"] = {
type = "related-to",
description = "default",
parents = {"Occitania, France"},
}
labels["Lapland"] = {
type = "related-to",
description = "=[[Lapland]], a region in northernmost Europe",
parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"},
}
labels["ตะวันออกกลาง"] = {
type = "related-to",
description = "default with the",
parents = {"แอฟริกา", "เอเชีย"},
}
labels["Middle East"] = labels["ตะวันออกกลาง"]
labels["Netherlands Antilles"] = {
type = "related-to",
description = "=the people, culture, or territory of the [[Netherlands Antilles]]",
parents = {"เนเธอร์แลนด์", "อเมริกาเหนือ"},
}
labels["Provence"] = {
type = "related-to",
description = "default",
parents = {"Provence-Alpes-Côte d'Azur, France"},
}
labels["เอเชียใต้"] = {
type = "related-to",
description = "default",
parents = {"ยูเรเชีย", "เอเชีย"},
}
labels["South Asia"] = labels["เอเชียใต้"]
return {LABELS = labels, HANDLERS = handlers}
8mj1wiag254u54rhuemz518zgwxyobs
มอดูล:category tree/หัวข้อ/สังคม
828
44518
5720684
5720603
2026-04-21T01:19:32Z
OctraBot
3198
5720684
Scribunto
text/plain
local labels = {}
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
labels["สังคม"] = {
type = "related-to",
description = "default",
parents = {"หัวข้อทั้งหมด"},
}
labels["society"] = labels["สังคม"]
labels["ปริญญา"] = {
type = "name",
description = "default",
parents = {"การศึกษา"},
}
labels["academic degrees"] = labels["ปริญญา"]
labels["ชั้นเรียน"] = {
type = "set",
description = "default",
parents = {"การศึกษา"},
}
labels["academic grades"] = labels["ชั้นเรียน"]
labels["การบัญชี"] = {
type = "related-to",
description = "default",
parents = {"การเงิน"},
}
labels["accounting"] = labels["การบัญชี"]
labels["administrative divisions"] = {
type = "set",
description = "default",
parents = {"รัฐบาลและการปกครอง"},
}
labels["การโฆษณา"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ", "การตลาด"},
}
labels["advertising"] = labels["การโฆษณา"]
labels["alt-right"] = {
type = "related-to",
description = "=the [[alt-right]], a loosely connected [[far-right]], [[white nationalist]] movement",
parents = {"อนุรักษนิยม", "ลัทธิฟาสซิสต์", "คตินิยม", "white supremacist ideology"},
}
labels["อนาธิปไตย"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "ฝ่ายซ้าย"},
}
labels["anarchism"] = labels["อนาธิปไตย"]
labels["anti-Semitism"] = {
type = "related-to",
description = "default",
parents = {"forms of discrimination"},
}
labels["รางวัล"] = {
type = "name,type",
description = "default",
parents = {"สังคม"},
}
labels["awards"] = labels["รางวัล"]
labels["การธนาคาร"] = {
type = "related-to",
description = "default",
parents = {"การเงิน", "อุตสาหกรรม"},
}
labels["banking"] = labels["การธนาคาร"]
labels["bars"] = {
type = "type",
description = "default",
parents = {"กิจการ", "การดื่ม"},
}
labels["Basque nationalism"] = {
type = "related-to",
description = "default",
parents = {"Basque Country, Spain", "ชาตินิยม"},
}
labels["เครื่องนอน"] = {
type = "related-to",
description = "default",
parents = {"บ้าน"},
}
labels["bedding"] = labels["เครื่องนอน"]
labels["blacksmithing"] = {
type = "related-to",
description = "default",
parents = {"โลหกรรม"},
}
labels["bond market"] = {
type = "related-to",
description = "default with the",
parents = {"การเงิน"},
}
labels["bookbinding"] = {
type = "related-to",
description = "default",
parents = {"publishing"},
}
labels["book sizes"] = {
type = "name",
description = "default",
parents = {"bookbinding"},
}
labels["Brexit"] = {
type = "related-to",
description = "={{w|Brexit}}, i.e. the withdrawal of the {{w|United Kingdom}} from the {{w|European Union}}",
parents = {"ชาตินิยม", "การเมืองยุโรป", "การเมืองสหราชอาณาจักร"},
}
labels["burial"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "ความตาย"},
}
labels["ธุรกิจ"] = {
type = "related-to",
description = "default",
parents = {"เศรษฐศาสตร์", "สังคม"},
}
labels["business"] = labels["ธุรกิจ"]
labels["กิจการ"] = {
type = "type",
description = "=[[business]]es (specific commercial enterprises or establishments)",
parents = {"ธุรกิจ"},
}
labels["businesses"] = labels["กิจการ"]
labels["ทุนนิยม"] = {
type = "related-to",
description = "default",
parents = {"เศรษฐศาสตร์", "คตินิยม"},
}
labels["capitalism"] = labels["ทุนนิยม"]
labels["chairs"] = {
type = "related-to",
description = "default",
parents = {"เครื่องเรือน", "การนั่ง"},
}
labels["child abuse"] = {
type = "related-to",
description = "default",
parents = {"อาชญากรรม", "เด็ก", "ความรุนแรง"},
}
labels["Chinese restaurants"] = {
type = "related-to",
description = "default",
breadcrumb = "Chinese",
parents = {"ร้านอาหาร", "จีน"},
}
labels["cleaning"] = {
type = "related-to",
description = "default",
parents = {"บ้าน"},
}
labels["เหรียญ"] = {
type = "set,related-to",
description = "default",
parents = {"เงิน (ตัวกลาง)"},
}
labels["coins"] = labels["เหรียญ"]
labels["อนุรักษนิยม"] = {
type = "related-to",
description = "=[[conservatism]] or [[traditionalist]] beliefs",
parents = {"คตินิยม"},
}
labels["conservatism"] = labels["อนุรักษนิยม"]
labels["commerce"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["commercial documents"] = {
type = "set",
description = "default",
parents = {"commerce"},
}
labels["commercial law"] = {
type = "related-to",
description = "default",
breadcrumb = "commercial",
parents = {"กฎหมาย", "commerce"},
}
labels["competition law"] = {
type = "related-to",
description = "default",
breadcrumb = "competition",
parents = {"กฎหมาย"},
}
labels["antitrust law"] = {
type = "related-to",
description = "default",
breadcrumb = "antitrust",
parents = {"competition law"},
}
labels["law of unfair competition"] = {
type = "related-to",
description = "default with the",
breadcrumb = "unfair",
parents = {{name = "competition law", sort = "unfair"}},
}
labels["ลัทธิคอมมิวนิสต์"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "สังคมนิยม", "ฝ่ายซ้าย"},
}
labels["communism"] = labels["ลัทธิคอมมิวนิสต์"]
labels["constitutional law"] = {
type = "related-to",
description = "default",
breadcrumb = "constitutional",
parents = {"กฎหมาย"},
}
labels["ลิขสิทธิ์"] = {
type = "related-to",
description = "default",
parents = {"ทรัพย์สินทางปัญญา"},
}
labels["copyright"] = labels["ลิขสิทธิ์"]
labels["copyright licenses"] = {
type = "name",
description = "=[[license]]s of [[copyright]]",
breadcrumb_and_first_sort_base = "licenses",
parents = {"ลิขสิทธิ์"},
}
labels["corporate law"] = {
type = "related-to",
description = "default",
breadcrumb = "corporate",
parents = {"กฎหมาย"},
}
labels["corruption"] = {
type = "related-to",
description = "default",
parents = {"อาชญากรรม", "การเมือง"},
}
labels["งานฝีมือ"] = {
type = "type",
description = "default",
parents = {"สังคม"},
}
labels["crafts"] = labels["งานฝีมือ"]
labels["อาชญากรรม"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "กฎหมายอาญา"},
}
labels["crime"] = labels["อาชญากรรม"]
labels["crime prevention"] = {
type = "related-to",
description = "default",
parents = {"public safety", "อาชญากรรม"},
}
labels["กฎหมายอาญา"] = {
type = "related-to",
description = "default",
breadcrumb = "criminal",
parents = {"กฎหมาย"},
}
labels["criminal law"] = labels["กฎหมายอาญา"]
labels["crochet"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ"},
}
labels["cryptocurrency"] = {
type = "related-to",
description = "default",
parents = {"เงินตรา", "วิทยาการรหัสลับ", "เทคโนโลยี"},
}
-- currencies: the various individual kinds of currency
labels["สกุลเงิน"] = {
type = "set",
description = "default",
parents = {"เงิน (ตัวกลาง)", "เงินตรา"},
}
labels["currencies"] = labels["สกุลเงิน"]
-- currency: money established by law and bearing the mark of the state
labels["เงินตรา"] = {
type = "related-to",
description = "default",
parents = {"เงิน (ตัวกลาง)"},
}
labels["currency"] = labels["เงินตรา"]
labels["dairy farming"] = {
type = "related-to",
description = "default",
parents = {"เกษตรกรรม", "อุตสาหกรรม"},
}
labels["ประชาธิปไตย"] = {
type = "related-to",
description = "default",
parents = {"ระบอบการปกครอง"},
}
labels["democracy"] = labels["ประชาธิปไตย"]
labels["diplomacy"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["discrimination"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["drug trafficking"] = {
type = "related-to",
description = "default",
parents = {"อาชญากรรม", "ยา"},
}
labels["การศึกษา"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["education"] = labels["การศึกษา"]
labels["emergency services"] = {
type = "related-to",
description = "default",
parents = {"public safety"},
}
labels["employment"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ", "งาน"},
}
labels["espionage"] = {
type = "related-to",
description = "default",
parents = {"security", "deception", "secrecy"},
}
labels["evil"] = {
type = "related-to",
description = "default",
parents = {"จริยศาสตร์", "ศาสนา"},
}
labels["fame"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "ความรู้"},
}
labels["family law"] = {
type = "related-to",
description = "default",
breadcrumb = "family",
parents = {"กฎหมาย"},
}
labels["ลัทธิฟาสซิสต์"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม"},
}
labels["fascism"] = labels["ลัทธิฟาสซิสต์"]
labels["farriery"] = {
type = "related-to",
description = "default",
parents = {"blacksmithing", "ม้า"},
}
-- Also known in Thai as คตินิยมสิทธิสตรี or สตรีสิทธินิยม
labels["สตรีนิยม"] = {
type = "related-to",
description = "default",
parents = {"สถานะเพศ", "เพศหญิง", "คตินิยม", "สังคม", "สังคมวิทยา"},
}
labels["feminism"] = labels["สตรีนิยม"]
labels["feudalism"] = {
type = "related-to",
description = "default",
parents = {"ระบอบการปกครอง"},
}
labels["การเงิน"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["finance"] = labels["การเงิน"]
labels["firefighting"] = {
type = "related-to",
description = "default",
parents = {"emergency services", "ไฟ"},
}
labels["forms of discrimination"] = {
type = "type",
description = "{{{langname}}} terms for [[form]]s of [[discrimination]].",
additional = "{{also|หมวดหมู่:{{{langcode}}}:อคติ|หมวดหมู่:{{{langcode}}}:ทฤษฎีสมคบคิด|หมวดหมู่:{{{langcode}}}:คตินิยม}}",
breadcrumb = "forms",
parents = {"discrimination"},
}
labels["ระบอบการปกครอง"] = {
type = "type",
description = "{{{langname}}} terms for [[form]]s of [[government]].",
breadcrumb = "forms",
parents = {"รัฐบาลและการปกครอง"},
}
labels["forms of government"] = labels["ระบอบการปกครอง"]
labels["อิสรภาพ"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["freedom"] = labels["อิสรภาพ"]
labels["freedom of speech"] = {
type = "related-to",
description = "default",
breadcrumb = "speech",
parents = {{name = "อิสรภาพ", sort = "speech"}, "กฎหมาย"},
}
labels["freemasonry"] = {
type = "related-to",
description = "default",
parents = {"องค์การ"},
}
labels["funeral"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "ความตาย", "อุตสาหกรรม"},
}
labels["เครื่องเรือน"] = {
type = "related-to",
description = "default",
parents = {"บ้าน"},
commonscat = true,
wpcat = true,
}
labels["furniture"] = labels["เครื่องเรือน"]
labels["gender-critical feminism"] = {
type = "related-to",
description = "default",
breadcrumb = "gender-critical",
parents = {"สตรีนิยม", "สถานะเพศ", "transphobia"},
}
labels["glassblowing"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ", "glass"},
}
labels["good"] = {
type = "related-to",
description = "default",
parents = {"จริยศาสตร์", "ศาสนา"},
}
labels["รัฐบาลและการปกครอง"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "การเมือง"},
}
labels["government"] = labels["รัฐบาลและการปกครอง"]
labels["hairdressing"] = {
type = "related-to",
description = "default",
parents = {"ผมและขน", "งานฝีมือ"},
}
labels["สังคมชั้นสูง"] = {
type = "related-to",
description = "=royalty and nobility",
parents = {"สังคม"},
}
labels["high society"] = labels["สังคมชั้นสูง"]
labels["Hindutva"] = {
type = "related-to",
description = "=[[Hindutva]] or {{w|Hindu nationalism}}",
parents = {"อนุรักษนิยม", "ศาสนาฮินดู", "คตินิยม", "การเมืองอินเดีย", "ชาตินิยม", "เทวาธิปไตย"},
}
labels["สกุลเงินในอดีต"] = {
type = "set",
description = "default",
breadcrumb = "historical",
parents = {"สกุลเงิน"},
}
labels["historical currencies"] = labels["สกุลเงินในอดีต"]
labels["บ้าน"] = {
type = "related-to",
description = "default with the",
parents = {"สังคม"},
}
labels["home"] = labels["บ้าน"]
labels["homophobia"] = {
type = "related-to",
description = "default",
parents = {"queerphobia"},
}
labels["hospitality"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["host industry"] = {
type = "related-to",
description = "default",
parents = {"hospitality", "กิจการ"},
}
labels["โรงแรม"] = {
type = "type",
description = "default",
parents = {"กิจการ", "การท่องเที่ยว", "hospitality"},
}
labels["hotels"] = labels["โรงแรม"]
labels["ครัวเรือน"] = {
type = "related-to",
description = "default",
parents = {"บ้าน"},
}
labels["household"] = labels["ครัวเรือน"]
labels["housing"] = {
type = "related-to",
description = "default",
parents = {"บ้าน", "อาคาร"},
}
labels["ทรัพยากรมนุษย์"] = {
type = "related-to",
description = "default no singularize",
parents = {"ธุรกิจ", "สังคมวิทยา"},
}
labels["human resources"] = labels["ทรัพยากรมนุษย์"]
labels["คตินิยม"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "การเมือง"},
}
labels["ideologies"] = labels["คตินิยม"]
labels["imperialism"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม"},
}
labels["import/export"] = {
type = "related-to",
description = "=[[import]]s and [[export]]s",
parents = {"การค้า", "การคมนาคม"},
}
labels["incel community"] = {
type = "related-to",
description = "=the [[incel]] community",
parents = {"masculism", "เพศ"},
}
labels["incoterms"] = {
type = "related-to",
description = "=[[Incoterm]]s",
parents = {"ธุรกิจ", "import/export"},
}
labels["อุตสาหกรรม"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["industries"] = labels["อุตสาหกรรม"]
labels["inheritance law"] = {
type = "related-to",
description = "default",
breadcrumb = "inheritance",
parents = {"กฎหมาย"},
}
labels["insurance"] = {
type = "related-to",
description = "default",
parents = {"การเงิน", "อุตสาหกรรม"},
}
labels["ทรัพย์สินทางปัญญา"] = {
type = "related-to",
description = "=[[intellectual property]] [[law]]",
parents = {"กฎหมาย"},
}
labels["intellectual property"] = labels["ทรัพย์สินทางปัญญา"]
labels["กฎหมายระหว่างประเทศ"] = {
type = "related-to",
description = "default",
breadcrumb = "international",
parents = {"กฎหมาย"},
}
labels["international law"] = labels["กฎหมายระหว่างประเทศ"]
labels["international relations"] = {
type = "related-to",
description = "default wikify",
parents = {"การเมือง", "โลก"},
}
labels["การเงินอิสลาม"] = {
type = "related-to",
description = "default wikify",
breadcrumb = "Islamic",
parents = {"การเงิน", "การธนาคาร", "ศาสนาอิสลาม"},
}
labels["Islamic finance"] = labels["การเงินอิสลาม"]
labels["กฎหมายอิสลาม"] = {
type = "related-to",
description = "default wikify",
breadcrumb = "กฎหมาย",
parents = {{name = "ศาสนาอิสลาม", sort = "กฎหมาย"}, "กฎหมาย"},
}
labels["Islamic law"] = labels["กฎหมายอิสลาม"]
labels["Islamism"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "อนุรักษนิยม", "ศาสนาอิสลาม", "เทวาธิปไตย"},
}
labels["Juche"] = {
type = "related-to",
description = "default",
parents = {"เกาหลีเหนือ", "communism", "ชาตินิยม"},
}
labels["justice"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["เคเอฟซี"] = {
type = "related-to",
description = "=the {{w|Kentucky Fried Chicken}} [[chain]] of [[fast-food]] [[restaurant]]s",
parents = {"ร้านอาหาร"},
}
labels["Kentucky Fried Chicken"] = labels["เคเอฟซี"]
labels["knitting"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ"},
}
labels["Ku Klux Klan"] = {
type = "related-to",
description = "default with the",
parents = {"องค์การ", "white supremacist ideology"},
}
labels["kyabakura industry"] = {
type = "related-to",
description = "default",
parents = {"hospitality", "กิจการ"},
}
labels["labour"] = {
type = "related-to",
description = "=[[labour]] or the {{w|labour movement}}",
parents = {"งาน", "ฝ่ายซ้าย"},
}
labels["laundry"] = {
type = "related-to",
description = "default",
parents = {"cleaning"},
}
labels["กฎหมาย"] = {
type = "related-to",
description = "=the [[science]] and [[practice]] of [[law]]",
parents = {"justice"},
}
labels["law"] = labels["กฎหมาย"]
labels["การบังคับใช้กฎหมาย"] = {
type = "related-to",
description = "default",
parents = {"crime prevention", "emergency services", "กฎหมาย"},
}
labels["law enforcement"] = labels["การบังคับใช้กฎหมาย"]
labels["law of obligations"] = {
type = "related-to",
description = "default with the no singularize",
breadcrumb = "obligations",
parents = {"กฎหมาย"},
}
labels["leatherworking"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ"},
}
labels["ฝ่ายซ้าย"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม"},
}
labels["leftism"] = labels["ฝ่ายซ้าย"]
labels["เสรีนิยม"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม"},
}
labels["liberalism"] = labels["เสรีนิยม"]
labels["อิสรนิยม"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม"},
}
labels["libertarianism"] = labels["อิสรนิยม"]
labels["logistics"] = {
type = "related-to",
description = "default no singularize",
parents = {"operations"},
}
labels["การจัดการ"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["management"] = labels["การจัดการ"]
labels["ลัทธิเหมา"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "ลัทธิคอมมิวนิสต์", "ลัทธิมากซ์"},
}
labels["Maoism"] = labels["ลัทธิเหมา"]
labels["การตลาด"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["marketing"] = labels["การตลาด"]
labels["ลัทธิมากซ์"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "สังคมนิยม"},
}
labels["Marxism"] = labels["ลัทธิมากซ์"]
labels["masculism"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "เพศชาย"},
}
labels["metalworking"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ", "โลหกรรม"},
}
labels["McDonald's"] = {
type = "related-to",
description = "=the {{w|McDonald's}} [[chain]] of [[fast-food]] [[restaurant]]s",
parents = {"ร้านอาหาร"},
}
labels["micronationalism"] = {
type = "related-to",
description = "default",
parents = {"ระบอบการปกครอง", "คตินิยม"},
}
labels["การทหาร"] = {
type = "related-to",
description = "default with the",
parents = {"สังคม"},
}
labels["military"] = labels["การทหาร"]
labels["military units"] = {
type = "related-to",
description = "default",
parents = {"การทหาร", "อาชีพ"},
}
labels["mining"] = {
type = "related-to",
description = "default",
parents = {"อุตสาหกรรม"},
}
labels["กษัตริย์นิยม"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "ราชาธิปไตย"},
}
labels["monarchism"] = labels["กษัตริย์นิยม"]
labels["ราชาธิปไตย"] = {
type = "related-to",
description = "default",
parents = {"ระบอบการปกครอง", "สังคมชั้นสูง"},
}
labels["monarchy"] = labels["ราชาธิปไตย"]
-- money: covers both currency and non-currency forms of money
labels["เงิน (ตัวกลาง)"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["money"] = labels["เงิน (ตัวกลาง)"]
labels["museums"] = {
type = "related-to",
description = "default",
parents = {"กิจการ", "การท่องเที่ยว", "ศิลปะ"},
}
labels["ชาตินิยม"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม"},
}
labels["nationalism"] = labels["ชาตินิยม"]
labels["ลัทธินาซี"] = {
type = "related-to",
description = "default",
parents = {"ลัทธิฟาสซิสต์", "white supremacist ideology", "คตินิยม"},
}
labels["Nazism"] = labels["ลัทธินาซี"]
labels["ลัทธินาซีใหม่"] = { -- Adjacent to Nazism, but not quite the same thing.
type = "related-to",
description = "default",
parents = {"ลัทธินาซี", "ลัทธิฟาสซิสต์", "white supremacist ideology", "คตินิยม"},
}
labels["neo-Nazism"] = labels["ลัทธินาซีใหม่"]
labels["Nobel Prize"] = {
type = "related-to",
description = "default with the",
parents = {"รางวัล"},
}
labels["Objectivism"] = {
type = "related-to",
description = "=the political philosophy of {{w|Objectivism}} developed by {{w|Ayn Rand}}",
parents = {"คตินิยม", "อิสรนิยม"},
}
labels["offices"] = {
type = "type",
description = "=offices, in the sense \"position of responsibility of some authority within an organisation\"",
parents = {"รัฐบาลและการปกครอง"},
}
labels["อุตสาหกรรมน้ำมัน"] = {
type = "related-to",
description = "default with the",
breadcrumb = "oil",
parents = {"อุตสาหกรรม", "ปิโตรเลียม"},
}
labels["oil industry"] = labels["อุตสาหกรรมน้ำมัน"]
labels["operations"] = {
type = "related-to",
description = "{{{langname}}} terms covering all operational matters in [[production]], [[logistics]], or [[services]].",
parents = {"การจัดการ", "systems theory"},
}
labels["องค์การ"] = {
type = "name",
description = "default",
parents = {"สังคม"},
}
labels["organizations"] = labels["องค์การ"]
labels["papermaking"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ", "อุตสาหกรรม"},
}
labels["กฎหมายสิทธิบัตร"] = {
type = "related-to",
description = "default",
breadcrumb = "patent",
parents = {"กฎหมาย"},
}
labels["patent law"] = labels["กฎหมายสิทธิบัตร"]
labels["peace"] = {
type = "related-to",
description = "default",
parents = {"security"},
}
labels["pensions"] = {
type = "related-to",
description = "default",
parents = {"การเงิน"},
}
labels["philanthropy"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["Philmont Scout Ranch"] = {
type = "related-to",
description = "={{w|Philmont Scout Ranch}}, a Scouting ranch in the United States",
parents = {"Scouting"},
}
labels["piracy"] = {
type = "related-to",
description = "default",
parents = {"อาชญากรรม", "การเดินเรือ"},
}
labels["การเมือง"] = {
type = "related-to",
description = "default no singularize",
parents = {"สังคม"},
}
labels["politics"] = labels["การเมือง"]
labels["poverty"] = {
type = "related-to",
description = "default",
parents = {"wealth"},
}
-- Thai does not inflect, so no separate demonym forms are needed.
for _, country_demonym in ipairs {
{"อาร์เจนตินา"},
{"ออสเตรเลีย"},
{"บังกลาเทศ"},
{"บราซิล"},
{"แคนาดา"},
{"ชิลี"},
{"จีน"},
{"ยุโรป"},
{"สหภาพยุโรป", nil, nil, "การเมืองยุโรป"},
{"ฝรั่งเศส", nil, nil, "การเมืองยุโรป"},
{"เยอรมนี", nil, nil, "การเมืองยุโรป"},
{"ฮ่องกง"},
{"ฮังการี", nil, nil, "การเมืองยุโรป"},
{"อินเดีย"},
{"อินโดนีเซีย"},
{"ไอร์แลนด์", nil, nil, "การเมืองยุโรป"},
{"ญี่ปุ่น"},
{"มาเลเซีย"},
{"เม็กซิโก"},
{"นิวซีแลนด์"},
{"ไนจีเรีย"},
{"ปากีสถาน"},
{"ปาเลสไตน์"},
{"เปรู"},
{"ฟิลิปปินส์"},
{"โปแลนด์", nil, nil, "การเมืองยุโรป"},
{"โปรตุเกส", nil, nil, "การเมืองยุโรป"},
{"รัสเซีย"},
{"สิงคโปร์"},
{"เซาท์แอฟริกา"},
{"เกาหลีใต้"},
{"สเปน", nil, nil, "การเมืองยุโรป"},
{"สวิตเซอร์แลนด์", nil, nil, "การเมืองยุโรป"},
{"ไต้หวัน"},
{"ยูเครน"},
{"สหราชอาณาจักร"},
{"สหรัฐอเมริกา"},
{"เวเนซุเอลา"},
{"เวียดนาม"},
} do
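-- Index explicitly rather than using unpack(): the entries contain nil holes, and the length of
-- such a table is not well defined.
local country, demonym, full_country, parent =
country_demonym[1], country_demonym[2], country_demonym[3], country_demonym[4]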
labels["การเมือง" .. country] = { -- Thai uses the same word for the country and its demonym
type = "related-to",
description = ("=the {{w|politics of %s}}"):format(full_country or country),
parents = {parent or "การเมือง", country},
}
end
labels["การพิมพ์"] = {
type = "related-to",
description = "default",
parents = {"อุตสาหกรรม"},
}
labels["printing"] = labels["การพิมพ์"]
labels["prison"] = {
type = "related-to",
description = "default",
parents = {"การบังคับใช้กฎหมาย", "อาคาร"},
}
labels["กฎหมายวิธีพิจารณาความ"] = {
type = "related-to",
description = "default",
breadcrumb = "procedural",
parents = {"กฎหมาย"},
}
labels["procedural law"] = labels["กฎหมายวิธีพิจารณาความ"]
labels["property law"] = {
type = "related-to",
description = "default",
breadcrumb = "property",
parents = {"กฎหมาย"},
}
labels["public administration"] = {
type = "related-to",
description = "=the field of [[public]] [[administration]]",
parents = {"รัฐบาลและการปกครอง"},
}
labels["public safety"] = {
type = "related-to",
description = "=the field of [[public]] [[safety]]",
parents = {"public administration", "security"},
}
labels["publishing"] = {
type = "related-to",
description = "default",
parents = {"อุตสาหกรรม", "สื่อมวลชน"},
}
labels["QAnon"] = {
type = "related-to",
description = "=the [[QAnon]] movement",
parents = {"alt-right", "ทฤษฎีสมคบคิด", "Donald Trump", "pedophilia"},
}
labels["queerphobia"] = {
type = "related-to",
description = "default",
parents = {"forms of discrimination", "แอลจีบีทีคิว"},
}
labels["เชื้อชาตินิยม"] = {
type = "related-to",
description = "default",
parents = {"forms of discrimination"},
}
labels["racism"] = labels["เชื้อชาตินิยม"]
labels["rape"] = {
type = "related-to",
description = "=the field of [[sexual violence]]",
parents = {"เพศ", "อาชญากรรม", "ความรุนแรง"},
}
labels["อสังหาริมทรัพย์"] = {
type = "related-to",
description = "default",
parents = {"อุตสาหกรรม", "housing"},
}
labels["real estate"] = labels["อสังหาริมทรัพย์"]
labels["ร้านอาหาร"] = {
type = "related-to",
description = "=[[restaurant]]s (including [[pub]]s, [[café]]s etc.)",
parents = {"กิจการ", "อาหารและเครื่องดื่ม"},
}
labels["restaurants"] = labels["ร้านอาหาร"]
labels["royal residences"] = {
type = "related-to",
description = "default",
parents = {"housing", "ราชาธิปไตย"},
}
labels["โรงเรียน"] = {
type = "related-to",
description = "default",
parents = {"การศึกษา", "อาคาร"},
}
labels["schools"] = labels["โรงเรียน"]
-- Note: this is the usual term, not "Scottish law".
labels["Scots law"] = {
type = "related-to",
description = "default",
breadcrumb = "Scots",
parents = {"กฎหมาย", "สกอตแลนด์"},
}
labels["Scouting"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["security"] = {
type = "related-to",
description = "default",
parents = {"สังคม"},
}
labels["sexism"] = {
type = "related-to",
description = "default",
parents = {"forms of discrimination", "สถานะเพศ"},
}
labels["sewing"] = {
type = "related-to",
description = "=[[sewing]], sewing tools, sewing [[technique]]s and so on",
parents = {"งานฝีมือ"},
}
labels["shoemaking"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ"},
}
labels["slavery"] = {
type = "related-to",
description = "default",
parents = {"สังคม", "งาน"},
}
labels["สังคมนิยม"] = {
type = "related-to",
description = "default",
parents = {"เศรษฐศาสตร์", "คตินิยม", "ฝ่ายซ้าย"},
}
labels["socialism"] = labels["สังคมนิยม"]
labels["social justice"] = {
type = "related-to",
description = "default",
parents = {"การเมือง", "สังคม", "สังคมวิทยา", "ฝ่ายซ้าย"},
}
labels["social security"] = {
type = "related-to",
description = "default",
parents = {"รัฐบาลและการปกครอง", "กฎหมาย", "เงิน (ตัวกลาง)"},
}
labels["spinning"] = {
type = "related-to",
description = "=[[spinning]], the process of making [[yarn]] or [[string]] from raw [[fiber]]",
parents = {"งานฝีมือ"},
}
labels["square dancing"] = {
type = "related-to",
description = "default",
parents = {"dance"},
}
labels["standards of identity"] = {
type = "related-to",
description = "default",
parents = {"กฎหมาย", "อาหารและเครื่องดื่ม"},
}
labels["ตลาดหลักทรัพย์"] = {
type = "related-to",
description = "default with the",
parents = {"การเงิน"},
}
labels["stock market"] = labels["ตลาดหลักทรัพย์"]
labels["stock symbols for companies"] = {
type = "name",
description = "=[[stock symbol]]s for [[company|companies]]",
parents = {"การค้า"},
}
labels["supply chain"] = {
type = "related-to",
description = "default no singularize",
parents = {"operations"},
}
labels["ภาษีอากร"] = {
type = "related-to",
description = "default",
parents = {"รัฐบาลและการปกครอง", "กฎหมาย", "เงิน (ตัวกลาง)"},
}
labels["taxation"] = labels["ภาษีอากร"]
labels["theft"] = {
type = "related-to",
description = "default",
parents = {"อาชญากรรม"},
}
labels["เทวาธิปไตย"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "ศาสนา"},
}
labels["theocracy"] = labels["เทวาธิปไตย"]
labels["อุตสาหกรรมป่าไม้"] = {
type = "related-to",
description = "default with the",
breadcrumb = "timber",
parents = {"อุตสาหกรรม"},
}
labels["timber industry"] = labels["อุตสาหกรรมป่าไม้"]
labels["เครื่องหมายการค้า"] = {
type = "related-to",
description = "=[[trademark]] [[law]]",
parents = {"ทรัพย์สินทางปัญญา"},
}
labels["trademark"] = labels["เครื่องหมายการค้า"]
labels["การค้า"] = {
type = "related-to",
description = "default",
parents = {"ธุรกิจ"},
}
labels["transphobia"] = {
type = "related-to",
description = "default",
parents = {"forms of discrimination", "transgender"},
}
labels["trust"] = {
type = "related-to",
description = "default",
parents = {"security"},
}
labels["types of settlements"] = {
type = "type",
topic = "การตั้งถิ่นฐาน",
description = "=[[การตั้งถิ่นฐาน]]",
parents = {"รัฐบาลและการปกครอง"},
}
labels["สหประชาชาติ"] = {
type = "related-to",
description = "=the [[United Nations Organization]]",
parents = {"องค์การ"},
}
labels["United Nations"] = labels["สหประชาชาติ"]
labels["มหาวิทยาลัย"] = {
type = "related-to",
description = "default",
parents = {"โรงเรียน"},
}
labels["universities"] = labels["มหาวิทยาลัย"]
labels["voting systems"] = {
type = "related-to",
description = "default",
parents = {"ประชาธิปไตย", "ระบบ"},
}
labels["wealth"] = {
type = "related-to",
description = "default",
parents = {"เศรษฐศาสตร์"},
}
labels["weaving"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ"},
}
labels["white supremacist ideology"] = {
type = "related-to",
description = "default",
parents = {"เชื้อชาตินิยม", "anti-Semitism", "คตินิยม"},
}
labels["woodworking"] = {
type = "related-to",
description = "default",
parents = {"งานฝีมือ"},
}
labels["Zionism"] = {
type = "related-to",
description = "default",
parents = {"คตินิยม", "ศาสนายูดาย", "อิสราเอล", "ชาตินิยม"},
}
return labels
h1sxr4lcx65b9zhj1tsjntxprzxubfi
ciudad
0
45949
5720706
2188019
2026-04-21T01:57:48Z
OctraBot
3198
5720706
wikitext
text/x-wiki
== ภาษาชาบากาโน ==
=== รากศัพท์ ===
{{inh+|cbk|es|ciudad}}
=== คำนาม ===
{{head|cbk|คำนาม}}
# [[นคร]], [[เมืองใหญ่]]
== ภาษานาวัตล์คลาสสิก ==
=== รากศัพท์ ===
{{bor+|nci|es|ciudad}}
=== คำนาม ===
{{head|nci|คำนาม|head=ciudād}}
# [[นคร]], [[เมืองใหญ่]]
=== อ้างอิง ===
* Lockhart, James. (2001) ''Nahuatl as Written'', Stanford University Press, page 215.
== ภาษาสเปน ==
=== รูปแบบอื่น ===
* {{alter|es|cibdad||โบราณ}}
=== รากศัพท์ ===
{{inh+|es|osp|cibdat}}, {{m|osp|cibdad|çibdad}} (เทียบ{{cog|lad|sivdad}}), จาก{{inh|es|la|cīvitātem}}, กรรมการกเอกพจน์ของ {{m|la|cīvitās||เมืองใหญ่}} (เทียบ{{cog|pt|cidade}}, {{cog|gl|cidade}})
=== การออกเสียง ===
{{es-pr|+<audio:Es-am-lat-ciudad.ogg;Audio (Latin America)>}}
=== คำนาม ===
{{es-noun|f}}
# [[นคร]], [[เมืองใหญ่]]
#: {{usex|es|[[vivir|Viven]] en la '''ciudad'''.|They live in the '''city'''.}}
#: {{usex|es|¡Qué '''ciudad''' tan grande y bonita!|What a large and beautiful '''city'''!}}
==== ลูกคำ ====
{{col-auto|es
| ciudad estado
| ciudad federal
| ciudadano
| ciudad fantasma
| ciudad universitaria
}}
==== คำเกี่ยวข้อง ====
{{col-auto|es|ciudadela|civil}}
==== คำสืบทอด ====
* {{desc|cbk|ciudad}}
* {{desc|bcl|siyudad|bor=1}}
* {{desc|ceb|siyudad|bor=1}}
* {{desc|ilo|siudad|bor=1}}
* {{desc|tl|siyudad|bor=1}}
==== ดูเพิ่ม ====
* {{l|es|aldea}}
* {{l|es|pueblo}}
=== อ่านเพิ่ม ===
* {{R:es:DRAE}}
frgjhc2o5ycdcdmlz74npf5jd4k1qt3
หมวดหมู่:term cleanup
14
46060
5720718
224816
2026-04-21T02:17:34Z
OctraBot
3198
5720718
wikitext
text/x-wiki
{{delete}}
35r2j9t4ectnt1cmb7mlgcqvwz6h5k6
ผู้ใช้:Octahedron80/อักษรไทธรรม
2
138742
5720678
4933435
2026-04-20T15:48:13Z
OctraBot
3198
5720678
wikitext
text/x-wiki
<div class="Lana" lang="nod">
; พยัญชนะ
# ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก
# ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น
# หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง)
#* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ)
#* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ)
#* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ)
# สะใหญ่ ᩔ (สฺส) พบได้ในคำบาลี
# พยัญชนะสะกด ปกติจะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้
#* เว้นแต่ ตัวอักษรก่อนหน้าเป็นสระล่าง หรือพยัญชนะเชิง จะเขียนเป็นตัวเต็มแทน
#* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง)
; สระ
# ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน
# ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u>
#* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง
# สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ)
หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ
{|
|-
|
{| class="wikitable"
|-
| 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ
|- style="background-color:lightgreen"
| 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ
|- style="background-color:lightgreen"
| 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ
|-
| 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ
|-
| 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ
|-
| 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ
|-
| 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ
|- style="background-color:moccasin"
| 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ
|-
| 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ
|-
| 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ
|-
| 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ
|-
| 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ
|-
| 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ
|-
| 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ
|-
| 15. || ᨠᩥ (อิ) || ᨠ + ᩥ
|-
| 16. || ᨠᩦ (อี) || ᨠ + ᩦ
|-
| 17. || ᨠᩧ (อึ) || ᨠ + ᩧ
|-
| 18. || ᨠᩨ (อือ) || ᨠ + ᩨ
|-
| 19. || ᨠᩩ (อุ) || ᨠ + ᩩ
|-
| 20. || ᨠᩪ (อู) || ᨠ + ᩪ
|}
|
{| class="wikitable"
|-
| 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ
|-
| 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ
|-
| 23. || ᨠᩮ (เอ) || ᨠ + ᩮ
|-
| 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ
|-
| 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ
|-
| 26. || ᨠᩯ (แอ) || ᨠ + ᩯ
|-
| 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ
|-
| 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ
|-
| 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ
|- style="background-color:pink"
| 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ
|- style="background-color:pink"
| 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ
|-
| 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ
|-
| 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ
|- style="background-color:lightgreen"
| 34. || ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ
|- style="background-color:lightgreen"
| 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ
|- style="background-color:pink"
| 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ
|- style="background-color:lightblue"
| 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ
|-
| 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ
|-
| 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ
|-
| 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ
|}
|
{| class="wikitable"
|-
| 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ
|-
| 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ
|-
| 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ
|-
| 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ
|-
| 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ
|-
| 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ
|-
| 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ
|-
| 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ
|- style="background-color:moccasin"
| 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ
|- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);"
| 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ
|-
| 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ
|-
| 52. || ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ
|}
|}
[1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่)
; วรรณยุกต์
# ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว
# ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที
; สัญลักษณ์อื่น ๆ
# ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่
#* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก
#* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป
#* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว
# ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป
# ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ
# การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺
; อื่น ๆ
# ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง
</div>
qm6x5xllgzi2ozmiktw8i3zmkq0zimy
5720679
5720678
2026-04-20T15:48:51Z
OctraBot
3198
5720679
wikitext
text/x-wiki
<div class="Lana" lang="nod">
; พยัญชนะ
# ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก
# ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น
# หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง)
#* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ)
#* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ)
#* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ)
# สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง)
# พยัญชนะสะกด ปกติจะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้
#* เว้นแต่ ตัวอักษรก่อนหน้าเป็นสระล่าง หรือพยัญชนะเชิง จะเขียนเป็นตัวเต็มแทน
#* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง)
; สระ
# ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน
# ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u>
#* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง
# สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ)
หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ
{|
|-
|
{| class="wikitable"
|-
| 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ
|- style="background-color:lightgreen"
| 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ
|- style="background-color:lightgreen"
| 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ
|-
| 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ
|-
| 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ
|-
| 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ
|-
| 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ
|- style="background-color:moccasin"
| 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ
|-
| 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ
|-
| 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ
|-
| 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ
|-
| 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ
|-
| 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ
|-
| 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ
|-
| 15. || ᨠᩥ (อิ) || ᨠ + ᩥ
|-
| 16. || ᨠᩦ (อี) || ᨠ + ᩦ
|-
| 17. || ᨠᩧ (อึ) || ᨠ + ᩧ
|-
| 18. || ᨠᩨ (อือ) || ᨠ + ᩨ
|-
| 19. || ᨠᩩ (อุ) || ᨠ + ᩩ
|-
| 20. || ᨠᩪ (อู) || ᨠ + ᩪ
|}
|
{| class="wikitable"
|-
| 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ
|-
| 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ
|-
| 23. || ᨠᩮ (เอ) || ᨠ + ᩮ
|-
| 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ
|-
| 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ
|-
| 26. || ᨠᩯ (แอ) || ᨠ + ᩯ
|-
| 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ
|-
| 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ
|-
| 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ
|- style="background-color:pink"
| 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ
|- style="background-color:pink"
| 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ
|-
| 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ
|-
| 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ
|- style="background-color:lightgreen"
| 34. || ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ
|- style="background-color:lightgreen"
| 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ
|- style="background-color:pink"
| 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ
|- style="background-color:lightblue"
| 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ
|-
| 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ
|-
| 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ
|-
| 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ
|}
|
{| class="wikitable"
|-
| 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ
|-
| 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ
|-
| 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ
|-
| 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ
|-
| 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ
|-
| 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ
|-
| 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ
|-
| 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ
|- style="background-color:moccasin"
| 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ
|- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);"
| 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ
|-
| 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ
|-
| 52. || ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ
|}
|}
[1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่)
; วรรณยุกต์
# ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว
# ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที
; สัญลักษณ์อื่น ๆ
# ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่
#* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก
#* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป
#* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว
# ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป
# ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ
# การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺
; อื่น ๆ
# ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง
</div>
tj9kb6j76jx44mh20iy6f09od6zyny1
5720680
5720679
2026-04-20T15:52:52Z
OctraBot
3198
5720680
wikitext
text/x-wiki
<div class="Lana" lang="nod">
; พยัญชนะ
# ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก
# ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น
# หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง)
#* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ)
#* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ)
#* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ)
# สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง)
# พยัญชนะสะกด <u>โดยปกติ</u>จะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้
#* เว้นแต่ ตัวอักษรก่อนหน้าเป็นสระล่าง หรือพยัญชนะเชิง จะเขียนเป็นตัวเต็มแทน
#* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง)
; สระ
# ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน
# ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u>
#* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง
# สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ)
หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ
{|
|-
|
{| class="wikitable"
|-
| 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ
|- style="background-color:lightgreen"
| 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ
|- style="background-color:lightgreen"
| 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ
|-
| 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ
|-
| 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ
|-
| 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ
|-
| 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ
|- style="background-color:moccasin"
| 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ
|-
| 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ
|-
| 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ
|-
| 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ
|-
| 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ
|-
| 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ
|-
| 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ
|-
| 15. || ᨠᩥ (อิ) || ᨠ + ᩥ
|-
| 16. || ᨠᩦ (อี) || ᨠ + ᩦ
|-
| 17. || ᨠᩧ (อึ) || ᨠ + ᩧ
|-
| 18. || ᨠᩨ (อือ) || ᨠ + ᩨ
|-
| 19. || ᨠᩩ (อุ) || ᨠ + ᩩ
|-
| 20. || ᨠᩪ (อู) || ᨠ + ᩪ
|}
|
{| class="wikitable"
|-
| 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ
|-
| 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ
|-
| 23. || ᨠᩮ (เอ) || ᨠ + ᩮ
|-
| 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ
|-
| 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ
|-
| 26. || ᨠᩯ (แอ) || ᨠ + ᩯ
|-
| 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ
|-
| 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ
|-
| 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ
|- style="background-color:pink"
| 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ
|- style="background-color:pink"
| 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ
|-
| 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ
|-
| 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ
|- style="background-color:lightgreen"
| 34. || ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ
|- style="background-color:lightgreen"
| 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ
|- style="background-color:pink"
| 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ
|- style="background-color:lightblue"
| 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ
|-
| 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ
|-
| 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ
|-
| 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ
|}
|
{| class="wikitable"
|-
| 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ
|-
| 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ
|-
| 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ
|-
| 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ
|-
| 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ
|-
| 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ
|-
| 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ
|-
| 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ
|- style="background-color:moccasin"
| 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ
|- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);"
| 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ
|-
| 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ
|-
| 52. || ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ
|}
|}
[1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่)
; วรรณยุกต์
# ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว
# ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที
; สัญลักษณ์อื่น ๆ
# ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่
#* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก
#* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป
#* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว
# ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป
# ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ
# การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺
; อื่น ๆ
# ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง
</div>
lbvclqxminzaqfe5a5o1fdhj3kxlhqu
5720681
5720680
2026-04-20T15:53:45Z
OctraBot
3198
5720681
wikitext
text/x-wiki
<div class="Lana" lang="nod">
; พยัญชนะ
# ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก
# ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น
# หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง)
#* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ)
#* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ)
#* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ)
# สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง)
# พยัญชนะสะกด <u>โดยปกติ</u>จะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้
#* หากตัวอักษรก่อนหน้าเป็นสระล่าง หรือมีพยัญชนะเชิงอยู่แล้ว จะเขียนเป็นตัวเต็มแทน
#* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง)
; สระ
# ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน
# ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u>
#* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง
# สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ)
หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ
{|
|-
|
{| class="wikitable"
|-
| 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ
|- style="background-color:lightgreen"
| 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ
|- style="background-color:lightgreen"
| 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ
|-
| 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ
|-
| 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ
|-
| 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ
|-
| 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ
|- style="background-color:moccasin"
| 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ
|-
| 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ
|-
| 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ
|-
| 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ
|-
| 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ
|-
| 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ
|-
| 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ
|-
| 15. || ᨠᩥ (อิ) || ᨠ + ᩥ
|-
| 16. || ᨠᩦ (อี) || ᨠ + ᩦ
|-
| 17. || ᨠᩧ (อึ) || ᨠ + ᩧ
|-
| 18. || ᨠᩨ (อือ) || ᨠ + ᩨ
|-
| 19. || ᨠᩩ (อุ) || ᨠ + ᩩ
|-
| 20. || ᨠᩪ (อู) || ᨠ + ᩪ
|}
|
{| class="wikitable"
|-
| 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ
|-
| 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ
|-
| 23. || ᨠᩮ (เอ) || ᨠ + ᩮ
|-
| 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ
|-
| 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ
|-
| 26. || ᨠᩯ (แอ) || ᨠ + ᩯ
|-
| 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ
|-
| 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ
|-
| 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ
|- style="background-color:pink"
| 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ
|- style="background-color:pink"
| 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ
|-
| 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ
|-
| 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ
|- style="background-color:lightgreen"
| 34. || ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ
|- style="background-color:lightgreen"
| 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ
|- style="background-color:pink"
| 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ
|- style="background-color:lightblue"
| 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ
|-
| 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ
|-
| 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ
|-
| 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ
|}
|
{| class="wikitable"
|-
| 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ
|-
| 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ
|-
| 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ
|-
| 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ
|-
| 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ
|-
| 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ
|-
| 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ
|-
| 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ
|- style="background-color:moccasin"
| 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ
|- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);"
| 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ
|-
| 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ
|-
| 52. || ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ
|}
|}
[1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่)
; วรรณยุกต์
# ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว
# ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที
; สัญลักษณ์อื่น ๆ
# ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่
#* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก
#* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป
#* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว
# ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป
# ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ
# การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺
; อื่น ๆ
# ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง
</div>
30akd007a6l88syyev9wkgmvlkl5wyf
5720682
5720681
2026-04-20T15:55:24Z
OctraBot
3198
5720682
wikitext
text/x-wiki
<div class="Lana" lang="nod">
; พยัญชนะ
# ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก
# ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น
# หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง)
#* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ)
#* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ)
#* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ)
# สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง)
# พยัญชนะสะกด <u>โดยปกติ</u>จะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ตาม
#* หากตัวอักษรก่อนหน้าเป็นสระล่าง หรือมีพยัญชนะเชิงอยู่แล้ว จะเขียนเป็นตัวเต็มแทน
#* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง)
; สระ
# ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน
# ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u>
#* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง
# สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ)
หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ
{|
|-
|
{| class="wikitable"
|-
| 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ
|- style="background-color:lightgreen"
| 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ
|- style="background-color:lightgreen"
| 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ
|-
| 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ
|-
| 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ
|-
| 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ
|-
| 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ
|- style="background-color:moccasin"
| 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ
|-
| 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ
|-
| 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ
|-
| 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ
|-
| 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ
|-
| 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ
|-
| 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ
|-
| 15. || ᨠᩥ (อิ) || ᨠ + ᩥ
|-
| 16. || ᨠᩦ (อี) || ᨠ + ᩦ
|-
| 17. || ᨠᩧ (อึ) || ᨠ + ᩧ
|-
| 18. || ᨠᩨ (อือ) || ᨠ + ᩨ
|-
| 19. || ᨠᩩ (อุ) || ᨠ + ᩩ
|-
| 20. || ᨠᩪ (อู) || ᨠ + ᩪ
|}
|
{| class="wikitable"
|-
| 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ
|-
| 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ
|-
| 23. || ᨠᩮ (เอ) || ᨠ + ᩮ
|-
| 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ
|-
| 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ
|-
| 26. || ᨠᩯ (แอ) || ᨠ + ᩯ
|-
| 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ
|-
| 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ
|-
| 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ
|- style="background-color:pink"
| 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ
|- style="background-color:pink"
| 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ
|-
| 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ
|-
| 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ
|- style="background-color:lightgreen"
| 34. || ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ
|- style="background-color:lightgreen"
| 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ
|- style="background-color:pink"
| 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ
|- style="background-color:lightblue"
| 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ
|-
| 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ
|-
| 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ
|-
| 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ
|}
|
{| class="wikitable"
|-
| 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ
|-
| 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ
|-
| 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ
|-
| 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ
|-
| 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ
|-
| 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ
|-
| 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ
|-
| 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ
|- style="background-color:moccasin"
| 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ
|- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);"
| 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ
|-
| 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ
|-
| 52. || ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ
|}
|}
[1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่)
; วรรณยุกต์
# ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว
# ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที
; สัญลักษณ์อื่น ๆ
# ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่
#* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก
#* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป
#* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว
# ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป
# ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ
# การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺
; อื่น ๆ
# ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง
</div>
pwrtbkr9taduu8la2loswbh8m59w7yx
ဢဝ်
0
166994
5720726
1893283
2026-04-21T04:49:55Z
Ai Ku Karng
17824
/* ภาษาไทใหญ่ */
5720726
wikitext
text/x-wiki
{{also/auto}}
== ภาษาไทใหญ่ ==
=== รากศัพท์ ===
{{inh+|shn|tai-pro|*ʔawᴬ}}; ร่วมเชื้อสายกับ{{cog|th|เอา}}, {{cog|nod|ᩐᩣ}}, {{cog|lo|ເອົາ}}, {{cog|khb|ᦀᧁ}}, {{cog|blt|ꪹꪮꪱ}}, {{cog|aho|𑜒𑜧}} หรือ {{m|aho|𑜒𑜧𑜈𑜫}} หรือ {{m|aho|𑜒𑜨𑜧}}, {{cog|za|aeu}}, {{cog|tdd|ᥟᥝ}}
=== การออกเสียง ===
{{shn-pron}}
=== คำกริยา ===
{{shn-verb}}
# [[เอา]]
# [[ทำให้]]
#: {{ux|shn|'''ဢဝ်'''[[ႁၢႆ]]|'''ทำ'''หาย}}
#: {{ux|shn|[[မၼ်း]]'''ဢဝ်'''[[ငိုၼ်း]][[ႁၢႆ]]|มัน'''ทำ'''เงินหาย}}
erxsga6hkzdokiirhkhjiewu59h2hoh
Chiang Mai
0
169809
5720704
1610303
2026-04-21T01:55:51Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่ในไทย\}\} +|นครในไทย}})
5720704
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
{{wikipedia|lang=en}}
=== รากศัพท์ ===
{{bor|en|th|เชียงใหม่}}
=== คำวิสามานยนาม ===
{{en-proper noun|head=Chiang Mai}}
# [[เชียงใหม่]] (ทั้งจังหวัดและเมือง)
{{topics|en|จังหวัดในไทย|นครในไทย}}
q2p828hw75to4dkl8x3x42gvznvqw6m
ᥓᥣᥭᥰ
0
174184
5720738
1647641
2026-04-21T06:27:17Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720738
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*ʑaːjᴬ||เพศชาย}}; ร่วมเชื้อสายกับ{{cog|th|ชาย}}, {{cog|nod|ᨩᩣ᩠ᨿ}}, {{cog|lo|ຊາຍ}}, {{cog|khb|ᦋᦻ}}, {{cog|shn|ၸၢႆး}}, {{cog|blt|ꪋꪱꪥ}}, {{cog|aho|𑜋𑜩}}, {{cog|za|sai}}
=== การออกเสียง ===
* {{IPA|tdd|/t͡saːj˥˧/}}
=== คำนาม ===
{{tdd-noun}}
# [[ผู้ชาย]], [[ชาย]]
oefzwoe88g93w4ws2v5em4oz398ekjj
ᥐᥣ
0
174198
5720747
1643221
2026-04-21T06:53:32Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720747
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== การออกเสียง ===
* {{IPA|tdd|/kaː˧˧/}}
=== รากศัพท์ 1 ===
==== คำนาม ====
{{tdd-noun}}
# [[กา]] (นก)
=== รากศัพท์ 2 ===
{{bor+|tdd|ltc|-}} {{ltc-l|價}}; ร่วมเชื้อสายกับ{{cog|th|ค่า}}, {{cog|lo|ຄ່າ}}, {{cog|tts|ค่า}}, {{cog|nod|ᨣ᩵ᩤ}}, {{cog|kkh|ᨣ᩵ᩤ}}, {{cog|khb|ᦅᦱᧈ}}, {{cog|shn|ၵႃႈ}}, {{cog|blt|ꪁ꪿ꪱ}}, {{cog|aho|𑜀𑜠}}
==== คำนาม ====
{{tdd-noun}}
# [[ราคา]], [[ค่า]]
=== รากศัพท์ 3 ===
แผลงมาจาก {{m|tdd|ᥐᥣᥳ}}
==== คำกริยา ====
{{tdd-verb}}
# {{lb|tdd|สกรรม}} [[ค้า]], [[ทำ]][[การค้า]]
hcitzqq0njtmo0r3iy9gdl8btxkrfdj
ᥛᥤᥰ
0
174210
5720723
1647678
2026-04-21T04:40:16Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720723
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*miːᴬ}}; ร่วมเชื้อสายกับ{{cog|th|มี}}, {{cog|nod|ᨾᩦ}}, {{cog|kkh|ᨾᩦ}}, {{cog|lo|ມີ}}, {{cog|khb|ᦙᦲ}}, {{cog|blt|ꪣꪲ}}, {{cog|shn|မီး}}, {{cog|za|miz}}
=== การออกเสียง ===
* {{IPA|tdd|/miː˥˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[มี]]
rtufcetbbjts85zkrz89f1wwfubxunz
5720776
5720723
2026-04-21T07:14:18Z
Ai Ku Karng
17824
5720776
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*miːᴬ}}; ร่วมเชื้อสายกับ{{cog|th|มี}}, {{cog|nod|ᨾᩦ}}, {{cog|kkh|ᨾᩦ}}, {{cog|lo|ມີ}}, {{cog|khb|ᦙᦲ}}, {{cog|blt|ꪣꪲ}}, {{cog|shn|မီး}}, {{cog|za|miz}}
=== การออกเสียง ===
* {{IPA|tdd|/mi˥˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[มี]]
a2p0d5twwvtvsj9v6x9zwj6qp82t8ee
มอดูล:data consistency check
828
211720
5720745
2691044
2026-04-21T06:47:28Z
OctraBot
3198
5720745
Scribunto
text/plain
-- TODO:
-- ietf_subtag field used with a 2/3-letter language/family code except qaa-qtz, or a 4-letter script code.
-- Check against files containing up-to-date ISO data, to cross-check validity.
local export = {}
local mw = mw
local require = require
local string = string
local Array = require("Module:array")
local m_en_utilities = require("Module:en-utilities")
local m_etym_languages_canonical_names = require("Module:etymology languages/canonical names")
local m_etym_languages_codes = require("Module:etymology languages/code to canonical name")
local m_etym_languages_data = require("Module:etymology languages/data")
local m_families = require("Module:families")
local m_families_canonical_names = require("Module:families/canonical names")
local m_families_codes = require("Module:families/code to canonical name")
local m_families_data = require("Module:families/data")
local m_languages = require("Module:languages")
local m_languages_canonical_names = require("Module:languages/canonical names")
local m_languages_codes = require("Module:languages/code to canonical name")
local m_languages_data_all = require("Module:languages/data/all")
local m_load = require("Module:load")
local m_scripts = require("Module:scripts")
local m_scripts_canonical_names = require("Module:scripts/canonical names")
local m_scripts_codes = require("Module:scripts/code to canonical name")
local m_scripts_data = require("Module:scripts/data")
local m_str_utils = require("Module:string utilities")
local m_table = require("Module:table")
local add_indefinite_article = m_en_utilities.add_indefinite_article
local codepoint = m_str_utils.codepoint
local concat = table.concat
local dump = mw.dumpObject
local format = string.format
local gcodepoint = m_str_utils.gcodepoint
local get_data_module_name = m_languages.getDataModuleName
local get_family_by_code = m_families.getByCode
local get_family_by_canonical_name = m_families.getByCanonicalName
local get_indefinite_article = m_en_utilities.get_indefinite_article
local get_language_by_code = m_languages.getByCode
local get_language_by_canonical_name = m_languages.getByCanonicalName
local get_script_by_code = m_scripts.getByCode
local get_script_by_canonical_name = m_scripts.getByCanonicalName
local gmatch = string.gmatch
local gsub = string.gsub
local insert = table.insert
local ipairs = ipairs
local is_callable = require("Module:fun").is_callable
local is_positive_integer = require("Module:math").is_positive_integer
local is_known_language_tag = mw.language.isKnownLanguageTag
local isutf8 = mw.ustring.isutf8
local json_decode = mw.text.jsonDecode
local language_link = require("Module:links").language_link
local list_to_set = m_table.listToSet
local list_to_text = mw.text.listToText
local load_data = m_load.load_data
local log = mw.log
local main_loader = package.loaders[2]
local make_family = m_families.makeObject
local make_lang = m_languages.makeObject
local make_script = m_scripts.makeObject
local match = string.match
local new_title = mw.title.new
local next = next
local pairs = pairs
local pcall = pcall
local remove_comments = require("Module:string/removeComments")
local safe_require = m_load.safe_require
local sorted_pairs = m_table.sortedPairs
local split = m_str_utils.split
local sub = string.sub
local table_len = m_table.length
local tag_text = require("Module:script utilities").tag_text
local type = type
local umatch = m_str_utils.match
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local aliases = require("Module:languages/data").aliases
local messages
local function discrepancy(modname, ...)
local success, result = pcall(function(...)
messages[modname]:insert(format(...))
end, ...)
if not success then
log(result, ...)
end
end
local messages_mt = {}
function messages_mt:__index(k)
local val = Array()
self[k] = val
return val
end
local all_codes = {}
local language_names = {}
local etym_language_names = {}
local family_names = {}
local script_names = {}
local nonempty_families = {}
local allowed_empty_families = {tbq = true}
local nonempty_scripts = {}
local function link(obj, code_first)
return type(obj) == "string" and obj or
code_first and format("<code>%s</code> (%s)", obj:getCode(), obj:makeCategoryLink()) or
format("%s (<code>%s</code>)", obj:makeCategoryLink(), obj:getCode())
end
local function check_data_keys(...)
local valid_keys = Array(...):toSet()
return function (modname, obj, data)
local invalid_keys
for k in pairs(data) do
if not valid_keys[k] then
if not invalid_keys then
invalid_keys = Array(k)
else
invalid_keys:insert(k)
end
end
end
if invalid_keys == nil then
return
end
local plural = #invalid_keys ~= 1
discrepancy(modname,
"The data key%s %s for %s %s invalid.",
plural and "s" or "",
invalid_keys:map(function(key)
return "<code>" .. key .. "</code>"
end):concat(", "),
link(obj),
plural and "are" or "is"
)
end
end
-- Modification of isArray in [[Module:table]].
-- This assumes all keys are either integers or non-numbers.
-- If there are fractional numbers, the results might be incorrect.
-- For instance, find_gap{"a", "b", [0.5] = true} evaluates to 3, but there
-- isn't a gap at 3 in the sense of there being an integer key greater than 3.
local function find_gap(t, can_contain_non_number_keys)
local i = 0
for k in pairs(t) do
if not (can_contain_non_number_keys and type(k) ~= "number") then
i = i + 1
if t[i] == nil then
return i
end
end
end
end
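-- Illustrative usage of find_gap (a sketch for readers, kept as comments so it never runs;
-- the sample tables are hypothetical and the results follow from the loop above):
--   find_gap({"a", "b", "c"})             --> nil (keys 1 to 3 are all present)
--   find_gap({"a", nil, "c"})             --> 2   (two keys exist, but index 2 is empty)
--   find_gap({"a", "b", x = true})        --> 3   (the string key is counted as if it were an index)
--   find_gap({"a", "b", x = true}, true)  --> nil (non-number keys are skipped)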
local function check_true_or_string_or_nil(modname, obj, data, key)
local field = data[key]
if not (field == nil or field == true or type(field) == "string") then
discrepancy(modname,
"%s has %s <code>%s</code> value that is not <code>nil</code>, <code>true</code> or a string: <code>%s</code>",
link(obj), get_indefinite_article(key), key, dump(data[key])
)
end
end
local function check_array(modname, obj, data, array_name, parent_array_name, can_contain_non_number_keys)
local parent_table = data
if parent_array_name then
parent_table = assert(data[parent_array_name], parent_array_name)
parent_array_name = "the <code>" .. parent_array_name .. "</code> field in "
else
parent_array_name = ""
end
local array_type = type(parent_table[array_name])
if array_type == "table" then
local gap = find_gap(parent_table[array_name], can_contain_non_number_keys)
if gap then
discrepancy(modname,
"The <code>%s</code> array in %sthe data table for %s has a gap at index %d.",
array_name,
parent_array_name,
link(obj),
gap
)
else
return true
end
else
discrepancy(modname,
"The <code>%s</code> field in %sthe data table for %s should be an array (table) but is %s.",
array_name,
parent_array_name,
link(obj),
array_type == "nil" and "nil" or "a " .. array_type
)
end
end
local function check_no_alias_codes(modname, mod_data)
local lookup, discrepancies = {}, {}
for k, v in pairs(mod_data) do
local check = lookup[v]
if check then
discrepancies[check] = discrepancies[check] or {"<code>" .. check .. "</code>"}
insert(discrepancies[check], "<code>" .. k .. "</code>")
else
lookup[v] = k
end
end
for _, v in pairs(discrepancies) do
discrepancy(modname,
"The codes %s are currently alias codes. Only one code should be used in the data.",
list_to_text(v, ", ", " and ")
)
end
end
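-- Illustrative case for check_no_alias_codes (comments only; the module name and the
-- codes below are hypothetical, and `messages` is assumed to have been initialised):
--   local shared = {"Example"}
--   check_no_alias_codes("languages/data/example", {aaa = shared, bbb = shared})
-- Both codes map to the same table, so a single discrepancy naming <code>aaa</code> and
-- <code>bbb</code> is recorded for that module. The order in which the two codes are
-- listed depends on the iteration order of pairs().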
local function check_wikidata_item(modname, obj, data, key)
local data_item = data[key]
if data_item == nil or is_positive_integer(data_item) then
return
end
discrepancy(modname,
"%s has a Wikidata item ID that is not a positive integer: <code>%s</code>",
link(obj), dump(data_item)
)
end
local function check_name_field(modname, obj, data, canonical_name, data_key, allow_nested, allow_canonical_name_in_table)
local array = data[data_key]
if not array then
return
end
check_array(modname, obj, data, data_key, nil, true)
local names = {}
local function check_other_name(other_name)
if not allow_canonical_name_in_table and other_name == canonical_name then
discrepancy(modname,
"%s has its canonical name (<code>%s</code>) repeated in the table of <code>%s</code>.",
link(obj), dump(canonical_name), data_key
)
end
if names[other_name] then
discrepancy(modname,
"The name %s is found twice or more in the list of <code>%s</code> for %s.",
other_name, data_key, link(obj)
)
end
names[other_name] = true
end
for _, other_name in ipairs(array) do
if type(other_name) == "table" then
if not allow_nested then
discrepancy(modname,
"A nested table is found in the list of <code>%s</code> for %s, but isn't allowed.",
data_key, link(obj)
)
else
for _, on in ipairs(other_name) do
check_other_name(on)
end
end
else
check_other_name(other_name)
end
end
end
local function check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if data.other_names then
check_name_field(modname, obj, data, canonical_name, "other_names")
end
if data.aliases then
check_name_field(modname, obj, data, canonical_name, "aliases")
end
if data.varieties then
-- Sometimes a variety legitimately has the same name as the language as a whole, so allow that.
check_name_field(modname, obj, data, canonical_name, "varieties", "allow_nested", "allow_canonical_name_in_table")
end
end
local function validate_pattern(pattern, modname, obj, standard_chars)
if type(pattern) ~= "string" then
return discrepancy(modname,
"\"%s\", the %spattern for %s, is not a string.",
pattern, standard_chars and "standard character " or "", link(obj)
)
elseif not isutf8(pattern) then
return discrepancy(modname,
"%s specifies a pattern for %scharacter detection which is not valid UTF-8: <code>%s</code>",
link(obj), standard_chars and "standard " or "", dump(pattern)
)
end
local ranges
for lower, higher in gmatch(pattern, "(.[\128-\191]*)%-%%?(.[\128-\191]*)") do
if codepoint(lower) >= codepoint(higher) then
ranges = ranges or Array()
insert(ranges, { lower, higher })
end
end
if ranges and ranges[1] then
local plural = #ranges ~= 1 and "s" or ""
discrepancy(modname,
"%s specifies an invalid pattern " ..
"for %scharacter detection: <code>%s</code>. The first codepoint%s " ..
"in the range%s %s must be less than the second.",
link(obj), standard_chars and "standard " or "", dump(pattern), plural, plural,
ranges:map(function(range)
-- Pass the range bounds as arguments so that a literal "%" in the pattern cannot be misread as a format directive.
return format("%s-%s (U+%X, U+%X)", range[1], range[2], codepoint(range[1]), codepoint(range[2]))
end):concat(", ")
)
end
local success, result = pcall(umatch, "", "[" .. pattern .. "]")
if not success then
discrepancy(modname,
"%s specifies an invalid pattern for %scharacter detection: <code>%s</code> (%s)",
link(obj), standard_chars and "standard " or "", dump(pattern), result
)
end
end
local remove_exceptions_addition = 0xF0000
local maximum_code_point = 0x10FFFF
local remove_exceptions_maximum_code_point = maximum_code_point - remove_exceptions_addition
-- TODO: check modules exist.
-- TODO: validate script codes and check inner tables.
local function check_replacement_data(modname, obj, data, key, func_name)
local replacements = data[key]
if replacements == nil then
return
end
local replacements_type = type(replacements)
if replacements_type == "string" then
local mod = main_loader("Module:" .. replacements)
if not mod then
discrepancy(modname,
"The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which does not exist.",
key, link(obj), replacements
)
else
mod = mod()
if not (type(mod) == "table" and is_callable(mod[func_name])) then
discrepancy(modname,
"The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which exists, but does not contain the expected function <code>%s()</code>.",
key, link(obj), replacements, func_name
)
end
end
return
elseif replacements_type ~= "table" then
discrepancy(modname,
"The <code>%s</code> field in the data table for %s must be a string or table, not a %s.",
key, link(obj), replacements_type
)
return
end
local from, to = replacements.from, replacements.to
if (from ~= nil) ~= (to ~= nil) then
discrepancy(modname,
"The <code>from</code> and <code>to</code> arrays in the <code>%s</code> table for %s are not both defined or both undefined.",
key, link(obj)
)
elseif from then
for _, k in ipairs {"from", "to"} do
check_array(modname, obj, data, k, key)
end
end
local remove_diacritics = replacements.remove_diacritics
if not (remove_diacritics == nil or type(remove_diacritics) == "string") then
discrepancy(modname,
"The <code>remove_diacritics</code> field in the <code>%s</code> table for %s must be a string.",
key, link(obj)
)
end
local remove_exceptions = replacements.remove_exceptions
if remove_exceptions then
if check_array(modname, obj, data, "remove_exceptions", key) then
for sequence_i, sequence in ipairs(remove_exceptions) do
local code_point_i = 0
for code_point in gcodepoint(sequence) do
code_point_i = code_point_i + 1
if code_point > remove_exceptions_maximum_code_point then
discrepancy(modname,
"Code point #%d (0x%04X) in field #%d of the <code>remove_exceptions</code> array for %s is over U+%04X.",
code_point_i, code_point, sequence_i, link(obj), remove_exceptions_maximum_code_point
)
end
end
end
end
end
if from and to and table_len(to) > table_len(from) then
discrepancy(modname,
"The <code>from</code> array in the <code>%s</code> table for %s must be longer than or the same length as the <code>to</code> array.",
key, link(obj)
)
end
end
local function check_replacements_data(modname, obj, data)
for _, replacement_spec in ipairs{
{"translit", "tr"},
{"display_text", "makeDisplayText"},
{"strip_diacritics", "stripDiacritics"},
{"sort_key", "makeSortKey"},
} do
check_replacement_data(modname, obj, data, unpack(replacement_spec))
end
end
local function has_ancestor(lang, code)
for _, anc in ipairs(lang:getAncestors()) do
if code == anc:getCode() or has_ancestor(anc, code) then
return true
end
end
end
local function get_default_ancestors(lang)
if lang:hasType("language", "etymology-only") then
local parent = lang:getParent()
if not has_ancestor(parent, lang:getCode()) then
return parent:getAncestorCodes()
end
end
local fam_code, def_anc = lang:getFamilyCode()
while fam_code and fam_code ~= "qfa-not" do
local fam = m_families_data[fam_code]
def_anc = fam.protoLanguage or
m_languages_data_all[fam_code .. "-pro"] and fam_code .. "-pro" or
m_etym_languages_data[fam_code .. "-pro"] and fam_code .. "-pro"
if def_anc and def_anc ~= lang:getCode() then
return {def_anc}
end
fam_code = fam[3]
end
end
local function iterate_ancestor(obj, modname, anc_code)
local anc = get_language_by_code(anc_code, nil, true)
if not anc then
discrepancy(modname,
"%s lists the invalid language code <code>%s</code> as its ancestor.",
link(obj), dump(anc_code)
)
return
end
local anc_fam = anc:getFamily()
if not anc_fam then
discrepancy(modname,
"%s has no family.",
link(anc)
)
return
end
local anc_fam_code = anc_fam:getCode()
local def_ancs = get_default_ancestors(obj)
if def_ancs then
for _, def_anc in ipairs(def_ancs) do
def_anc = get_language_by_code(def_anc, nil, true)
if def_anc and (
anc_code == def_anc:getCode() or
has_ancestor(def_anc, anc_code) or
def_anc:hasParent(anc_code) and not has_ancestor(anc, def_anc:getCode())
) then
discrepancy(modname,
"%s has the ancestor %s listed in its ancestor field, which is redundant, since it is determined to be ancestral automatically.",
link(obj), link(anc)
)
end
end
end
if not obj:inFamily(anc_fam_code) then
discrepancy(modname,
"%s has %s set as an ancestor, but is not in the %s.",
link(obj), link(anc), link(anc_fam)
)
end
local fam, proto = obj
repeat
fam = fam:getFamily()
proto = fam and fam:getProtoLanguage()
until proto or not fam or fam:getCode() == "qfa-not"
if proto and not (
proto:getCode() == anc:getCode() or
proto:hasAncestor(anc:getCode()) or
anc:hasAncestor(proto:getCode())
) then
local fam = obj:getFamily()
discrepancy(modname,
"%s is in the %s and has %s set as an ancestor, but it is not possible to form an ancestral chain between them.",
link(obj), link(fam), link(anc)
)
end
end
local function check_ancestors(modname, obj, data)
local ancestors = data.ancestors
if ancestors == nil then
return
end
local ancestors_type = type(ancestors)
if ancestors_type == "string" then
ancestors = split(ancestors, ",", true, true)
elseif ancestors_type ~= "table" then
discrepancy(modname,
"The <code>ancestors</code> field in the data table for %s must be a string or table, not a %s.",
link(obj), ancestors_type
)
-- Nothing to iterate over if the field is neither a string nor a table.
return
end
for _, anc in ipairs(ancestors) do
iterate_ancestor(obj, modname, anc)
end
end
local function check_wikimedia_codes(modname, obj, data)
local wikimedia_codes = data.wikimedia_codes
if wikimedia_codes == nil then
return
end
local wikimedia_codes_type = type(wikimedia_codes)
if wikimedia_codes_type == "string" then
wikimedia_codes = split(wikimedia_codes, ",", true, true)
elseif wikimedia_codes_type ~= "table" then
discrepancy(modname,
"The <code>wikimedia_codes</code> field in the data table for %s must be a string or table, not a %s.",
link(obj), wikimedia_codes_type
)
-- Nothing to iterate over if the field is neither a string nor a table.
return
end
for _, code in ipairs(wikimedia_codes) do
if not is_known_language_tag(code) then
discrepancy(modname,
"%s lists the invalid Wikimedia code <code>%s</code> in the <code>wikimedia_codes</code> field.",
link(obj), dump(code)
)
end
end
end
local function check_code_to_name_and_name_to_code_maps(
source_module_type,
source_module_description,
code_to_module_map, name_to_code_map,
code_to_name_modname, code_to_name_module,
name_to_code_modname, name_to_code_module
)
local function check_code_and_name(modname, code, canonical_name)
-- Check the code is in code_to_module_map and that it didn't originate from the wrong data module.
local check_mod = code_to_module_map[code] or code_to_module_map[aliases[code]]
if not (check_mod and match(check_mod, "^" .. source_module_type .. "/data")) then
if not name_to_code_map[canonical_name] then
discrepancy(modname,
"The code <code>%s</code> and the canonical name %s should be removed; they are not found in %s.",
code, canonical_name, source_module_description
)
else
discrepancy(modname,
"<code>%s</code>, the code for the canonical name %s, is wrong; it should be <code>%s</code>.",
code, canonical_name, name_to_code_map[canonical_name]
)
end
elseif not name_to_code_map[canonical_name] then
local data_table = require("Module:" .. code_to_module_map[code])[code]
discrepancy(modname,
"%s, the canonical name for the code <code>%s</code>, is wrong; it should be %s.",
canonical_name, code, data_table[1]
)
end
end
for code, canonical_name in pairs(code_to_name_module) do
check_code_and_name(code_to_name_modname, code, canonical_name)
end
for canonical_name, code in pairs(name_to_code_module) do
check_code_and_name(name_to_code_modname, code, canonical_name)
end
end
local function check_extraneous_extra_data(
data_modname, data_module, extra_data_modname, extra_data_module)
for code, _ in pairs(extra_data_module) do
if not data_module[code] then
discrepancy(extra_data_modname,
"The code <code>%s</code> is not found in [[Module:%s]], and should be removed from [[Module:%s]].",
code, data_modname, extra_data_modname
)
end
end
end
-- TODO: add collision check between the canonical names "X" and "X [Ll]anguage".
local function check_languages(frame)
local check_language_data_keys = check_data_keys(
1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts
"display_text", "generate_forms", "strip_diacritics", "sort_key",
"other_names", "aliases", "varieties", "ietf_subtag",
"type", "ancestors", "pseudo_families",
"wikimedia_codes", "wikipedia_article", "standard_chars",
"translit", "override_translit", "link_tr",
"dotted_dotless_i"
)
local function check_language(modname, code, data, extra_modname, extra_data)
local obj, code_modname, canonical_name = make_lang(code, data, true), get_data_module_name(code), data[1]
-- FIXME: this module should use the prefixed module name throughout.
code_modname = code_modname:gsub("^Module:", "")
if code_modname ~= modname then
if code_modname == "languages/data/2" then
discrepancy(modname,
"%s is a two-letter code, so should be moved to [[Module:%s]].",
link(obj), code_modname
)
elseif code_modname == "languages/data/exceptional" then
discrepancy(modname,
"%s is an exceptional code, as it does not consist of two or three lowercase letters, so should be moved to [[Module:%s]].",
link(obj), code_modname
)
else
discrepancy(modname,
"%s is a three-letter code beginning with '%s', so should be moved to [[Module:%s]].",
link(obj), sub(code, 1, 1), code_modname
)
end
end
check_language_data_keys(modname, obj, data)
if all_codes[code] then
discrepancy(modname,
"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
code, all_codes[code]
)
else
if not m_languages_codes[code] then
discrepancy("languages/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
all_codes[code] = modname
end
-- TODO: these checks should be consolidated with the proto-language checks in the family data,
-- since bad settings there affect the warnings here (e.g. xxx-pro assigned to yyy when xxx also
-- doesn't exist - a warning that xxx has "no family" would be misleading).
if sub(code, -4) == "-pro" then
local fam_code = sub(code, 1, -5)
local fam = get_language_by_code(fam_code, nil, true, true)
if not fam then
discrepancy(modname,
"'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, which doesn't exist.",
link(obj), dump(fam_code)
)
elseif not fam:hasType("family") then
discrepancy(modname,
"'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, but %s is not a family.",
link(obj), dump(fam_code), link(fam)
)
else
-- Reinstate this as low-priority once message priorities have been implemented.
-- local expected_name = "Proto-" .. fam:getCanonicalName()
-- if canonical_name ~= expected_name then
-- discrepancy(modname,
-- "%s does not have the expected name \"%s\", even though it is the proto-language of the %s.",
-- link(obj), expected_name, link(fam)
-- )
-- end
end
end
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif language_names[canonical_name] then
local canonical_lang = get_language_by_canonical_name(canonical_name)
if not canonical_lang then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
elseif data.main_code ~= canonical_lang:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), language_names[canonical_name]
)
end
else
if not m_languages_canonical_names[canonical_name] then
discrepancy("languages/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
language_names[canonical_name] = code
end
check_wikidata_item(modname, obj, data, 2)
if extra_data then
check_other_names_aliases_varieties(modname, obj, extra_data, canonical_name)
end
local lang_type = data.type
if lang_type and not (lang_type == "regular" or lang_type == "reconstructed" or lang_type == "appendix-constructed") then
discrepancy(modname,
"%s is of the invalid type <code>%s</code>.",
link(obj), lang_type
)
end
if data.aliases then
discrepancy(modname,
"%s has an <code>aliases</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].",
link(obj), modname, extra_modname
)
end
if data.varieties then
discrepancy(modname,
"%s has the <code>varieties</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].",
link(obj), modname, extra_modname
)
end
if data.other_names then
discrepancy(modname,
"%s has the <code>other_names</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].",
link(obj), modname, extra_modname
)
end
if not extra_data then
discrepancy(extra_modname,
"%s has data in [[Module:%s]], but does not have corresponding data in [[Module:%s]].",
link(obj), modname, extra_modname
)
--[[elseif extra_data.other_names then
discrepancy(extra_modname,
"%s has <code>other_names</code> key, but these should be changed to either <code>aliases</code> or <code>varieties</code>.",
link(obj)
)]]
end
local sc = data[4]
if sc then
if type(sc) == "string" then
sc = split(sc, "%s*,%s*", true)
end
if type(sc) == "table" then
if not sc[1] then
discrepancy(modname,
"%s has no scripts listed.",
link(obj)
)
else
for _, sccode in ipairs(sc) do
local cur_sc = m_scripts_data[sccode]
if not (cur_sc or sccode == "All" or sccode == "Hants") then
discrepancy(modname,
"%s lists the invalid script code <code>%s</code>.",
link(obj), dump(sccode)
)
--[[elseif not cur_sc.characters then
discrepancy(modname,
"%s lists the %s, which does not have any characters.",
link(obj), link(get_script_by_code(sccode))
)]]
end
nonempty_scripts[sccode] = true
end
end
else
discrepancy(modname,
"The %s field for %s must be a table or string.",
4, link(obj)
)
end
end
if data.ancestors then
check_ancestors(modname, obj, data)
end
if data.wikimedia_codes then
check_wikimedia_codes(modname, obj, data)
end
if data[3] then
local family = data[3]
if not m_families_data[family] then
discrepancy(modname,
"%s has the invalid family code <code>%s</code>.",
link(obj), dump(family)
)
end
nonempty_families[family] = true
end
check_replacements_data(modname, obj, data)
if data.standard_chars then
if type(data.standard_chars) == "table" then
local sccodes = {}
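-- Guard against a missing or malformed scripts field, which is reported separately above.
for _, sccode in ipairs(type(sc) == "table" and sc or {}) do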
sccodes[sccode] = true
end
for sccode in pairs(data.standard_chars) do
if not (sccodes[sccode] or sccode == 1) then
discrepancy(modname,
"The field %s in the <code>standard_chars</code> table for %s does not match any script for that language.",
sccode, link(obj)
)
end
end
elseif data.standard_chars and type(data.standard_chars) ~= "string" then
discrepancy(modname,
"The <code>standard_chars</code> field in the data table for %s must be a string or table.",
link(obj)
)
end
end
check_true_or_string_or_nil(modname, obj, data, "override_translit")
check_true_or_string_or_nil(modname, obj, data, "link_tr")
-- This doesn't apply any more since scripts can be script-wide translit methods.
-- if data.override_translit and not data.translit then
-- discrepancy(modname,
-- "%s has the <code>override_translit</code> field set, but no transliteration module",
-- link(obj)
-- )
-- end
end
local function check_module(modname)
local mod_data = load_data("Module:" .. modname)
local extra_modname = modname .. "/extra"
local extra_mod_data = load_data("Module:" .. extra_modname)
for code, data in pairs(mod_data) do
check_language(modname, code, data, extra_modname, extra_mod_data[code])
end
check_no_alias_codes(modname, mod_data)
check_no_alias_codes(extra_modname, extra_mod_data)
check_extraneous_extra_data(modname, mod_data, extra_modname, extra_mod_data)
end
-- Check two-letter codes
check_module(
"languages/data/2"
)
-- Check three-letter codes
for i = 0x61, 0x7A do -- a to z
check_module(
format("languages/data/3/%c", i)
)
end
-- Check exceptional codes
check_module(
"languages/data/exceptional"
)
-- These checks must be done while all_codes only contains language codes:
-- that is, after language data modules have been processed, but before
-- etymology languages, families, and scripts have.
check_code_to_name_and_name_to_code_maps(
"languages",
"a submodule of [[Module:languages]]",
all_codes, language_names,
"languages/code to canonical name", m_languages_codes,
"languages/canonical names", m_languages_canonical_names
)
-- Check [[Template:langname-lite]]
local modname = "Template:langname-lite"
for code, name in gmatch(remove_comments(new_title(modname):getContent()), "\n\t*|#*([^\n]+)=([^\n]*)") do
if #code > 1 and code ~= "default" then
for _, code in pairs(split(code, "|", true)) do
local lang = get_language_by_code(code, nil, true, true)
if match(name, "etymcode") then
local nonEtym_name = frame:preprocess(name)
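-- A missing language is reported by the "Language not present in data" branch below.
local nonEtym_real_name = lang and lang:getFullName()
if lang and nonEtym_name ~= nonEtym_real_name then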
discrepancy(modname,
"Code: <code>%s</code>. Saw name: %s. Expected name: %s.",
code, nonEtym_name, nonEtym_real_name
)
end
name = frame:preprocess(gsub(name, "{{{allow etym|}}}", "1"))
elseif match(name, "familycode") then
name = match(name, "familycode|(.-)|")
else
name = name
end
if not lang then
discrepancy(modname,
"Code: <code>%s</code>. Saw name: %s. Language not present in data.",
code, name
)
else
local real_name = lang:getCanonicalName()
if name ~= real_name then
discrepancy(modname,
"Code: <code>%s</code>. Saw name: %s. Expected name: %s.",
code, name, real_name
)
end
end
end
end
end
end
local function check_etym_languages()
local modname = "etymology languages/data"
local check_etymology_language_data_keys = check_data_keys(
1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts
"parent", "display_text", "generate_forms", "strip_diacritics", "sort_key",
"other_names", "aliases", "varieties", "ietf_subtag",
"type", "main_code", "ancestors", "pseudo_families",
"wikimedia_codes", "wikipedia_article", "standard_chars",
"translit", "override_translit", "link_tr",
"dotted_dotless_i"
)
local checked = {}
for code, data in pairs(m_etym_languages_data) do
local obj, canonical_name, parent = make_lang(code, data, true), data[1], data.parent
check_etymology_language_data_keys(modname, obj, data)
if all_codes[code] then
discrepancy(modname,
"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
code, all_codes[code]
)
else
if not m_etym_languages_codes[code] then
discrepancy("etymology languages/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
all_codes[code] = modname
end
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif language_names[canonical_name] then
local canonical_lang = get_language_by_canonical_name(canonical_name, nil, true)
if not canonical_lang then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
elseif data.main_code ~= canonical_lang:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), language_names[canonical_name]
)
end
else
if not m_etym_languages_canonical_names[canonical_name] then
discrepancy("etymology languages/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
etym_language_names[canonical_name] = code
end
check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if parent then
if type(parent) ~= "string" then
discrepancy(modname,
"%s has a parent code that is %s rather than a string.",
link(obj), parent == nil and "nil" or "a " .. type(parent)
)
elseif not (m_languages_data_all[parent] or m_etym_languages_data[parent]) then
discrepancy(modname,
"%s has the invalid parent code <code>%s</code>%s.",
link(obj), dump(parent), m_families_data[parent] and " (a family code)" or ""
)
end
nonempty_families[parent] = true
else
discrepancy(modname,
"%s has no parent code.",
link(obj)
)
end
if data.ancestors then
check_ancestors(modname, obj, data)
end
if data.wikimedia_codes then
check_wikimedia_codes(modname, obj, data)
end
if data[3] then
local family = data[3]
if not m_families_data[family] then
discrepancy(modname,
"%s has the invalid family code <code>%s</code>.",
link(obj), dump(family))
end
nonempty_families[family] = true
end
check_replacements_data(modname, obj, data)
check_wikidata_item(modname, obj, data, 2)
local stack = {}
while data do
if checked[code] then
break
elseif stack[code] then
local parent = data.parent
discrepancy(modname,
"%s has a cyclic parental relationship to %s",
link(make_lang(code, data, true)),
link(get_language_by_code(parent, nil, true))
)
break
end
stack[code] = true
code = data.parent
data = m_etym_languages_data[code]
end
for code in pairs(stack) do
checked[code] = true
end
end
check_no_alias_codes(modname, m_etym_languages_data)
check_code_to_name_and_name_to_code_maps(
"etymology languages",
"[[Module:etymology languages/data]]",
all_codes, etym_language_names,
"etymology languages/code to canonical name", m_etym_languages_codes,
"etymology languages/canonical names", m_etym_languages_canonical_names)
end
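-- Illustrative sketch (commented out, not part of the checks) of the
-- parent-chain walk used above to detect cyclic parents. The `parents` map
-- and the helper `has_cycle` are hypothetical names invented for this example.
--[[
local parents = {a = "b", b = "c", c = "a", x = "y"}
local function has_cycle(code)
	local seen = {}
	while code do
		if seen[code] then
			return true -- reached a code already on this chain
		end
		seen[code] = true
		code = parents[code]
	end
	return false -- chain ended at a code with no parent
end
assert(has_cycle("a") == true)
assert(has_cycle("x") == false)
]]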
-- TODO: add collision check between the canonical names "X" and "X [Ll]anguages".
local function check_families()
local modname = "families/data"
local check_family_data_keys = check_data_keys(
1, 2, 3, -- canonical name, Wikidata item, (parent) family
"type", "ietf_subtag",
"protoLanguage", "other_names", "aliases", "varieties", "pseudo_families", "categoryName"
)
local checked, double_check_if_empty = {["qfa-not"] = true}, {}
for code, data in pairs(m_families_data) do
local obj, canonical_name, family, protolang = make_family(code, data), data[1], data[3], data.protoLanguage
check_family_data_keys(modname, obj, data)
if all_codes[code] then
discrepancy(modname,
"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
code, all_codes[code]
)
else
if not m_families_codes[code] then
discrepancy("families/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
all_codes[code] = modname
end
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif family_names[canonical_name] then
local canonical_family = get_family_by_canonical_name(canonical_name)
if not canonical_family then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
elseif data.main_code ~= canonical_family:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), family_names[canonical_name]
)
end
else
if not m_families_canonical_names[canonical_name] then
discrepancy("families/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
family_names[canonical_name] = code
end
check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if family then
if family == code and code ~= "qfa-not" then
discrepancy(modname,
"%s has itself as its family.",
link(obj)
)
elseif not m_families_data[family] then
discrepancy(modname,
"%s has the invalid parent family code <code>%s</code>.",
link(obj), dump(family)
)
end
nonempty_families[family] = true
end
if protolang then
local protolang_obj = get_language_by_code(protolang, nil, true)
if not protolang_obj then
discrepancy(modname,
"%s has the invalid proto-language code <code>%s</code>.",
link(obj), dump(protolang)
)
elseif protolang == code .. "-pro" then
discrepancy(modname,
"%s has %s listed as its proto-language, which is redundant, since it is determined to be the proto-language automatically.",
link(obj), link(protolang_obj)
)
elseif sub(protolang, -4) == "-pro" then
discrepancy(modname,
"%s has %s listed as its proto-language, which is supposed to be the proto-language for the family <code>%s</code>.",
link(obj), link(protolang_obj), sub(protolang, 1, -5)
)
end
end
check_wikidata_item(modname, obj, data, 2)
-- Could be a false-positive if a child family occurs on a later
-- iteration, so set aside any that fail for a second check. This avoids
-- having to iterate through the whole list of families once
-- nonempty_families has been fully populated.
if not (nonempty_families[code] or allowed_empty_families[code]) then
double_check_if_empty[code] = obj
end
local stack = {}
while data do
if checked[code] then
break
elseif stack[code] then
local parent = data[3]
discrepancy(modname,
"%s has a cyclic familial relationship to %s",
link(make_family(code, data)),
link(get_family_by_code(parent))
)
break
end
stack[code] = true
code = data[3]
data = m_families_data[code]
end
for code in pairs(stack) do
checked[code] = true
end
end
-- Any families set aside as candidates for having no children are checked
-- again, now that nonempty_families is definitely complete.
for code, obj in next, double_check_if_empty do
if not (nonempty_families[code] or allowed_empty_families[code]) then
discrepancy(modname,
"%s has no child families or languages.",
link(obj)
)
end
end
check_no_alias_codes(modname, m_families_data)
check_code_to_name_and_name_to_code_maps(
"families",
"[[Module:families/data]]",
all_codes, family_names,
"families/code to canonical name", m_families_codes,
"families/canonical names", m_families_canonical_names)
end
-- TODO: add collision check between the canonical names "X" and "X [Ss]cript".
local function check_scripts()
local modname = "scripts/data"
local check_script_data_keys = check_data_keys(
1, 2, 3, -- canonical name, Wikidata item, writing systems
"other_names", "aliases", "varieties", "parent", "ietf_subtag", "type",
"wikipedia_article", "ranges", "characters", "spaces", "capitalized", "translit", "direction",
"character_category", "normalizationFixes", "sort_by_scraping",
"display_text", "sort_key", "strip_diacritics"
)
-- Just to satisfy requirements of check_code_to_name_and_name_to_code_maps.
local script_code_to_module_map = {}
for code, data in pairs(m_scripts_data) do
local obj, canonical_name = make_script(code, data), data[1]
if not m_scripts_codes[code] and #code == 4 then
discrepancy("scripts/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
check_script_data_keys(modname, obj, data)
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif script_names[canonical_name] then
local canonical_script = get_script_by_canonical_name(canonical_name)
if not canonical_script then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
--[[elseif data.main_code ~= canonical_script:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), script_names[canonical_name]
)]]
end
else
if not m_scripts_canonical_names[canonical_name] and #code == 4 then
discrepancy("scripts/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
script_names[canonical_name] = code
end
check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if not nonempty_scripts[code] then
discrepancy(modname,
"%s is not used by any language%s.",
link(obj), data.characters and ""
or " and has no characters listed for auto-detection")
--[[elseif not data.characters then
discrepancy(modname,
"%s has no characters listed for auto-detection.",
link(obj)
)--]]
end
if data.characters then
validate_pattern(data.characters, modname, obj, false)
end
check_wikidata_item(modname, obj, data, 2)
script_code_to_module_map[code] = modname
end
check_no_alias_codes(modname, m_scripts_data)
check_code_to_name_and_name_to_code_maps(
"scripts",
"a submodule of [[Module:scripts]]",
script_code_to_module_map, script_names,
"scripts/code to canonical name", m_scripts_codes,
"scripts/canonical names", m_scripts_canonical_names)
end
-- FIXME: this is quite messy.
local function check_wikidata_languages()
local data = json_decode(new_title("Module:languages/data/wikidata.json"):getContent())
local seen = {{}, {}, {}, [5] = {}}
for _, item in ipairs(data) do
local id = item.id
for k, v in pairs(item) do
if k ~= "id" then
local _seen = seen[k]
for _, code in ipairs(v) do
local _code = code[1]
local _type = type(_seen[_code])
if _type == "table" then
insert(_seen[_code], id)
elseif _type == "string" then
_seen[_code] = {_seen[_code], id}
else
_seen[_code] = id
end
end
end
end
end
local modname = "languages/data/wikidata.json"
for k, v in pairs(seen) do
for code, ids in pairs(v) do
if type(ids) == "table" then
local t = {}
for i, id in ipairs(ids) do
t[i] = format("<code>[[d:%s|%s]]</code>", id, id)
end
discrepancy(modname,
"<code>%s</code> is set as an ISO 639-%d code on multiple items: %s.",
code, k, list_to_text(t)
)
end
end
end
end
local function check_labels()
local check_label_data_keys = check_data_keys(
"display", "Wikipedia", "glossary",
"plain_categories", "topical_categories", "pos_categories", "regional_categories", "sense_categories",
"omit_preComma", "omit_postComma", "omit_preSpace",
"deprecated", "track"
)
local function check_label(modname, code, data)
local _type = type(data)
if _type == "table" then
check_label_data_keys(modname, code, data)
elseif _type ~= "string" then
discrepancy(modname,
"The data for the label <code>%s</code> is %s; only tables and strings are allowed.",
code, add_indefinite_article(_type)
)
end
end
for _, module in ipairs{"", "/regional", "/topical"} do
local modname = "Module:labels/data" .. module
module = require(modname)
for label, data in pairs(module) do
check_label(modname, label, data)
end
end
for code in pairs(m_languages_codes) do
local modname = "Module:labels/data/lang/" .. code
local module = safe_require(modname)
if module then
for label, data in pairs(module) do
check_label(modname, label, data)
end
end
end
end
local function check_zh_trad_simp()
local m_ts = require("Module:zh/data/ts")
local m_st = require("Module:zh/data/st")
local ruby = require("Module:ja-ruby").ruby_auto
local lang = get_language_by_code("zh")
local Hant = get_script_by_code("Hant")
local Hans = get_script_by_code("Hans")
local data = {[0] = m_st, m_ts}
local mod = {[0] = "st", "ts"}
local var = {[0] = "Simp.", "Trad."}
local sc = {[0] = Hans, Hant}
local function find_stable_loop(chars, other, j)
local display = ruby({["markup"] = "[" .. other .. "](" .. var[(j+1)%2] .. ")"})
display = language_link{term = other, alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"}
insert(chars, display)
if data[(j+1)%2][other] == other then
insert(chars, other)
return chars, 1
elseif not data[(j+1)%2][other] then
insert(chars, "not found")
return chars, 2
elseif data[j%2][data[(j+1)%2][other]] ~= other then
return find_stable_loop(chars, data[(j+1)%2][other], j + 1)
else
local display = ruby({["markup"] = "[" .. data[(j+1)%2][other] .. "](" .. var[j%2] .. ")"})
display = language_link{term = data[(j+1)%2][other], alt = display, lang = lang, sc = sc[j%2], tr = "-"}
insert(chars, display .. " (")
display = ruby({["markup"] = "[" .. data[j%2][data[(j+1)%2][other]] .. "](" .. var[(j+1)%2] .. ")"})
display = language_link{term = data[j%2][data[(j+1)%2][other]], alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"}
insert(chars, display .. " etc.)")
return chars, 3
end
end
for i = 0, 1, 1 do
for ch, other_ch in pairs(data[i]) do
if data[(i+1)%2][other_ch] ~= ch then
local chars, issue = {}
local display = ruby({["markup"] = "[" .. ch .. "](" .. var[i] .. ")"})
display = language_link{term = ch, alt = display, lang = lang, sc = sc[i], tr = "-"}
insert(chars, display)
chars, issue = find_stable_loop(chars, other_ch, i)
if issue == 1 or issue == 2 then
local sc_this, mod_this, j = {}
if match(chars[#chars-1], var[(i+1)%2]) then
j = 1
else
j = 0
end
mod_this = mod[(i+j)%2]
sc_this = {[0] = sc[(i+j)%2], sc[(i+j+1)%2]}
for k, ch in ipairs(chars) do
chars[k] = tag_text(ch, lang, sc_this[k%2], "term")
end
local modname = "zh/data/" .. mod_this
if issue == 1 then
discrepancy(modname,
"character references itself: %s",
concat(chars, " → ")
)
elseif issue == 2 then
discrepancy(modname,
"missing character: %s",
concat(chars, " → ")
)
end
elseif issue == 3 then
for j, ch in ipairs(chars) do
chars[j] = tag_text(ch, lang, sc[(i+j)%2], "term")
end
discrepancy("zh/data/" .. mod[i],
"possible mismatched character: %s",
concat(chars, " → ")
)
end
end
end
end
end
local function check_serialization(modname)
local serializers = {
["Hani-sortkey/data/serialized"] = "Hani-sortkey/serializer",
}
if not serializers[modname] then
return nil
end
local serializer = serializers[modname]
local current_data = require("Module:" .. serializer).main(true)
local stored_data = require("Module:" .. modname)
if current_data ~= stored_data then
discrepancy(modname,
"<strong><u>Important!</u> Serialized data is out of sync. Use [[Module:%s]] to update it. If you have made any changes to the underlying data, the serialized data <u>must</u> be updated before these changes will take effect.</strong>",
serializer
)
end
end
local find_code = require("Module:memoize")(function(message)
return match(message, "<code>([^<]+)</code>")
end)
local function compare_messages(message1, message2)
local code1, code2 = find_code(message1), find_code(message2)
if code1 and code2 then
return code1 < code2
else
return message1 < message2
end
end
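-- Illustrative sketch (commented out) of the ordering produced by
-- compare_messages: when both messages contain a <code>...</code> tag they
-- are ordered by the first tagged code, otherwise by the whole text. The
-- sample messages and the un-memoized helpers are assumptions for this example.
--[[
local function find_code_sketch(message)
	return string.match(message, "<code>([^<]+)</code>")
end
local function compare_sketch(message1, message2)
	local code1, code2 = find_code_sketch(message1), find_code_sketch(message2)
	if code1 and code2 then
		return code1 < code2
	end
	return message1 < message2
end
local list = {
	"The code <code>zz</code> has no canonical name specified.",
	"The code <code>ab</code> has no canonical name specified.",
}
table.sort(list, compare_sketch)
local first = list[1]
assert(find_code_sketch(first) == "ab")
]]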
-- Warning: cannot be called twice in the same module invocation because
-- some module-global variables are not reset between calls.
local function do_checks(frame, modules)
messages = setmetatable({}, messages_mt)
if modules["zh/data/ts"] or modules["zh/data/st"] then
check_zh_trad_simp()
end
check_languages(frame)
check_etym_languages()
-- families and scripts must be checked AFTER languages; languages checks fill out
-- the nonempty_families and nonempty_scripts tables, used for testing if a family/script
-- is ever used in the data
check_families()
check_scripts()
check_wikidata_languages()
if modules["labels/data"] then
check_labels()
end
for module in pairs(modules) do
check_serialization(module)
end
setmetatable(messages, nil)
for _, msglist in pairs(messages) do
msglist:sort(compare_messages)
end
local ret = messages
messages = nil
return ret
end
local function format_message(modname, msglist)
local header
if match(modname, "^Module:") or match(modname, "^Template:") then
header = "===[[" .. modname .. "]]==="
else
header = "===[[Module:" .. modname .. "]]==="
end
return header .. msglist:map(function(msg)
return "\n* " .. msg
end):concat()
end
function export.check_modules_t(frame)
local args = frame.args
local modules = list_to_set(args)
local ret = Array()
local messages = do_checks(frame, modules)
for _, module in ipairs(args) do
local msglist = messages[module]
if msglist then
ret:insert(format_message(module, msglist))
end
end
return ret:concat("\n")
end
function export.perform(frame)
local messages = do_checks(frame, {})
-- Format the messages
local ret = Array()
for modname, msglist in sorted_pairs(messages) do
ret:insert(format_message(modname, msglist))
end
-- Are there any messages?
if #ret == 0 then
return "<b class=\"success\">Glory to Arstotzka.</b>"
else
ret:insert(1, "<b class=\"warning\">Discrepancies detected:</b>")
return ret:concat("\n")
end
end
return export
gl7cr7quegesqg20o8ga2k3wf0lzrq0
5720748
5720745
2026-04-21T06:53:40Z
OctraBot
3198
5720748
Scribunto
text/plain
-- TODO:
-- ietf_subtag field used with a 2/3-letter language/family code except qaa-qtz, or a 4-letter script code.
-- Check against files containing up-to-date ISO data, to cross-check validity.
local export = {}
local mw = mw
local require = require
local string = string
local Array = require("Module:array")
local m_en_utilities = require("Module:en-utilities")
local m_etym_languages_canonical_names = require("Module:etymology languages/canonical names")
local m_etym_languages_codes = require("Module:etymology languages/code to canonical name")
local m_etym_languages_data = require("Module:etymology languages/data")
local m_families = require("Module:families")
local m_families_canonical_names = require("Module:families/canonical names")
local m_families_codes = require("Module:families/code to canonical name")
local m_families_data = require("Module:families/data")
local m_languages = require("Module:languages")
local m_languages_canonical_names = require("Module:languages/canonical names")
local m_languages_codes = require("Module:languages/code to canonical name")
local m_languages_data_all = require("Module:languages/data/all")
local m_load = require("Module:load")
local m_scripts = require("Module:scripts")
local m_scripts_canonical_names = require("Module:scripts/canonical names")
local m_scripts_codes = require("Module:scripts/code to canonical name")
local m_scripts_data = require("Module:scripts/data")
local m_str_utils = require("Module:string utilities")
local m_table = require("Module:table")
local add_indefinite_article = m_en_utilities.add_indefinite_article
local codepoint = m_str_utils.codepoint
local concat = table.concat
local dump = mw.dumpObject
local format = string.format
local gcodepoint = m_str_utils.gcodepoint
local get_data_module_name = m_languages.getDataModuleName
local get_family_by_code = m_families.getByCode
local get_family_by_canonical_name = m_families.getByCanonicalName
local get_indefinite_article = m_en_utilities.get_indefinite_article
local get_language_by_code = m_languages.getByCode
local get_language_by_canonical_name = m_languages.getByCanonicalName
local get_script_by_code = m_scripts.getByCode
local get_script_by_canonical_name = m_scripts.getByCanonicalName
local gmatch = string.gmatch
local gsub = string.gsub
local insert = table.insert
local ipairs = ipairs
local is_callable = require("Module:fun").is_callable
local is_positive_integer = require("Module:math").is_positive_integer
local is_known_language_tag = mw.language.isKnownLanguageTag
local isutf8 = mw.ustring.isutf8
local json_decode = mw.text.jsonDecode
local language_link = require("Module:links").language_link
local list_to_set = m_table.listToSet
local list_to_text = mw.text.listToText
local load_data = m_load.load_data
local log = mw.log
local main_loader = package.loaders[2]
local make_family = m_families.makeObject
local make_lang = m_languages.makeObject
local make_script = m_scripts.makeObject
local match = string.match
local new_title = mw.title.new
local next = next
local pairs = pairs
local pcall = pcall
local remove_comments = require("Module:string/removeComments")
local safe_require = m_load.safe_require
local sorted_pairs = m_table.sortedPairs
local split = m_str_utils.split
local sub = string.sub
local table_len = m_table.length
local tag_text = require("Module:script utilities").tag_text
local type = type
local umatch = m_str_utils.match
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local aliases = require("Module:languages/data").aliases
local messages
local function discrepancy(modname, ...)
local success, result = pcall(function(...)
messages[modname]:insert(format(...))
end, ...)
if not success then
log(result, ...)
end
end
local messages_mt = {}
function messages_mt:__index(k)
local val = Array()
self[k] = val
return val
end
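-- Illustrative sketch (commented out) of the auto-creating lookup above,
-- assuming plain Lua tables in place of Array from [[Module:array]]. The
-- names `mt` and `msgs` and the module names used as keys are hypothetical.
--[[
local mt = {}
function mt:__index(k)
	local val = {}
	self[k] = val -- store the new list so later lookups return the same one
	return val
end
local msgs = setmetatable({}, mt)
table.insert(msgs["languages/data/2"], "first message")
table.insert(msgs["languages/data/2"], "second message")
local list = msgs["languages/data/2"]
assert(#list == 2)
setmetatable(msgs, nil) -- do_checks removes the metatable before returning
assert(msgs["families/data"] == nil) -- lookups no longer create lists
]]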
local all_codes = {}
local language_names = {}
local etym_language_names = {}
local family_names = {}
local script_names = {}
local nonempty_families = {}
local allowed_empty_families = {tbq = true}
local nonempty_scripts = {}
local function link(obj, code_first)
return type(obj) == "string" and obj or
code_first and format("<code>%s</code> (%s)", obj:getCode(), obj:makeCategoryLink()) or
format("%s (<code>%s</code>)", obj:makeCategoryLink(), obj:getCode())
end
local function check_data_keys(...)
local valid_keys = Array(...):toSet()
return function (modname, obj, data)
local invalid_keys
for k in pairs(data) do
if not valid_keys[k] then
if not invalid_keys then
invalid_keys = Array(k)
else
invalid_keys:insert(k)
end
end
end
if invalid_keys == nil then
return
end
local plural = #invalid_keys ~= 1
discrepancy(modname,
"The data key%s %s for %s %s invalid.",
plural and "s" or "",
invalid_keys:map(function(key)
return "<code>" .. key .. "</code>"
end):concat(", "),
link(obj),
plural and "are" or "is"
)
end
end
-- Modification of isArray in [[Module:table]].
-- This assumes all keys are either integers or non-numbers.
-- If there are fractional numbers, the results might be incorrect.
-- For instance, find_gap{"a", "b", [0.5] = true} evaluates to 3, but there
-- isn't a gap at 3 in the sense of there being an integer key greater than 3.
local function find_gap(t, can_contain_non_number_keys)
local i = 0
for k in pairs(t) do
if not (can_contain_non_number_keys and type(k) ~= "number") then
i = i + 1
if t[i] == nil then
return i
end
end
end
end
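-- Illustrative sketch (commented out, not part of the checks) of what
-- find_gap returns for a few hand-made tables. The sample tables are
-- assumptions chosen only to show the behaviour described above.
--[[
assert(find_gap{"a", "b", "c"} == nil) -- contiguous array: no gap
assert(find_gap{"a", "b", nil, "d"} == 3) -- index 3 is missing
assert(find_gap{"a", x = true} == 2) -- the string key is counted, so index 2 looks missing
assert(find_gap({"a", x = true}, true) == nil) -- non-number keys are skipped
]]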
local function check_true_or_string_or_nil(modname, obj, data, key)
local field = data[key]
if not (field == nil or field == true or type(field) == "string") then
discrepancy(modname,
"%s has %s <code>%s</code> value that is not <code>nil</code>, <code>true</code> or a string: <code>%s</code>",
link(obj), get_indefinite_article(key), key, dump(data[key])
)
end
end
local function check_array(modname, obj, data, array_name, parent_array_name, can_contain_non_number_keys)
local parent_table = data
if parent_array_name then
parent_table = assert(data[parent_array_name], parent_array_name)
parent_array_name = "the <code>" .. parent_array_name .. "</code> field in "
else
parent_array_name = ""
end
local array_type = type(parent_table[array_name])
if array_type == "table" then
local gap = find_gap(parent_table[array_name], can_contain_non_number_keys)
if gap then
discrepancy(modname,
"The <code>%s</code> array in %sthe data table for %s has a gap at index %d.",
array_name,
parent_array_name,
link(obj),
gap
)
else
return true
end
else
discrepancy(modname,
"The <code>%s</code> field in %sthe data table for %s should be an array (table) but is %s.",
array_name,
parent_array_name,
link(obj),
array_type == "nil" and "nil" or "a " .. array_type
)
end
end
local function check_no_alias_codes(modname, mod_data)
local lookup, discrepancies = {}, {}
for k, v in pairs(mod_data) do
local check = lookup[v]
if check then
discrepancies[check] = discrepancies[check] or {"<code>" .. check .. "</code>"}
insert(discrepancies[check], "<code>" .. k .. "</code>")
else
lookup[v] = k
end
end
for _, v in pairs(discrepancies) do
discrepancy(modname,
"The codes %s are currently alias codes. Only one code should be used in the data.",
list_to_text(v, ", ", " and ")
)
end
end
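-- Illustrative sketch (commented out) of the identity test used above: two
-- codes count as aliases only when they point at the very same data table.
-- The codes "aa", "aa-alias" and "bb" are invented for this example.
--[[
local shared = {"Example"}
local mod_data = {aa = shared, ["aa-alias"] = shared, bb = {"Example"}}
local lookup, aliased = {}, 0
for code, data in pairs(mod_data) do
	if lookup[data] then
		aliased = aliased + 1 -- same table already seen under another code
	else
		lookup[data] = code
	end
end
assert(aliased == 1) -- "bb" has equal contents but is a distinct table
]]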
local function check_wikidata_item(modname, obj, data, key)
local data_item = data[key]
if data_item == nil or is_positive_integer(data_item) then
return
end
discrepancy(modname,
"%s has a Wikidata item ID that is not a positive integer: <code>%s</code>",
link(obj), dump(data_item)
)
end
local function check_name_field(modname, obj, data, canonical_name, data_key, allow_nested, allow_canonical_name_in_table)
local array = data[data_key]
if not array then
return
end
check_array(modname, obj, data, data_key, nil, true)
local names = {}
local function check_other_name(other_name)
if not allow_canonical_name_in_table and other_name == canonical_name then
discrepancy(modname,
"%s has its canonical name (<code>%s</code>) repeated in the table of <code>%s</code>.",
link(obj), dump(canonical_name), data_key
)
end
if names[other_name] then
discrepancy(modname,
"The name %s is found twice or more in the list of <code>%s</code> for %s.",
other_name, data_key, link(obj)
)
end
names[other_name] = true
end
for _, other_name in ipairs(array) do
if type(other_name) == "table" then
if not allow_nested then
discrepancy(modname,
"A nested table is found in the list of <code>%s</code> for %s, but isn't allowed.",
data_key, link(obj)
)
else
for _, on in ipairs(other_name) do
check_other_name(on)
end
end
else
check_other_name(other_name)
end
end
end
local function check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if data.other_names then
check_name_field(modname, obj, data, canonical_name, "other_names")
end
if data.aliases then
check_name_field(modname, obj, data, canonical_name, "aliases")
end
if data.varieties then
-- Sometimes a variety legitimately has the same name as the language as a whole, so allow that.
check_name_field(modname, obj, data, canonical_name, "varieties", "allow_nested", "allow_canonical_name_in_table")
end
end
local function validate_pattern(pattern, modname, obj, standard_chars)
if type(pattern) ~= "string" then
return discrepancy(modname,
"\"%s\", the %spattern for %s, is not a string.",
dump(pattern), standard_chars and "standard character " or "", link(obj)
)
elseif not isutf8(pattern) then
return discrepancy(modname,
"%s specifies a pattern for %scharacter detection which is not valid UTF-8: <code>%s</code>",
link(obj), standard_chars and "standard " or "", dump(pattern)
)
end
local ranges
for lower, higher in gmatch(pattern, "(.[\128-\191]*)%-%%?(.[\128-\191]*)") do
if codepoint(lower) >= codepoint(higher) then
ranges = ranges or Array()
insert(ranges, { lower, higher })
end
end
if ranges and ranges[1] then
local plural = #ranges ~= 1 and "s" or ""
discrepancy(modname,
"%s specifies an invalid pattern " ..
"for %scharacter detection: <code>%s</code>. The first codepoint%s " ..
"in the range%s %s %s must be less than the second.",
link(obj), standard_chars and "standard " or "", dump(pattern), plural, plural,
ranges:map(function(range)
return format(range[1] .. "-" .. range[2] .. " (U+%X, U+%X)", codepoint(range[1]), codepoint(range[2]))
end):concat(", "),
#ranges ~= 1 and "are" or "is"
)
end
local success, result = pcall(umatch, "", "[" .. pattern .. "]")
if not success then
discrepancy(modname,
"%s specifies an invalid pattern for %scharacter detection: <code>%s</code> (%s)",
link(obj), standard_chars and "standard " or "", dump(pattern), result
)
end
end
local remove_exceptions_addition = 0xF0000
local maximum_code_point = 0x10FFFF
local remove_exceptions_maximum_code_point = maximum_code_point - remove_exceptions_addition
-- TODO: check modules exist.
-- TODO: validate script codes and check inner tables.
local function check_replacement_data(modname, obj, data, key, func_name)
local replacements = data[key]
if replacements == nil then
return
end
local replacements_type = type(replacements)
if replacements_type == "string" then
local mod = main_loader("Module:" .. replacements)
if not mod then
discrepancy(modname,
"The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which does not exist.",
key, link(obj), replacements
)
else
mod = mod()
if not (type(mod) == "table" and is_callable(mod[func_name])) then
discrepancy(modname,
"The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which exists, but does not contain the expected function <code>%s()</code>.",
key, link(obj), replacements, func_name
)
end
end
return
elseif replacements_type ~= "table" then
discrepancy(modname,
"The <code>%s</code> field in the data table for %s must be a string or table, not a %s.",
key, link(obj), replacements_type
)
return
end
local from, to = replacements.from, replacements.to
if (from ~= nil) ~= (to ~= nil) then
discrepancy(modname,
"The <code>from</code> and <code>to</code> arrays in the <code>%s</code> table for %s are not both defined or both undefined.",
key, link(obj)
)
elseif from then
for _, k in ipairs {"from", "to"} do
check_array(modname, obj, data, k, key)
end
end
local remove_diacritics = replacements.remove_diacritics
if not (remove_diacritics == nil or type(remove_diacritics) == "string") then
discrepancy(modname,
"The <code>remove_diacritics</code> field in the <code>%s</code> table for %s must be a string.",
key, link(obj)
)
end
local remove_exceptions = replacements.remove_exceptions
if remove_exceptions then
if check_array(modname, obj, data, "remove_exceptions", key) then
for sequence_i, sequence in ipairs(remove_exceptions) do
local code_point_i = 0
for code_point in gcodepoint(sequence) do
code_point_i = code_point_i + 1
if code_point > remove_exceptions_maximum_code_point then
discrepancy(modname,
"Code point #%d (0x%04X) in field #%d of the <code>remove_exceptions</code> array for %s is over U+%04X.",
code_point_i, code_point, sequence_i, link(obj), remove_exceptions_maximum_code_point
)
end
end
end
end
end
if from and to and table_len(to) > table_len(from) then
discrepancy(modname,
"The <code>from</code> array in the <code>%s</code> table for %s must be longer than or the same length as the <code>to</code> array.",
key, link(obj)
)
end
end
local function check_replacements_data(modname, obj, data)
for _, replacement_spec in ipairs{
{"translit", "tr"},
{"display_text", "makeDisplayText"},
{"strip_diacritics", "stripDiacritics"},
{"sort_key", "makeSortKey"},
} do
check_replacement_data(modname, obj, data, unpack(replacement_spec))
end
end
local function has_ancestor(lang, code)
for _, anc in ipairs(lang:getAncestors()) do
if code == anc:getCode() or has_ancestor(anc, code) then
return true
end
end
end
local function get_default_ancestors(lang)
if lang:hasType("language", "etymology-only") then
local parent = lang:getParent()
if not has_ancestor(parent, lang:getCode()) then
return parent:getAncestorCodes()
end
end
local fam_code, def_anc = lang:getFamilyCode()
while fam_code and fam_code ~= "qfa-not" do
local fam = m_families_data[fam_code]
def_anc = fam.protoLanguage or
m_languages_data_all[fam_code .. "-pro"] and fam_code .. "-pro" or
m_etym_languages_data[fam_code .. "-pro"] and fam_code .. "-pro"
if def_anc and def_anc ~= lang:getCode() then
return {def_anc}
end
fam_code = fam[3]
end
end
local function iterate_ancestor(obj, modname, anc_code)
local anc = get_language_by_code(anc_code, nil, true)
if not anc then
discrepancy(modname,
"%s lists the invalid language code <code>%s</code> as its ancestor.",
link(obj), dump(anc_code)
)
return
end
local anc_fam = anc:getFamily()
if not anc_fam then
discrepancy(modname,
"%s has no family.",
link(anc)
)
return
end
local anc_fam_code = anc_fam:getCode()
local def_ancs = get_default_ancestors(obj)
if def_ancs then
for _, def_anc in ipairs(def_ancs) do
def_anc = get_language_by_code(def_anc, nil, true)
if def_anc and (
anc_code == def_anc:getCode() or
has_ancestor(def_anc, anc_code) or
def_anc:hasParent(anc_code) and not has_ancestor(anc, def_anc:getCode())
) then
discrepancy(modname,
"%s has the ancestor %s listed in its ancestor field, which is redundant, since it is determined to be ancestral automatically.",
link(obj), link(anc)
)
end
end
end
if not obj:inFamily(anc_fam_code) then
discrepancy(modname,
"%s has %s set as an ancestor, but is not in the %s.",
link(obj), link(anc), link(anc_fam)
)
end
local fam, proto = obj
repeat
fam = fam:getFamily()
proto = fam and fam:getProtoLanguage()
until proto or not fam or fam:getCode() == "qfa-not"
if proto and not (
proto:getCode() == anc:getCode() or
proto:hasAncestor(anc:getCode()) or
anc:hasAncestor(proto:getCode())
) then
local fam = obj:getFamily()
discrepancy(modname,
"%s is in the %s and has %s set as an ancestor, but it is not possible to form an ancestral chain between them.",
link(obj), link(fam), link(anc)
)
end
end
local function check_ancestors(modname, obj, data)
local ancestors = data.ancestors
if ancestors == nil then
return
end
local ancestors_type = type(ancestors)
if ancestors_type == "string" then
ancestors = split(ancestors, ",", true, true)
elseif ancestors_type ~= "table" then
discrepancy(modname,
"The <code>ancestors</code> field in the data table for %s must be a string or table, not a %s.",
link(obj), ancestors_type
)
-- Nothing to iterate over if the field is neither a string nor a table.
return
end
for _, anc in ipairs(ancestors) do
iterate_ancestor(obj, modname, anc)
end
end
local function check_wikimedia_codes(modname, obj, data)
local wikimedia_codes = data.wikimedia_codes
if wikimedia_codes == nil then
return
end
local wikimedia_codes_type = type(wikimedia_codes)
if wikimedia_codes_type == "string" then
wikimedia_codes = split(wikimedia_codes, ",", true, true)
elseif wikimedia_codes_type ~= "table" then
discrepancy(modname,
"The <code>wikimedia_codes</code> field in the data table for %s must be a string or table, not a %s.",
link(obj), wikimedia_codes_type
)
-- Nothing to iterate over if the field is neither a string nor a table.
return
end
for _, code in ipairs(wikimedia_codes) do
if not is_known_language_tag(code) then
discrepancy(modname,
"%s lists the invalid Wikimedia code <code>%s</code> in the <code>wikimedia_codes</code> field.",
link(obj), dump(code)
)
end
end
end
local function check_code_to_name_and_name_to_code_maps(
source_module_type,
source_module_description,
code_to_module_map, name_to_code_map,
code_to_name_modname, code_to_name_module,
name_to_code_modname, name_to_code_module
)
local function check_code_and_name(modname, code, canonical_name)
-- Check the code is in code_to_module_map and that it didn't originate from the wrong data module.
local check_mod = code_to_module_map[code] or code_to_module_map[aliases[code]]
if not (check_mod and match(check_mod, "^" .. source_module_type .. "/data")) then
if not name_to_code_map[canonical_name] then
discrepancy(modname,
"The code <code>%s</code> and the canonical name %s should be removed; they are not found in %s.",
code, canonical_name, source_module_description
)
else
discrepancy(modname,
"<code>%s</code>, the code for the canonical name %s, is wrong; it should be <code>%s</code>.",
code, canonical_name, name_to_code_map[canonical_name]
)
end
elseif not name_to_code_map[canonical_name] then
local data_table = require("Module:" .. code_to_module_map[code])[code]
discrepancy(modname,
"%s, the canonical name for the code <code>%s</code>, is wrong; it should be %s.",
canonical_name, code, data_table[1]
)
end
end
for code, canonical_name in pairs(code_to_name_module) do
check_code_and_name(code_to_name_modname, code, canonical_name)
end
for canonical_name, code in pairs(name_to_code_module) do
check_code_and_name(name_to_code_modname, code, canonical_name)
end
end
local function check_extraneous_extra_data(
data_modname, data_module, extra_data_modname, extra_data_module)
for code, _ in pairs(extra_data_module) do
if not data_module[code] then
discrepancy(extra_data_modname,
"The code <code>%s</code> is not found in [[Module:%s]], and should be removed from [[Module:%s]].",
code, data_modname, extra_data_modname
)
end
end
end
-- TODO: add collision check between the canonical names "X" and "X [Ll]anguage".
local function check_languages(frame)
local check_language_data_keys = check_data_keys(
1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts
"display_text", "generate_forms", "strip_diacritics", "sort_key",
"other_names", "aliases", "varieties", "ietf_subtag",
"type", "ancestors", "pseudo_families",
"wikimedia_codes", "wikipedia_article", "standard_chars",
"translit", "override_translit", "link_tr",
"dotted_dotless_i"
)
local function check_language(modname, code, data, extra_modname, extra_data)
local obj, code_modname, canonical_name = make_lang(code, data, true), get_data_module_name(code), data[1]
-- FIXME: this module should use the prefixed module name throughout.
code_modname = code_modname:gsub("^Module:", "")
if code_modname ~= modname then
if code_modname == "languages/data/2" then
discrepancy(modname,
"%s is a two-letter code, so should be moved to [[Module:%s]].",
link(obj), code_modname
)
elseif code_modname == "languages/data/exceptional" then
discrepancy(modname,
"%s is an exceptional code, as it does not consist of two or three lowercase letters, so should be moved to [[Module:%s]].",
link(obj), code_modname
)
else
discrepancy(modname,
"%s is a three-letter code beginning with '%s', so should be moved to [[Module:%s]].",
link(obj), sub(code, 1, 1), code_modname
)
end
end
check_language_data_keys(modname, obj, data)
if all_codes[code] then
discrepancy(modname,
"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
code, all_codes[code]
)
else
if not m_languages_codes[code] then
discrepancy("languages/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
all_codes[code] = modname
end
-- TODO: these checks should be consolidated with the proto-language checks in the family data,
-- since bad settings there affect the warnings here (e.g. xxx-pro assigned to yyy when xxx also
-- doesn't exist - a warning that xxx has "no family" would be misleading).
if sub(code, -4) == "-pro" then
local fam_code = sub(code, 1, -5)
local fam = get_language_by_code(fam_code, nil, true, true)
if not fam then
discrepancy(modname,
"'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, which doesn't exist.",
link(obj), dump(fam_code)
)
elseif not fam:hasType("family") then
discrepancy(modname,
"'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, but %s is not a family.",
link(obj), dump(fam_code), link(fam)
)
else
-- Reinstate this as low-priority once message priorities have been implemented.
-- local expected_name = "Proto-" .. fam:getCanonicalName()
-- if canonical_name ~= expected_name then
-- discrepancy(modname,
-- "%s does not have the expected name \"%s\", even though it is the proto-language of the %s.",
-- link(obj), expected_name, link(fam)
-- )
-- end
end
end
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif language_names[canonical_name] then
local canonical_lang = get_language_by_canonical_name(canonical_name)
if not canonical_lang then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
elseif data.main_code ~= canonical_lang:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), language_names[canonical_name]
)
end
else
if not m_languages_canonical_names[canonical_name] then
discrepancy("languages/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
language_names[canonical_name] = code
end
check_wikidata_item(modname, obj, data, 2)
if extra_data then
check_other_names_aliases_varieties(modname, obj, extra_data, canonical_name)
end
local lang_type = data.type
if lang_type and not (lang_type == "regular" or lang_type == "reconstructed" or lang_type == "appendix-constructed") then
discrepancy(modname,
"%s is of the invalid type <code>%s</code>.",
link(obj), lang_type
)
end
if data.aliases then
discrepancy(modname,
"%s has an <code>aliases</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].",
link(obj), modname, extra_modname
)
end
if data.varieties then
discrepancy(modname,
"%s has the <code>varieties</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].",
link(obj), modname, extra_modname
)
end
if data.other_names then
discrepancy(modname,
"%s has the <code>other_names</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].",
link(obj), modname, extra_modname
)
end
if not extra_data then
discrepancy(extra_modname,
"%s has data in [[Module:%s]], but does not have corresponding data in [[Module:%s]].",
link(obj), modname, extra_modname
)
--[[elseif extra_data.other_names then
discrepancy(extra_modname,
"%s has <code>other_names</code> key, but these should be changed to either <code>aliases</code> or <code>varieties</code>.",
link(obj)
)]]
end
local sc = data[4]
if sc then
if type(sc) == "string" then
sc = split(sc, "%s*,%s*", true)
end
if type(sc) == "table" then
if not sc[1] then
discrepancy(modname,
"%s has no scripts listed.",
link(obj)
)
else
for _, sccode in ipairs(sc) do
local cur_sc = m_scripts_data[sccode]
if not (cur_sc or sccode == "All" or sccode == "Hants") then
discrepancy(modname,
"%s lists the invalid script code <code>%s</code>.",
link(obj), dump(sccode)
)
--[[elseif not cur_sc.characters then
discrepancy(modname,
"%s lists the %s, which does not have any characters.",
link(obj), link(get_script_by_code(sccode))
)]]
end
nonempty_scripts[sccode] = true
end
end
else
discrepancy(modname,
"The %s field for %s must be a table or string.",
4, link(obj)
)
end
end
if data.ancestors then
check_ancestors(modname, obj, data)
end
if data.wikimedia_codes then
check_wikimedia_codes(modname, obj, data)
end
if data[3] then
local family = data[3]
if not m_families_data[family] then
discrepancy(modname,
"%s has the invalid family code <code>%s</code>.",
link(obj), dump(family)
)
end
nonempty_families[family] = true
end
check_replacements_data(modname, obj, data)
if data.standard_chars then
if type(data.standard_chars) == "table" then
local sccodes = {}
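-- sc is nil when no scripts field is set; guard so that ipairs() does not throw.
for _, sccode in ipairs(type(sc) == "table" and sc or {}) do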
sccodes[sccode] = true
end
for sccode in pairs(data.standard_chars) do
if not (sccodes[sccode] or sccode == 1) then
discrepancy(modname,
"The field %s in the <code>standard_chars</code> table for %s does not match any script for that language.",
sccode, link(obj)
)
end
end
elseif type(data.standard_chars) ~= "string" then
discrepancy(modname,
"The <code>standard_chars</code> field in the data table for %s must be a string or table.",
link(obj)
)
end
end
check_true_or_string_or_nil(modname, obj, data, "override_translit")
check_true_or_string_or_nil(modname, obj, data, "link_tr")
-- This doesn't apply any more since scripts can be script-wide translit methods.
-- if data.override_translit and not data.translit then
-- discrepancy(modname,
-- "%s has the <code>override_translit</code> field set, but no transliteration module",
-- link(obj)
-- )
-- end
end
local function check_module(modname)
local mod_data = load_data("Module:" .. modname)
local extra_modname = modname .. "/extra"
local extra_mod_data = load_data("Module:" .. extra_modname)
for code, data in pairs(mod_data) do
check_language(modname, code, data, extra_modname, extra_mod_data[code])
end
check_no_alias_codes(modname, mod_data)
check_no_alias_codes(extra_modname, extra_mod_data)
check_extraneous_extra_data(modname, mod_data, extra_modname, extra_mod_data)
end
-- Check two-letter codes
check_module(
"languages/data/2"
)
-- Check three-letter codes
for i = 0x61, 0x7A do -- a to z
check_module(
format("languages/data/3/%c", i)
)
end
-- Check exceptional codes
check_module(
"languages/data/exceptional"
)
-- These checks must be done while all_codes only contains language codes:
-- that is, after language data modules have been processed, but before
-- etymology languages, families, and scripts have.
check_code_to_name_and_name_to_code_maps(
"languages",
"a submodule of [[Module:languages]]",
all_codes, language_names,
"languages/code to canonical name", m_languages_codes,
"languages/canonical names", m_languages_canonical_names
)
--[===[ Disabled: langname-lite is not checked because we do not use it.
-- Check [[Template:langname-lite]]
local modname = "Template:langname-lite"
for code, name in gmatch(remove_comments(new_title(modname):getContent()), "\n\t*|#*([^\n]+)=([^\n]*)") do
if #code > 1 and code ~= "default" then
for _, code in pairs(split(code, "|", true)) do
local lang = get_language_by_code(code, nil, true, true)
if match(name, "etymcode") then
local nonEtym_name = frame:preprocess(name)
local nonEtym_real_name = lang:getFullName()
if nonEtym_name ~= nonEtym_real_name then
discrepancy(modname,
"Code: <code>%s</code>. Saw name: %s. Expected name: %s.",
code, nonEtym_name, nonEtym_real_name
)
end
name = frame:preprocess(gsub(name, "{{{allow etym|}}}", "1"))
elseif match(name, "familycode") then
name = match(name, "familycode|(.-)|")
else
name = name
end
if not lang then
discrepancy(modname,
"Code: <code>%s</code>. Saw name: %s. Language not present in data.",
code, name
)
else
local real_name = lang:getCanonicalName()
if name ~= real_name then
discrepancy(modname,
"Code: <code>%s</code>. Saw name: %s. Expected name: %s.",
code, name, real_name
)
end
end
end
end
end
--]===]
end
local function check_etym_languages()
local modname = "etymology languages/data"
local check_etymology_language_data_keys = check_data_keys(
1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts
"parent", "display_text", "generate_forms", "strip_diacritics", "sort_key",
"other_names", "aliases", "varieties", "ietf_subtag",
"type", "main_code", "ancestors", "pseudo_families",
"wikimedia_codes", "wikipedia_article", "standard_chars",
"translit", "override_translit", "link_tr",
"dotted_dotless_i"
)
local checked = {}
for code, data in pairs(m_etym_languages_data) do
local obj, canonical_name, parent = make_lang(code, data, true), data[1], data.parent
check_etymology_language_data_keys(modname, obj, data)
if all_codes[code] then
discrepancy(modname,
"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
code, all_codes[code]
)
else
if not m_etym_languages_codes[code] then
discrepancy("etymology languages/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
all_codes[code] = modname
end
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif language_names[canonical_name] then
local canonical_lang = get_language_by_canonical_name(canonical_name, nil, true)
if not canonical_lang then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
elseif data.main_code ~= canonical_lang:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), language_names[canonical_name]
)
end
else
if not m_etym_languages_canonical_names[canonical_name] then
discrepancy("etymology languages/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
etym_language_names[canonical_name] = code
end
check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if parent then
if type(parent) ~= "string" then
discrepancy(modname,
"%s has a parent code that is %s rather than a string.",
link(obj), parent == nil and "nil" or "a " .. type(parent)
)
elseif not (m_languages_data_all[parent] or m_etym_languages_data[parent]) then
discrepancy(modname,
"%s has the invalid parent code <code>%s</code>%s.",
link(obj), dump(parent), m_families_data[parent] and " (a family code)" or ""
)
end
nonempty_families[parent] = true
else
discrepancy(modname,
"%s has no parent code.",
link(obj)
)
end
if data.ancestors then
check_ancestors(modname, obj, data)
end
if data.wikimedia_codes then
check_wikimedia_codes(modname, obj, data)
end
if data[3] then
local family = data[3]
if not m_families_data[family] then
discrepancy(modname,
"%s has the invalid family code <code>%s</code>.",
link(obj), dump(family))
end
nonempty_families[family] = true
end
check_replacements_data(modname, obj, data)
check_wikidata_item(modname, obj, data, 2)
local stack = {}
while data do
if checked[code] then
break
elseif stack[code] then
local parent = data.parent
discrepancy(modname,
"%s has a cyclic parental relationship to %s",
link(make_lang(code, data, true)),
link(get_language_by_code(parent, nil, true))
)
break
end
stack[code] = true
code = data.parent
data = m_etym_languages_data[code]
end
for code in pairs(stack) do
checked[code] = true
end
end
check_no_alias_codes(modname, m_etym_languages_data)
check_code_to_name_and_name_to_code_maps(
"etymology languages",
"[[Module:etymology languages/data]]",
all_codes, etym_language_names,
"etymology languages/code to canonical name", m_etym_languages_codes,
"etymology languages/canonical names", m_etym_languages_canonical_names)
end
-- TODO: add collision check between the canonical names "X" and "X [Ll]anguages".
local function check_families()
local modname = "families/data"
local check_family_data_keys = check_data_keys(
1, 2, 3, -- canonical name, Wikidata item, (parent) family
"type", "ietf_subtag",
"protoLanguage", "other_names", "aliases", "varieties", "pseudo_families", "categoryName"
)
local checked, double_check_if_empty = {["qfa-not"] = true}, {}
for code, data in pairs(m_families_data) do
local obj, canonical_name, family, protolang = make_family(code, data), data[1], data[3], data.protoLanguage
check_family_data_keys(modname, obj, data)
if all_codes[code] then
discrepancy(modname,
"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
code, all_codes[code]
)
else
if not m_families_codes[code] then
discrepancy("families/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
all_codes[code] = modname
end
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif family_names[canonical_name] then
local canonical_family = get_family_by_canonical_name(canonical_name)
if not canonical_family then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
elseif data.main_code ~= canonical_family:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), family_names[canonical_name]
)
end
else
if not m_families_canonical_names[canonical_name] then
discrepancy("families/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
family_names[canonical_name] = code
end
check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if family then
if family == code and code ~= "qfa-not" then
discrepancy(modname,
"%s has itself as its family.",
link(obj)
)
elseif not m_families_data[family] then
discrepancy(modname,
"%s has the invalid parent family code <code>%s</code>.",
link(obj), dump(family)
)
end
nonempty_families[family] = true
end
if protolang then
local protolang_obj = get_language_by_code(protolang, nil, true)
if not protolang_obj then
discrepancy(modname,
"%s has the invalid proto-language code <code>%s</code>.",
link(obj), dump(protolang)
)
elseif protolang == code .. "-pro" then
discrepancy(modname,
"%s has %s listed as its proto-language, which is redundant, since it is determined to be the proto-language automatically.",
link(obj), link(protolang_obj)
)
elseif sub(protolang, -4) == "-pro" then
discrepancy(modname,
"%s has %s listed as its proto-language, which is supposed to be the proto-language for the family <code>%s</code>.",
link(obj), link(protolang_obj), sub(protolang, 1, -5)
)
end
end
check_wikidata_item(modname, obj, data, 2)
-- Could be a false-positive if a child family occurs on a later
-- iteration, so set aside any that fail for a second check. This avoids
-- having to iterate through the whole list of families once
-- nonempty_families has been fully populated.
if not (nonempty_families[code] or allowed_empty_families[code]) then
double_check_if_empty[code] = obj
end
local stack = {}
while data do
if checked[code] then
break
elseif stack[code] then
local parent = data[3]
discrepancy(modname,
"%s has a cyclic familial relationship to %s",
link(make_family(code, data)),
link(get_family_by_code(parent))
)
break
end
stack[code] = true
code = data[3]
data = m_families_data[code]
end
for code in pairs(stack) do
checked[code] = true
end
end
-- Any families set aside as candidates for having no children are checked
-- again, now that nonempty_families is definitely complete.
for code, obj in next, double_check_if_empty do
if not (nonempty_families[code] or allowed_empty_families[code]) then
discrepancy(modname,
"%s has no child families or languages.",
link(obj)
)
end
end
check_no_alias_codes(modname, m_families_data)
check_code_to_name_and_name_to_code_maps(
"families",
"[[Module:families/data]]",
all_codes, family_names,
"families/code to canonical name", m_families_codes,
"families/canonical names", m_families_canonical_names)
end
-- TODO: add collision check between the canonical names "X" and "X [Ss]cript".
local function check_scripts()
local modname = "scripts/data"
local check_script_data_keys = check_data_keys(
1, 2, 3, -- canonical name, Wikidata item, writing systems
"other_names", "aliases", "varieties", "parent", "ietf_subtag", "type",
"wikipedia_article", "ranges", "characters", "spaces", "capitalized", "translit", "direction",
"character_category", "normalizationFixes", "sort_by_scraping",
"display_text", "sort_key", "strip_diacritics"
)
-- Just to satisfy requirements of check_code_to_name_and_name_to_code_maps.
local script_code_to_module_map = {}
for code, data in pairs(m_scripts_data) do
local obj, canonical_name = make_script(code, data), data[1]
if not m_scripts_codes[code] and #code == 4 then
discrepancy("scripts/code to canonical name",
"The code %s is missing.",
link(obj, true)
)
end
check_script_data_keys(modname, obj, data)
if not canonical_name then
discrepancy(modname,
"The code <code>%s</code> has no canonical name specified.",
code
)
elseif script_names[canonical_name] then
local canonical_script = get_script_by_canonical_name(canonical_name)
if not canonical_script then
discrepancy(modname,
"%s has a canonical name that cannot be looked up.",
link(obj)
)
--[[elseif data.main_code ~= canonical_script:getCode() then
discrepancy(modname,
"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
link(obj), script_names[canonical_name]
)]]
end
else
if not m_scripts_canonical_names[canonical_name] and #code == 4 then
discrepancy("scripts/canonical names",
"The canonical name %s is missing.",
link(obj)
)
end
script_names[canonical_name] = code
end
check_other_names_aliases_varieties(modname, obj, data, canonical_name)
if not nonempty_scripts[code] then
discrepancy(modname,
"%s is not used by any language%s.",
link(obj), data.characters and ""
or " and has no characters listed for auto-detection")
--[[elseif not data.characters then
discrepancy(modname,
"%s has no characters listed for auto-detection.",
link(obj)
)--]]
end
if data.characters then
validate_pattern(data.characters, modname, obj, false)
end
check_wikidata_item(modname, obj, data, 2)
script_code_to_module_map[code] = modname
end
check_no_alias_codes(modname, m_scripts_data)
check_code_to_name_and_name_to_code_maps(
"scripts",
"a submodule of [[Module:scripts]]",
script_code_to_module_map, script_names,
"scripts/code to canonical name", m_scripts_codes,
"scripts/canonical names", m_scripts_canonical_names)
end
-- FIXME: this is quite messy.
local function check_wikidata_languages()
local data = json_decode(new_title("Module:languages/data/wikidata.json"):getContent())
local seen = {{}, {}, {}, [5] = {}}
for _, item in ipairs(data) do
local id = item.id
for k, v in pairs(item) do
if k ~= "id" then
local _seen = seen[k]
for _, code in ipairs(v) do
local _code = code[1]
local _type = type(_seen[_code])
if _type == "table" then
insert(_seen[_code], id)
elseif _type == "string" then
_seen[_code] = {_seen[_code], id}
else
_seen[_code] = id
end
end
end
end
end
local modname = "languages/data/wikidata.json"
for k, v in pairs(seen) do
for code, ids in pairs(v) do
if type(ids) == "table" then
local t = {}
for i, id in ipairs(ids) do
t[i] = format("<code>[[d:%s|%s]]</code>", id, id)
end
discrepancy(modname,
"<code>%s</code> is set as an ISO 639-%d code on multiple items: %s.",
code, k, list_to_text(t)
)
end
end
end
end
local function check_labels()
local check_label_data_keys = check_data_keys(
"display", "Wikipedia", "glossary",
"plain_categories", "topical_categories", "pos_categories", "regional_categories", "sense_categories",
"omit_preComma", "omit_postComma", "omit_preSpace",
"deprecated", "track"
)
local function check_label(modname, code, data)
local _type = type(data)
if _type == "table" then
check_label_data_keys(modname, code, data)
elseif _type ~= "string" then
discrepancy(modname,
"The data for the label <code>%s</code> is %s; only tables and strings are allowed.",
code, add_indefinite_article(_type)
)
end
end
for _, module in ipairs{"", "/regional", "/topical"} do
local modname = "Module:labels/data" .. module
module = require(modname)
for label, data in pairs(module) do
check_label(modname, label, data)
end
end
for code in pairs(m_languages_codes) do
local modname = "Module:labels/data/lang/" .. code
local module = safe_require(modname)
if module then
for label, data in pairs(module) do
check_label(modname, label, data)
end
end
end
end
local function check_zh_trad_simp()
local m_ts = require("Module:zh/data/ts")
local m_st = require("Module:zh/data/st")
local ruby = require("Module:ja-ruby").ruby_auto
local lang = get_language_by_code("zh")
local Hant = get_script_by_code("Hant")
local Hans = get_script_by_code("Hans")
local data = {[0] = m_st, m_ts}
local mod = {[0] = "st", "ts"}
local var = {[0] = "Simp.", "Trad."}
local sc = {[0] = Hans, Hant}
local function find_stable_loop(chars, other, j)
local display = ruby({["markup"] = "[" .. other .. "](" .. var[(j+1)%2] .. ")"})
display = language_link{term = other, alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"}
insert(chars, display)
if data[(j+1)%2][other] == other then
insert(chars, other)
return chars, 1
elseif not data[(j+1)%2][other] then
insert(chars, "not found")
return chars, 2
elseif data[j%2][data[(j+1)%2][other]] ~= other then
return find_stable_loop(chars, data[(j+1)%2][other], j + 1)
else
local display = ruby({["markup"] = "[" .. data[(j+1)%2][other] .. "](" .. var[j%2] .. ")"})
display = language_link{term = data[(j+1)%2][other], alt = display, lang = lang, sc = sc[j%2], tr = "-"}
insert(chars, display .. " (")
display = ruby({["markup"] = "[" .. data[j%2][data[(j+1)%2][other]] .. "](" .. var[(j+1)%2] .. ")"})
display = language_link{term = data[j%2][data[(j+1)%2][other]], alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"}
insert(chars, display .. " etc.)")
return chars, 3
end
return chars
end
for i = 0, 1, 1 do
for ch, other_ch in pairs(data[i]) do
if data[(i+1)%2][other_ch] ~= ch then
local chars, issue = {}
local display = ruby({["markup"] = "[" .. ch .. "](" .. var[i] .. ")"})
display = language_link{term = ch, alt = display, lang = lang, sc = sc[i], tr = "-"}
insert(chars, display)
chars, issue = find_stable_loop(chars, other_ch, i)
if issue == 1 or issue == 2 then
local sc_this, mod_this, j = {}
if match(chars[#chars-1], var[(i+1)%2]) then
j = 1
else
j = 0
end
mod_this = mod[(i+j)%2]
sc_this = {[0] = sc[(i+j)%2], sc[(i+j+1)%2]}
for k, ch in ipairs(chars) do
chars[k] = tag_text(ch, lang, sc_this[k%2], "term")
end
local modname = "zh/data/" .. mod_this
if issue == 1 then
discrepancy(modname,
"character references itself: %s",
concat(chars, " → ")
)
elseif issue == 2 then
discrepancy(modname,
"missing character: %s",
concat(chars, " → ")
)
end
elseif issue == 3 then
for j, ch in ipairs(chars) do
chars[j] = tag_text(ch, lang, sc[(i+j)%2], "term")
end
discrepancy("zh/data/" .. mod[i],
"possible mismatched character: %s",
concat(chars, " → ")
)
end
end
end
end
end
local function check_serialization(modname)
local serializers = {
["Hani-sortkey/data/serialized"] = "Hani-sortkey/serializer",
}
if not serializers[modname] then
return nil
end
local serializer = serializers[modname]
local current_data = require("Module:" .. serializer).main(true)
local stored_data = require("Module:" .. modname)
if current_data ~= stored_data then
discrepancy(modname,
"<strong><u>Important!</u> Serialized data is out of sync. Use [[Module:%s]] to update it. If you have made any changes to the underlying data, the serialized data <u>must</u> be updated before these changes will take effect.</strong>",
serializer
)
end
end
local find_code = require("Module:memoize")(function(message)
return match(message, "<code>([^<]+)</code>")
end)
local function compare_messages(message1, message2)
local code1, code2 = find_code(message1), find_code(message2)
if code1 and code2 then
return code1 < code2
else
return message1 < message2
end
end
-- Warning: cannot be called twice in the same module invocation because
-- some module-global variables are not reset between calls.
local function do_checks(frame, modules)
messages = setmetatable({}, messages_mt)
if modules["zh/data/ts"] or modules["zh/data/st"] then
check_zh_trad_simp()
end
check_languages(frame)
check_etym_languages()
-- families and scripts must be checked AFTER languages; languages checks fill out
-- the nonempty_families and nonempty_scripts tables, used for testing if a family/script
-- is ever used in the data
check_families()
check_scripts()
check_wikidata_languages()
if modules["labels/data"] then
check_labels()
end
for module in pairs(modules) do
check_serialization(module)
end
setmetatable(messages, nil)
for _, msglist in pairs(messages) do
msglist:sort(compare_messages)
end
local ret = messages
messages = nil
return ret
end
local function format_message(modname, msglist)
local header
if match(modname, "^Module:") or match(modname, "^Template:") then
header = "===[[" .. modname .. "]]==="
else
header = "===[[Module:" .. modname .. "]]==="
end
return header .. msglist:map(function(msg)
return "\n* " .. msg
end):concat()
end
function export.check_modules_t(frame)
local args = frame.args
local modules = list_to_set(args)
local ret = Array()
local messages = do_checks(frame, modules)
for _, module in ipairs(args) do
local msglist = messages[module]
if msglist then
ret:insert(format_message(module, msglist))
end
end
return ret:concat("\n")
end
function export.perform(frame)
local messages = do_checks(frame, {})
-- Format the messages
local ret = Array()
for modname, msglist in sorted_pairs(messages) do
ret:insert(format_message(modname, msglist))
end
-- Are there any messages?
if next(messages) == nil then
return "<b class=\"success\">Glory to Arstotzka.</b>"
else
ret:insert(1, "<b class=\"warning\">Discrepancies detected:</b>")
return ret:concat("\n")
end
end
return export
c42m7m0j905eljkvxvftn433l22ax1o
ເຂັນ
0
243506
5720683
1890237
2026-04-20T21:01:08Z
Alifshinobi
397
5720683
wikitext
text/x-wiki
{{also/auto}}
== ภาษาลาว ==
=== การออกเสียง ===
{{lo-pron}}
=== รากศัพท์ 1 ===
ร่วมเชื้อสายกับ{{cog|th|เข็น}}
==== คำกริยา ====
{{lo-verb}}
# [[เข็น]]
#: {{syn|lo|ຍູ້|ດັນ}}
=== รากศัพท์ 2 ===
ร่วมเชื้อสายกับ{{cog|th|เข็ญ}}, {{cog|nod|ᨡᩮ᩠ᨶ}} หรือ {{m|nod|ᨡᩮᩢ᩠ᨶ}}, {{cog|kkh|ᨡᩮ᩠ᨶ}}, {{cog|khb|ᦃᦲᧃ}} หรือ {{m|khb|ᦵᦃᧃ}}, {{cog|shn|ၶဵၼ်}}, {{cog|aho|𑜁𑜢𑜃𑜫}}
==== คำคุณศัพท์ ====
{{lo-adj}}
# [[เข็ญ]], [[โชค]][[ร้าย]], [[อยู่]][[ใน]][[ภัย]][[อันตราย]]
==== คำนาม ====
{{lo-noun}}
# [[ความเข็ญ]], [[ความ]][[โชค]][[ร้าย]], [[ภัย]][[อันตราย]]
=== รากศัพท์ 3 ===
ร่วมเชื้อสายกับ{{cog|tts|เข็น}}
==== คำกริยา ====
{{lo-verb}}
# {{lb|lo|สกรรม}} [[ปั่น]] (ใช้แก่ฝ้ายหรือไหมเป็นต้น)
7hw8lrv983x516wkh3a0vvcthp01av2
ᥕᥒ
0
270949
5720744
5652919
2026-04-21T06:46:22Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720744
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== การออกเสียง ===
* {{IPA|tdd|/jaŋ˧˧/}}
=== คำกริยาวิเศษณ์ ===
{{tdd-adv}}
# [[ไม่]], [[ยัง]]ไม่
==== คำพ้องความ ====
* {{l|tdd|ᥟᥛᥱ}}
8he202zaum4ulqw4awvvaflafmxf7a0
ᥕᥝᥳ
0
270953
5720729
5715180
2026-04-21T05:18:37Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720729
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|shn|ယဝ်ႉ}}
=== การออกเสียง ===
* {{IPA|tdd|/jaw˦˧/}}
=== คำอนุภาค ===
{{tdd-part}}
# [[แล้ว]] (ใช้แสดงการกระทำที่เสร็จสิ้นไปแล้ว)
#: {{syn|tdd|ᥞᥝᥳ}}
i307aas1wnw1zg56zmhk0gs6w4c8eez
ᥞᥝᥰ
0
270976
5720727
1422750
2026-04-21T05:06:52Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720727
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*rawᴬ}}; ร่วมเชื้อสายกับ{{cog|shn|ႁဝ်း}}, {{cog|lo|ເຮົາ}}, {{cog|nod|ᩁᩮᩢᩣ}}, {{cog|khb|ᦣᧁ}}, {{cog|blt|ꪹꪭꪱ}}, {{cog|tts|เฮา}}, {{cog|aho|𑜍𑜧}}, {{m|aho|𑜍𑜈𑜫}} หรือ {{m|aho|𑜍𑜧𑜈𑜫}}, {{cog|pcc|rauz}}, {{cog|za|raeuz}}
=== การออกเสียง ===
* {{IPA|tdd|/haw˥˧/}}
=== คำสรรพนาม ===
{{tdd-pronoun}}
# [[เรา]], [[พวก]]เรา (รวมผู้ฟัง)
45lo2z4od1pvg745l6cd48xyqoxmn0b
ᥟᥛᥱ
0
270978
5720741
5652920
2026-04-21T06:33:47Z
Ai Ku Karng
17824
5720741
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|shn|ဢမ်ႇ}}
=== การออกเสียง ===
* {{IPA|tdd|/ʔam˩˩/}}
=== คำกริยาวิเศษณ์ ===
{{tdd-adv}}
# [[ไม่]]
==== คำพ้องความ ====
* {{l|tdd|ᥕᥒ}}
g94cxhj7vfvgu5f8794q9kn1ii30b91
ᥑᥤᥲ
0
271120
5720787
1650905
2026-04-21T07:20:24Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720787
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== การออกเสียง ===
* {{IPA|tdd|/xi˧˩/}}
=== รากศัพท์ 1 ===
{{inh+|tdd|tai-pro|*C̬.qɯjꟲ}}; ร่วมเชื้อสายกับ{{cog|th|ขี้}}, {{cog|nod|ᨡᩦ᩶}}, {{cog|lo|ຂີ້}}, {{cog|khb|ᦃᦲᧉ}}, {{cog|shn|ၶီႈ}}, {{cog|aho|𑜁𑜣}}, {{cog|za|haex}}, {{cog|skb|ไกฺ}}
==== คำนาม ====
{{tdd-noun}}
# [[ขี้]]
==== คำกริยา ====
{{tdd-verb}}
# [[ขี้]]
=== รากศัพท์ 2 ===
==== รูปแบบอื่น ====
* {{l|tdd|ᥔᥤᥲ}}
==== คำนาม ====
{{tdd-noun}}
# [[ซี่]]
6yayhtp9yrreee4genw7bns8p9xvoq4
ᥑᥤᥳ
0
271122
5720783
1422484
2026-04-21T07:17:21Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720783
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== การออกเสียง ===
* {{IPA|tdd|/xi˦˧/}}
=== คำนาม ===
{{tdd-noun}}
# [[ธง]]
7tn1c2brwaxgp7nvhrv11c717e82nrv
ᥐᥣᥓᥣᥒᥲ
0
274125
5720821
1422433
2026-04-21T08:06:08Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720821
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{com|tdd|ᥐᥣ|ᥓᥣᥒᥲ|t1=ค่า|t2=จ้าง}}
=== การออกเสียง ===
* {{IPA|tdd|/kaː˧˧.t͡saːŋ˧˩/}}
=== คำนาม ===
{{tdd-noun}}
# [[ค่าจ้าง]]
6la21mkeifmqt0p75pirwgzafohrp2d
ᥐᥣᥑᥢᥴ
0
274126
5720771
1422432
2026-04-21T07:03:29Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720771
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{com|tdd|ᥐᥣ|ᥑᥢᥴ|t1=ค่า|t2=ค่า}}
=== การออกเสียง ===
* {{IPA|tdd|/kaː˧˧.xan˨˦/}}
=== คำนาม ===
{{tdd-noun}}
# [[ค่า]], [[ราคา]]
pr9800nei8bmf3yx9n9oyo43uxyv9gd
มอดูล:place
828
283922
5720701
5715280
2026-04-21T01:50:08Z
OctraBot
3198
5720701
Scribunto
text/plain
local export = {}
local force_cat = false -- set to true for testing
local m_placetypes = require("Module:place/placetypes")
local m_links = require("Module:links")
local memoize = require("Module:memoize")
local m_strutils = require("Module:string utilities")
local m_table = require("Module:table")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local form_of_module = "Module:form of"
local languages_module = "Module:languages"
local parse_interface_module = "Module:parse interface"
local parse_utilities_module = "Module:parse utilities"
local parameter_utilities_module = "Module:parameter utilities"
local utilities_module = "Module:utilities"
local enlang = require(languages_module).getByCode("en")
local rmatch = m_strutils.match
local rfind = m_strutils.find
local ulen = m_strutils.len
local split = m_strutils.split
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local pluralize = require(en_utilities_module).pluralize
local extend = m_table.extend
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local internal_error = m_placetypes.internal_error
local process_error = m_placetypes.process_error
local placetype_data = m_placetypes.placetype_data
--[==[ intro:
===Introduction===
This module implements {{tl|place}}, which is a template for standardizing the description and categorization of
toponyms (terms that refer to locations such as cities, countries, rivers, etc.). The following modules support this
template:
* [[Module:place]]: The main module.
* [[Module:place/placetypes]]: A module containing data on placetypes, as well as utilities for working with placetypes;
category generation handlers for adding categories based on placetypes; and display handlers for displaying holonyms
(i.e. containing locations) of a specific type. FIXME: Maybe split out the code from the data.
* [[Module:place/locations]]: A module containing data on known locations, as well as utilities for working with
such locations. FIXME: Maybe split out the code from the data.
* [[Module:category tree/topic/Places]]: A category tree module for generating the descriptions of all
categories generated by {{tl|place}}.
* [[Module:place doc]]: A module that generates documentation tables describing known placetypes and locations.
===Basic terminology===
The basic terminology used in this and associated {{tl|place}} modules is:
* A ''location'' (or equivalently, a ''place'') is any geographic feature (either natural or geopolitical), either on
the surface of the Earth or elsewhere. Examples of types of natural places are rivers, mountains, seas and moons;
examples of types of geopolitical places are cities, countries, neighborhoods and roads. A ''known location'' is
specifically a location whose properties are specified in the {{tl|place}} modules; more on them below.
* Specific places are identified by names, referred to as ''toponyms'' or ''placenames''. A given place will often have
multiple names, and a given toponym may be ambiguous, referring to multiple possible locations. Specifically:
** There may be names including different amounts of disambiguating information (`Tucson` vs. `Tucson, Arizona` vs.
`Tucson, Arizona, USA` or `New York` vs. `New York City` vs. `New York, New York`); abbreviations (`NYC`
for `New York City`, `USA` for `United States of America`); ''official'' vs. ''short'' names (e.g.
`Union of Soviet Socialist Republics` vs. `Soviet Union`); spelling variations (`Cracow` vs. `Krakow` vs. `Kraków`);
current vs. former names (`Saint Petersburg` vs. `Leningrad` vs. `Petrograd`); [[exonym]]s vs. [[endonym]]s (e.g.
`Tavastia Proper` vs. `Kanta-Häme`, both referring to the same administrative region in Finland); alternative names
not due to any of the above reasons (`Bashkiria` vs. `Bashkortostan`); etc. In addition, each language that has an
opportunity to refer to the place will have its own name, with the same sorts of variations as exist in English.
** Examples of ambiguous toponyms are `New York` (either a city or a state); `Georgia` (either a state of the US or an
independent country in the Caucasus Mountains); `Paris` (either the capital of France or various small cities and
towns in the US); `Mexico` (either a country, a state of that country, or the capital city of that country); and
`San Antonio` (besides being a major city in Texas, it is the name of dozens of settlements of all sorts throughout
the US and Latin America, and at least 181 distinct [[barangay]]s in the Philippines).
* A ''placetype'' is the (or a) type that a location belongs to (e.g. `city`, `state`, `river`, `administrative region`,
`[[regional county municipality]]`, etc.).
** It is common for locations to be described using multiple placetypes, and even sometimes known locations have
multiple placetypes that they may be identified by (e.g. American Samoa can be identified either as an
`unincorporated territory`, an `overseas territory` or just a `territory`). Both the {{tl|place}} template and the
known location data allow a given location to be identified by multiple placetypes. When in doubt as to the correct
placetype or placetypes for a given location, generally follow how Wikipedia describes the place.
** Some placetypes themselves are ambiguous; e.g. an ''area'' can variously refer to a top-level administrative division
(specifically of Kuwait); a geographic region, generally without unambiguously defined borders; or a section of a
city, similar to a neighborhood. The term ''district'' is similarly ambiguous. A ''[[prefecture]]'' in the context of
Japan is similar to a province, but a prefecture in France is the capital of a ''[[department]]'' (which is similar
to a county). Some of this ambiguity is currently handled automatically; e.g. the ambiguity of areas and districts is
handled by looking at the ''holonyms'', or containing locations, specified for a given place. But sometimes it is
necessary to use a qualifier before the placetype to disambiguate; for example to refer to a French prefecture, use
the placetype `French prefecture` instead of just `prefecture`. (FIXME: Handle this automatically.)
* A ''holonym'', in the context of a description of a place, is a placename that refers to a larger-sized entity that
contains the location being described. For example, `Arizona` and `United States` are holonyms of `Tucson`, and
`United States` is a holonym of `Arizona`.
* A ''place invocation'' consists of the invocation of {{tl|place}}, including all its parameters. Place invocations
may contain one or more ''place descriptions'', each of which provides a description of the location, including its
placetype or types, any holonyms, and any additional raw text needed to properly explain the place in context. Place
invocations may also contain named parameters specifying zero or more English ''glosses'' or translations (for
foreign-language toponyms) and any attached ''extra information'' such as the capital, largest city, official name,
modern name or full name. Multiple place descriptions in a single invocation are separated by a numbered parameter
starting with a semicolon, and are used when it is necessary to provide two or more definitions of a single location
for proper categorization. For example, [[Vatican City]] is defined both as a city-state in Southern Europe and as an
enclave within the city of Rome, as follows:
: {{tl|place|en|city-state|r/Southern Europe|;,|an <<enclave>> within the city of [[Rome]], [[Italy]]|cat=Places in Rome|official=Vatican City State}}.
Similar things need to be done for places like [[Crimea]] that are claimed by two different countries with different
definitions and administrative structures.
** There are two types of place descriptions, ''new-style'' and ''old-style''. (The use of the terms "new" and "old"
indicates chronological precedence in the development of {{tl|place}}, but is not meant to pass any value judgments
on the two types, and does not indicate any intent to deprecate old-style descriptions. Both types of descriptions
are useful; for example, old-style descriptions are generally more succinct but less flexible.) The above invocation
shows both types: an old-style description followed by a new-style description. Old-style descriptions use multiple
numbered parameters, where the first parameter (after the language code) specifies the placetype or types, and
following parameters specify either holonyms (which are always of the form ` ``placetype``/``placename`` `) or raw
text (which is identifiable by not having a slash in it). New-style descriptions use a single parameter, where both
placetypes and holonyms are surrounded by double angle brackets, and all remaining text is raw (displayed as-is). In
both types of descriptions, holonyms include a slash in them to separate the placetype (which is mandatory and often
abbreviated) from the placename.
** In the context of a place description, there are two types of placetypes. The ''entry placetypes'' are the placetypes
of the place being described, while the ''holonym placetypes'' are the placetypes of the holonyms that the place
being described is located within. Currently, a given place can have multiple placetypes specified (e.g. [[Normandy]]
is specified using the ''compound placetype'' `administrative region/former province/and/medieval kingdom`) while a
given holonym can have only one placetype associated with it. Holonym placetypes are frequently abbreviated (e.g.
`r` for `region`, `s` for `state`, `co` for `county`, etc.), while stylistically it is preferred to spell out the
entry placetype (except for some long placetypes with well-known abbreviations, such as `CDP` or `cdp` for
`[[census-designated place]]`).
** All holonyms in place descriptions are automatically linked as if surrounded by {{tl|l|en|...}}; i.e. if double
brackets do not occur in the holonym, the entire holonym will be linked to the corresponding Wiktionary article. For
this reason, the holonym should generally be in the same format as the canonical Wiktionary article describing the
location (see below).
* A ''known location'' is a location whose properties are specifically defined in the {{tl|place}} modules. Generally
each such location has an associated category, and known locations exist in a containment hierarchy, where the
immediately containing known location is known as the ''container'' of the location and the chain of successive
containing locations is known as the ''container trail''. Generally the location's container corresponds to the first
parent of its category. Note that some known locations belong to more than one immediate container; for example,
Russia belongs to both Europe and Asia.
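The old-style holonym format described above (` ``placetype``/``placename`` `, with raw text identified by the absence of a slash) can be illustrated with a minimal sketch. This is not the code used by this module; the function name `split_holonym` and the small alias table are hypothetical, and only the abbreviations named in the text above are included.

```lua
-- Hypothetical alias table for illustration only; the real placetype data
-- lives in [[Module:place/placetypes]] and is far larger.
local placetype_aliases = { r = "region", s = "state", co = "county" }

-- Split an old-style argument into (placetype, placename).
-- Returns nil for raw text, i.e. an argument that contains no slash.
local function split_holonym(arg)
	local placetype, placename = arg:match("^([^/]+)/(.+)$")
	if not placetype then
		return nil
	end
	-- Expand an abbreviated placetype if we know it; otherwise keep it as given.
	return placetype_aliases[placetype] or placetype, placename
end
```

Under these assumptions, `split_holonym("r/Southern Europe")` yields `region` and `Southern Europe`, while an argument without a slash is treated as raw text.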
===More about placetypes===
# The following general categories of placetypes exist:
## ''Natural features'' such as lakes, mountains, mountain ranges, islands, archipelagoes, moons, stars, asteroids, etc.
## ''Continents'', ''supercontinents'' (groupings of continents where it makes sense, such as `America` and `Eurasia`)
and ''continent-level regions'' (groupings of countries in a given continent, such as `Central America` and
`Polynesia`).
## ''Political entities'', which are generally classified as either ''polities'' (top-level entities such as countries),
''subpolities'' or ''political divisions'' (non-sovereign divisions, often specifically ''administrative divisions'',
of a polity, where an administrative division has a governmental or statistical function and almost always has
unambiguously defined boundaries), or ''settlements'' (e.g. cities; towns; villages; and divisions of a city such as
neighborhoods, wards, [[barrio]]s and [[barangay]]s, which may or may not be formal administrative divisions and
may or may not have unambiguous boundaries).
## ''Geographic regions'', which refer to recognized areas of the Earth (either with a natural geographic, political or
cultural significance, often of a historical nature). Such regions can be of greatly varying size, may exist either
within a single country or spanning multiple countries or (more often) parts of multiple countries, and may not have
well-defined boundaries. They should be distinguished from ''administrative regions'', which exist within a single
country and have well-defined boundaries and a political or administrative function. Geographic regions are
categorized using the generic term ''geographic and cultural areas'' to emphasize that (a) they have no
administrative significance; (b) they may vary greatly in size; and (c) their cohesion is due either to natural
geographic boundaries, such as rivers or mountain ranges, or to sharing some cultural characteristics.
## ''Man-made structures'' below the level of a settlement or neighborhood, such as airports, roads, individual
buildings, and the like. (Note that such structures, even if named, often do not meet the [[WT:CFI]] criteria; this
is particularly the case for roads.)
# Placetypes support aliases, and the mapping to canonical form happens early on in the processing. For example, `state`
can be abbreviated as `s`; `administrative region` as `adr`; `regional county municipality` as `rcomun`; etc. Some
placetype aliases handle alternative spellings rather than abbreviations. For example, `departmental capital` maps to
`department capital`, and `home-rule city` maps to `home rule city`. Placetype abbreviations are particularly useful
in holonym specs, because every holonym must be accompanied by its placetype, for disambiguation purposes.
# A ''placetype qualifier'' is an adjective prepended to the placetype to give additional information about the
place being described. For example, a given place may be described as a `small city`; logically this is still a city,
but the qualifier `small` gives additional information about the place. Multiple qualifiers can be stacked, e.g.
`small affluent beachfront unincorporated community`, where `unincorporated community` is a recognized placetype and
`small`, `affluent` and `beachfront` are qualifiers. (As shown here, it may not always be obvious where the qualifiers
end and the placetype begins.) For the most part, placetype qualifiers do not affect categorization; a `small city`
is still a city and an `affluent beachfront unincorporated community` is still an unincorporated community, and both
should still be categorized as such. But some qualifiers do change the categorization. In particular, a
`former province` is no longer a province and should not be categorized in e.g. [[:Category:Provinces of Italy]], but
instead in a different set of categories, e.g. [[:Category:Historical political subdivisions]]. There are several
terms treated as equivalent for this purpose: `abandoned`, `ancient`, `extinct`, `historic(al)`, `medi(a)eval` and
`traditional`. Another set of qualifiers that change categorization are `fictional` and `mythological`, which cause
any term using the qualifier to be categorized respectively into [[:Category:Fictional locations]] and
[[:Category:Mythological locations]].
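The stacking of placetype qualifiers described above can be sketched as follows. This is an illustrative sketch only, not the algorithm used by this module: the function name `split_qualifiers` and the tiny `known_placetypes` table are hypothetical. It strips leading words one at a time until the remainder is a recognized placetype.

```lua
-- Hypothetical set of recognized placetypes, for illustration only.
local known_placetypes = {
	["city"] = true,
	["province"] = true,
	["unincorporated community"] = true,
}

-- Returns the recognized placetype and a list of the qualifiers that
-- preceded it, or nil if no suffix of the description is recognized.
local function split_qualifiers(description)
	local qualifiers = {}
	local rest = description
	while not known_placetypes[rest] do
		local first, remainder = rest:match("^(%S+)%s+(.+)$")
		if not first then
			return nil -- ran out of words without finding a placetype
		end
		table.insert(qualifiers, first)
		rest = remainder
	end
	return rest, qualifiers
end
```

With this sketch, `small affluent beachfront unincorporated community` separates into the placetype `unincorporated community` and the qualifiers `small`, `affluent` and `beachfront`. Deciding which qualifiers change categorization (such as `former`) would be a separate step.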
===More about toponyms===
# Toponyms may be:
## ''simple'' (not including any containing location in its name, such as `Tucson`) or ''multipart'' (including one or
more containing locations, such as `Tucson, Arizona` or `Tucson, USA` or even `Tucson, Arizona, USA`);
## ''bare'' (not including the word `the` if the location normally requires this article when following a preposition,
such as `United States`, `Gambia` or `Community of Madrid`) or ''prefixed'' (including the word `the` as needed, such
as `the United States`, `the Gambia` or `the Community of Madrid`);
## ''elliptical'' (just the placename without any disambiguating placetype, such as `Durham`, `New York` or `Mexico`) or
''full'' (containing a disambiguating placetype or similar identifier if one is commonly included, such as
the city of `Durham` (in England) vs. its containing county `County Durham`; the US city `New York City` vs. its
containing state `New York`; or the three-way distinction between `Mexico` (the country), `Mexico City` (the capital
of this country) and `(the) State of Mexico` (one of the states of the country Mexico, mostly surrounding but not
including Mexico City)).
# The ''canonical Wiktionary article'' is the main article on Wiktionary where a location is described. Canonical
articles, per the above terminology, are generally ''simple'' and ''bare'', but may be either ''full'' or
''elliptical''. The fact that a given article is canonical is often identifiable by the fact that translations are
housed there and not somewhere else. For example, most counties of the US and Canada include the word `County` in their
canonical article name, but most counties elsewhere do not. `Washington, D.C.` is one of the few cases where a
non-simple toponym is used as the canonical article; this is based on common usage, especially by residents of the
city in question (who commonly refer to it as "D.C." but rarely just as "Washington").
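The ''bare'' vs. ''prefixed'' distinction above can be illustrated with a minimal sketch. The table `needs_article` and the function `to_prefixed` are hypothetical names chosen for this example, and the table lists only the placenames mentioned in the text; how the module itself records this property is not shown here.

```lua
-- Hypothetical list of bare toponyms that take "the" after a preposition.
local needs_article = {
	["United States"] = true,
	["Gambia"] = true,
	["Community of Madrid"] = true,
}

-- Convert a bare toponym to its prefixed form.
local function to_prefixed(bare)
	if needs_article[bare] then
		return "the " .. bare
	end
	return bare
end
```

For example, `to_prefixed("Gambia")` gives `the Gambia`, while a toponym such as `Tucson` is returned unchanged.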
===More about known locations===
# The following types of known locations are defined in this module:
## Continents, supercontinents and continent-level regions, into which countries are grouped. Specifically:
### At the top level below `Earth` are the supercontinents `America` and `Eurasia` and the continents `Africa`,
`Oceania` and `Antarctica`.
### `America` is further broken down into the continents `North America` (in turn containing the continental regions
`Central America` and `Caribbean`, with the United States, Canada and Mexico directly under North America) and
`South America`.
### `Eurasia` is further broken down into the continents `Europe` and `Asia`.
### `Oceania` is further broken down into the continental regions `Melanesia`, `Micronesia` and `Polynesia`, with
`Australia` directly under `Oceania`.
### Under the above-specified divisions are countries. Some countries are placed in more than one continent or
continent-level region, either because they actually span two continents (e.g. Russia, Turkey, Kazakhstan, Egypt) or
because they are politically considered to belong to a continent different from the one they are geographically in
(Cyprus, Georgia, Armenia, etc.).
## Political entities, including:
### Top-level political entities, which include:
#### Countries, with a fairly liberal definition, notably including all UN-recognized countries plus some others that
are commonly considered countries, even if not all other countries recognize them as such or consider them
completely independent (notably, Kosovo, Palestine, Taiwan, Western Sahara, Niue and the Cook Islands).
#### Pseudo-countries, which include areas calling themselves countries that are de-facto not under the control of the
country that they are internationally considered part of (e.g. Abkhazia, South Ossetia, Transnistria);
dependent/external/etc. territories of countries (e.g. American Samoa [US], Bermuda [UK], Christmas Island
[Australia], Easter Island [Chile]); constituent countries, autonomous territories and the like (Aruba, Curaçao and
Sint Maarten of the Netherlands; Greenland and the Faroe Islands of Denmark; etc.; but notably not including
England, Scotland, Northern Ireland and Wales, which are treated as regular countries); and a grab bag of other
entities that have a semi-independent existence, such as Hong Kong, Macau, Guadeloupe, Martinique and the like.
Currently, the actual distinction in treatment between "countries" and "country-like entities" is minimal, but in
the future we might restrict the sorts of subcategories of country-like entities more than regular countries.
#### Former countries, e.g. the Soviet Union, Yugoslavia, West Germany and the Roman Empire. These are much more limited
in the sorts of subcategories allowed, because generally locations, especially cities, should be described from the
perspective of which political entity they are currently located in (e.g. "an ancient Roman town in modern Syria")
and categorized as such.
### Subpolities. Generally we only list top-level administrative divisions of countries (and only fairly major countries
are usually included), but sometimes we list second-level administrative divisions, as in the case of the
United Kingdom (where the top-level administrative divisions of the four constituent countries are listed) and China
(where major prefecture-level cities are listed, and are considered administrative divisions rather than cities).
### Cities. Only major cities get categories, with the definition of "major" varying by country but often including
those where the city population itself (sometimes the metro area) is >= 1,000,000 people.
# A distinction should be made in the {{tl|place}} modules between ''keys'' and ''placenames''. Placenames are the forms in
which a location appears in a holonym, and are generally in the same format as the canonical Wiktionary article describing the
location so that when formatted as a link, the link goes to the right article; i.e. they are simple and bare, and may
be full or elliptical according to Wiktionary conventions. The ''canonical key'' of a location is how the location's
category is named, and always uniquely identifies the location from among the known locations in this module (but
not necessarily among all possible locations). In particular, subpolities usually have multipart keys that include the
containing location, such as `Anhui, China` (not just `Anhui`); `Arizona, USA` (not just `Arizona`, and also not
`Arizona, United States`); and `Herefordshire, England` (not just `Herefordshire`, and also in this case not
`Herefordshire, UK` or `Herefordshire, England, UK` or any other possible variation). Cities are normally simple, but
some cities are multipart for disambiguation purposes (e.g. `Newcastle, New South Wales` for the city in Australia vs.
`Newcastle upon Tyne` for the identically-named city in England). Canonical keys may have ''key aliases'', other
ways of referring to the location that are not necessarily unique (e.g. `Newcastle` is a key alias for both of the
above-mentioned cities), and city keys with diacritics generally have diacriticless aliases, such as canonical key
`Düsseldorf` vs. key alias `Dusseldorf`, or canonical key `Łódź` vs. key alias `Lodz`.
# Known locations are gathered into ''groups'' with similar properties, such as all the states of the United States;
all the (ceremonial) counties of England (see below); and all the "sufficiently major" prefecture-level cities in
China (where a prefecture-level city is a prefecture surrounding a major city with a unified government and is more
like a prefecture, i.e. a major administrative division just underneath a province, than like a city, and where
"sufficiently major" is defined according to the population of either the total prefecture or the urban area of the
city). Note that there are multiple types of counties in England, with overlapping but non-identical names and
boundaries; there are, in particular, ''ceremonial counties'', ''local government counties'' and ''historic
counties''; ''ceremonial counties'' have only ceremonial administrative functionality but unlike local government
counties (a) don't frequently change their boundaries or nature, (b) correspond more closely to historic county
boundaries and names, and (c) are what English people usually identify themselves with, and so they are used as top-level
divisions rather than local government counties.
# Some known locations have ''aliases'' defined, which are of two types. ''Display aliases'' map holonyms to their
canonical form near the beginning of processing (in particular before the displayed output is formatted). For example,
`US`, `U.S.`, `USA`, `U.S.A.` and `United States of America` are all canonicalized to `United States` (if identified
as a country), and display as `United States`. Similarly, the foreign forms `Occitanie` (as a region or administrative
region) and `Noord-Brabant` (as a province) are mapped to `Occitania` and `North Brabant` for display purposes. There
are also ''category aliases'', so that if e.g. `Republic of Macedonia` is encountered, it will display as such but
categorize as `North Macedonia`. (This is because, among other reasons, `Republic of Macedonia` is normally preceded
by `"the"` while `North Macedonia` is not, so a call {{tl|place|en|a <<city>> in the <<c/Republic of Macedonia>>}}
would look wrong if `Republic of Macedonia` were converted to `North Macedonia` during display, as the result would be
`a city in the North Macedonia`. There are also frequently political connotations to different category aliases, e.g.
`Burma` vs. `Myanmar`.) All of these aliases are sensitive to the placetype specified. For example, `Mexico` as a
state is categorized under `State of Mexico, Mexico` but `Mexico` the country is categorized as just `Mexico`.
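The difference between display aliases and category aliases, and their sensitivity to placetype, can be sketched as below. This is a simplified illustration under stated assumptions, not the data layout of [[Module:place/locations]]: the two tables, their `placetype/placename` keying, and the function `resolve` are all hypothetical, and the entries are limited to examples taken from the text above.

```lua
-- Hypothetical alias tables keyed by "placetype/placename".
local display_aliases = {
	["country/US"] = "United States",
	["country/USA"] = "United States",
	["province/Noord-Brabant"] = "North Brabant",
}
local category_aliases = {
	["country/Republic of Macedonia"] = "North Macedonia",
	["state/Mexico"] = "State of Mexico, Mexico",
}

-- Returns the name to display and the name to categorize under.
local function resolve(placetype, placename)
	-- Display aliases rewrite the name before output is formatted.
	local display = display_aliases[placetype .. "/" .. placename] or placename
	-- Category aliases leave the display alone and affect only the category.
	local category = category_aliases[placetype .. "/" .. display] or display
	return display, category
end
```

In this sketch, `resolve("country", "Republic of Macedonia")` displays as `Republic of Macedonia` but categorizes as `North Macedonia`, and `Mexico` resolves differently depending on whether the placetype is `state` or `country`.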
===Categories===
There are two main types of categories:
# Categories for known locations, divided into:
## Top-level polity categories (e.g. [[:Category:United States]], [[:Category:Taiwan]], [[:Category:South Ossetia]],
[[:Category:Bermuda]], [[:Category:Soviet Union]], [[:Category:West Germany]]).
## Subpolity categories ([[:Category:Arizona, USA]], [[:Category:Hunan]], [[:Category:Kagoshima Prefecture]],
[[:Category:Cluj County, Romania]]). For historical reasons, different formats are used for the subpolities of
different polities. Increasingly, we are moving towards always including the polity name in the subpolity category,
but whether the subpolity type is included and where it is included (cf. [[:Category:Cluj County, Romania]] vs.
[[:Category:County Cork, Ireland]]) is still inconsistent and will probably remain that way, based on how the
subpolity is normally referred to.
## City categories ([[:Category:Tokyo]], [[:Category:New York City]], [[:Category:Jaipur]]). Normally these do not
include the containing subpolity, but may do so in order to disambiguate.
# Categories for placetypes, divided into:
## "Immediate" political and non-political division categories ([[:Category:States of the United States]],
[[:Category:Municipalities of Tocantins, Brazil]], [[:Category:Ghost towns in Arizona, USA]]). These are name
categories, whose purpose is to contain locations of the specified type. "Immediate" here refers to the fact that
the location in the category name is the immediately-containing polity. Usually these categories use the preposition
"of", but sometimes "in". (Specifically, "of" typically implies that the placetype in question has an official or
semi-official status, whereas "in" implies there is no such official status, but common usage may override this.)
The form of the toponym appearing in these categories is always the same as that of the corresponding toponym
category except that the word "the" may appear (e.g. [[:Category:States of the United States]]), whereas it doesn't
appear in the toponym category itself ([[:Category:United States]], no "the").
## "Skip-polity" categories for second-level political and non-political divisions of a country or other top-level
polity (e.g. [[:Category:Counties of the United States]], [[:Category:Municipalities of Brazil]] and
[[:Category:Subprefectures of Japan]]). These have several purposes:
* They group the immediate division categories mentioned previously.
* They categorize "straggler" toponyms that (often improperly) fail to mention the subpolity they belong to, but
only the top-level polity.
* If categories do not exist for the first-level divisions of a country (and sometimes even when they do), they group
all toponyms of the specified type for the specified country. For example, Lithuania is divided into first-level
counties and second-level municipalities, but since we don't currently have categories for Lithuanian counties,
all municipalities go under [[:Category:Municipalities of Lithuania]] rather than under a category for a specific
county. In addition, even though we do have categories for Japanese prefectures (a first-level division), all
subprefectures (a second-level division) go under [[:Category:Subprefectures of Japan]] because there aren't very
many of them (see below).
## "Generic placetype" categories, both of the immediate and skip-polity type (immediate
[[:Category:Cities in California, USA]] and [[:Category:Neighborhoods of the Bronx]]; skip-polity
[[:Category:Villages in Ivory Coast]], [[:Category:Geographic and cultural areas of England]],
[[:Category:Rivers in Egypt]] and [[:Category:Places in the Philippines]]). As mentioned above, "generic" placetypes
occur in every polity (although the set of generic placetypes allowed for cities is a subset of those allowed for
top-level polities and subpolities). Usually these categories use the preposition "in", but sometimes "of". As above,
skip-polity categories group immediate categories, and in addition there are various reasons a toponym entry is
categorized into a skip-polity category. (For example, as a general rule, geographic and cultural areas only
categorize at the country level, not the subpolity level, both because there often aren't very many in a given
country and because they often span multiple subpolities.)
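As a rough illustration of the naming convention described above, the following sketch assembles a placetype category name from its parts. The function `placetype_category` and its parameters are hypothetical and do not correspond to anything in the module; the only point is that "the" is inserted in the placetype category but is absent from the bare toponym category.

```lua
-- Hypothetical helper (not part of the module): build a placetype category
-- name such as "States of the United States" from its components.
local function placetype_category(plural_placetype, preposition, toponym, needs_the)
	-- The toponym category itself never includes "the"; it is only added here.
	local location = needs_the and ("the " .. toponym) or toponym
	return plural_placetype .. " " .. preposition .. " " .. location
end

print(placetype_category("States", "of", "United States", true))
print(placetype_category("Ghost towns", "in", "Arizona, USA", false))
```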
The parent categories of a given category depend on its type. Generally, location categories have placetype categories
as their first parent, and vice-versa. Specifically:
# Top-level country categories have as their parent e.g. [[:Category:Countries in Europe]],
[[:Category:Countries in Central America]] or [[:Category:Countries in Polynesia]], using the most specific
continental-level region the country is contained in.
# Pseudo-countries are under [[:Category:Country-like entities]] as a neutral designation. There aren't enough of them
to subcategorize under continent-level regions.
# Former countries are under [[:Category:Former countries and country-like entities]].
# Subpolity categories are usually under a placetype category whose placetype is the canonical (first-listed) placetype
of the subpolity and whose toponym is the immediately containing polity, but there are exceptions. Specifically,
sometimes if a polity has multiple types of subpolities, they are combined (e.g. [[:Category:States and territories of
Australia]], [[:Category:Federal subjects of Russia]]). In addition, sometimes a less specific but more identifiable
placetype is used instead of the canonical one (e.g. [[:Category:Regions of France]] when the canonical placetype is
"administrative region"). The same rules and exceptions generally apply when categorizing subpolities themselves; e.g.
both the Australian state of Queensland and territory of Northern Territory go under
[[:Category:en:States and territories of Australia]] rather than separately under [[:Category:en:States of Australia]]
and [[:Category:en:Territories of Australia]]. In addition, sometimes subpolities may "skip a level" if there aren't
very many. For example, there are only 26 subprefectures of Japan (14 under Hokkaido and 12 more scattered under five
other prefectures). Rather than have e.g. [[:Category:en:Subprefectures of Kagoshima Prefecture]] containing at most
two entries and [[:Category:en:Subprefectures of Miyazaki Prefecture]] containing at most one, they are all grouped
under the so-called "skip-subpolity category" [[:Category:en:Subprefectures of Japan]].
# City categories are always under e.g. [[:Category:Cities in the United States]] (e.g. [[:Category:New York City]] is
so-placed, even though [[:Category:Cities in New York, USA]] exists). However, they may have a second, more-specific
parent (e.g. [[:Category:Cities in New York, USA]] in the case of New York City). The city entries themselves will
go under the more specific parent if it exists.
# Immediate placetype categories for second-level divisions of a country generally have two parents: a
"toponym parent" that is the toponym mentioned in the category and a "skip-polity parent" that groups all subpolity
placetype categories of a specific type and containing polity. For example, [[:Category:Counties of Arizona, USA]] has
toponym parent [[:Category:en:Arizona, USA]] and skip-polity parent [[:Category:en:Counties of the United States]].
Sometimes the default skip-polity parent is overridden or disabled entirely. For example, in the US, most states are
divided into counties but Louisiana is divided into parishes and Alaska into boroughs. It would make no sense to put
[[:Category:Parishes of Louisiana, USA]] under [[:Category:Parishes of the United States]] (which would only have one
subcategory), so we include them under [[:Category:Counties of the United States]]. An alternative would be to name
the skip-polity category to explicitly include parishes and boroughs; this would get awkward here but is done in some
cases. Similarly, [[:Category:Regional county municipalities of Quebec]] is placed under
[[:Category:Regional municipalities of Canada]] since that name is used in other provinces. Meanwhile,
[[:Category:Regional districts of British Columbia]] disables its skip-polity category since no other province or
territory of Canada has regional districts or comparable subpolities under a different name (an alternative would be
to place them under [[:Category:Counties of Canada]], since they are sort of comparable to counties).
# Placetype categories for first-level divisions of a country (e.g. [[:Category:States of the United States]]) similarly
have a toponym parent (in this case [[:Category:United States]]), but in place of the skip-polity parent they have two
other parents: a "bare placetype" parent (in this case [[:Category:States]]) and the "generic" parent
[[:Category:Political divisions of specific countries]]. (There is also a bare [[:Category:Political divisions]]
that groups "bare placetype" categories.) Skip-polity placetype categories for second-level divisions of a country
(e.g. [[:Category:Counties of the United States]]) work the same. Placetype categories for countries work likewise
except they are missing the generic parent.
===Place descriptions===
A given place description is defined internally in a table of the following form:
```{
placetypes = {"``placetype``", "``placetype``", ...},
holonyms = {
{ -- holonym object; see below
placetype = "``placetype``" or nil,
display_placename = "``placename``",
unlinked_placename = "``placename``",
langcode = "``langcode``" or nil,
no_display = BOOLEAN,
needs_article = BOOLEAN,
force_the = BOOLEAN,
affix_type = "``affix_type``" or nil,
pluralize_affix = BOOLEAN,
suppress_affix = BOOLEAN,
continue_cat_loop = BOOLEAN,
},
...
},
order = { ``order_item``, ``order_item``, ... }, -- (only for new-style place descriptions)
joiner = "``joiner_string``" or nil,
holonyms_by_placetype = {
``holonym_placetype`` = {"``placename``", "``placename``", ...},
``holonym_placetype`` = {"``placename``", "``placename``", ...},
...
},
}```
Holonym objects have the following fields:
* `placetype`: The canonicalized placetype if specified as e.g. `c/Australia`; nil if no slash is present (in which case
the placename in `display_placename` refers to raw text).
* `display_placename`: The placename or raw text, in the format to be displayed. Placename display aliases have already
been resolved. It is raw text if `placetype` is nil.
* `unlinked_placename`: Same as `display_placename` but with links and HTML removed.
* `langcode`: The language code prefix if specified as e.g. `c/fr:Australie`; otherwise nil.
* `no_display`: If true (holonym prefixed with !), don't display the holonym but use it for categorization.
* `needs_article`: If true, prepend an article if the placename needs one (e.g. `United States`).
* `force_the`: If true, always prepend the article `the`. Example use: holonym `city:pref:the/Gold Coast`, which gets
formatted as `(the) city of the [[Gold Coast]]`.
* `affix_type`: Type of affix to prepend (values `pref` or `Pref`) or append (values `suf` or `Suf`). The actual affix
added is the placetype (capitalized if values `Pref` or `Suf` are given), or its plural if
`pluralize_affix` is given. Note that some placetypes (e.g. `district` and `department`) have inherent
affixes displayed after (or sometimes before) them.
* `pluralize_affix`: Pluralize any displayed affix. Used for holonyms like `c:pref/Canada,US`, which displays as
`the countries of Canada and the United States`.
* `suppress_affix`: Don't display any affix even if the placetype has an inherent affix. Used for the non-last
placenames when there are multiple and a suffix is present, and for the non-first placenames when
there are multiple and a prefix is present.
* `continue_cat_loop`: If true (holonym used :also), continue producing categories starting with this holonym when
preceding holonyms generated categories.
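To make the field list above more concrete, here is a simplified sketch of how a single holonym argument might be turned into a holonym object. This is not the module's actual parser: the function `parse_holonym`, the tiny `placetype_aliases` table and the assumption that `!` comes at the very start of the argument are all illustrative, and the sketch ignores multiple comma-separated placenames, display aliases, `needs_article` and raw text that happens to contain a slash.

```lua
-- Hypothetical, abbreviated alias table; the real aliases live in
-- [[Module:place/placetypes]].
local placetype_aliases = { c = "country", s = "state", p = "province" }

local function parse_holonym(spec)
	local holonym = {}
	-- Assumption: a leading "!" means "categorize but do not display".
	if spec:sub(1, 1) == "!" then
		holonym.no_display = true
		spec = spec:sub(2)
	end
	local before, after = spec:match("^([^/]*)/(.*)$")
	if not before then
		-- No slash: raw text, so `placetype` stays nil.
		holonym.display_placename = spec
		holonym.unlinked_placename = spec
		return holonym
	end
	-- Split e.g. "city:pref:the" into the placetype and its modifiers.
	local parts = {}
	for part in before:gmatch("[^:]+") do
		parts[#parts + 1] = part
	end
	holonym.placetype = placetype_aliases[parts[1]] or parts[1]
	for i = 2, #parts do
		local mod = parts[i]
		if mod == "also" then
			holonym.continue_cat_loop = true
		elseif mod == "the" then
			holonym.force_the = true
		elseif mod == "pref" or mod == "Pref" or mod == "suf" or mod == "Suf" then
			holonym.affix_type = mod
		end
	end
	-- Optional two- or three-letter language prefix, as in "fr:Australie".
	local langcode, name = after:match("^(%l%l%l?):(.+)$")
	if langcode then
		holonym.langcode = langcode
		after = name
	end
	holonym.display_placename = after
	-- Drop "[[target|" prefixes, then any remaining "[[" and "]]".
	local unlinked = after:gsub("%[%[[^%]|]*|", ""):gsub("%[%[", ""):gsub("%]%]", "")
	holonym.unlinked_placename = unlinked
	return holonym
end
```

For instance, under these assumptions `parse_holonym("c:also/Poland")` yields a holonym with `placetype = "country"` and `continue_cat_loop = true`, while `parse_holonym("and")` yields raw text with no placetype.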
Note that new-style place descriptions (those specified as a single argument using <<...>> to denote placetypes, placetype
qualifiers and holonyms) have an additional `order` field to properly capture the raw text surrounding the items
denoted in double angle brackets. The ``order_item`` items in the `order` field are objects of the following form:
```{
type = "``order_type``",
value = "STRING" or INDEX,
}```
Here, the ``order_type`` is one of `"raw"`, `"qualifier"`, `"placetype"` or `"holonym"`:
* `"raw"` is used for raw text surrounding `<<...>>` specs.
* `"qualifier"` is used for `<<...>>` specs without slashes in them that consist only of qualifiers (e.g. the spec
`<<former>>` in `<<former>> French <<colony>>`).
* `"placetype"` is used for `<<...>>` specs without slashes that do not consist only of qualifiers.
* `"holonym"` is used for holonyms, i.e. `<<...>>` specs with a slash in them.
For all types but `"holonym"`, the value is a string, specifying the text in question. For `"holonym"`, the value is a
numeric index into the `holonyms` field.
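The following sketch shows how the `order` field could be walked to reassemble the text of a new-style place description. It is only an illustration of the data layout: the function `format_order` is hypothetical, and real display code would also handle linking, articles and affixes, which are ignored here.

```lua
-- Hypothetical walker over the `order` field of a new-style place description.
local function format_order(desc)
	local parts = {}
	for _, item in ipairs(desc.order) do
		if item.type == "holonym" then
			-- For holonyms, `value` is a numeric index into `desc.holonyms`.
			parts[#parts + 1] = desc.holonyms[item.value].display_placename
		else
			-- "raw", "qualifier" and "placetype" items carry their text directly.
			parts[#parts + 1] = item.value
		end
	end
	return table.concat(parts)
end

-- Illustrative description built by hand for this sketch.
local order_example = {
	holonyms = { { placetype = "country", display_placename = "France" } },
	order = {
		{ type = "qualifier", value = "former" },
		{ type = "raw", value = " French " },
		{ type = "placetype", value = "colony" },
		{ type = "raw", value = " of " },
		{ type = "holonym", value = 1 },
	},
}
print(format_order(order_example))
```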
It should be noted that placetypes and placenames occurring inside the holonyms structure are canonicalized, but
placetypes inside the placetypes structure are as specified by the user. Stripping off of qualifiers and
canonicalization of qualifiers and bare placetypes happens later.
The information under `holonyms_by_placetype` is redundant to the information in holonyms but makes categorization
easier. The holonym placenames listed here already have category aliases applied.
For example, the call {{tl|place|en|city|s/Pennsylvania|c/US}} will result in the return value
```{
placetypes = {"city"},
holonyms = {
{ placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania" },
{ placetype = "country", display_placename = "United States", unlinked_placename = "United States" },
},
holonyms_by_placetype = {
state = {"Pennsylvania"},
country = {"United States"},
},
}```
Here, the placetype aliases `s` and `c` have been expanded into `state` and `country` respectively, and the placename
display alias `US` has been expanded into `United States`. The `placetypes` field is a list because there may be more than one. For
example, the call {{tl|place|en|city/and/municipality|p/[[Kwango]] Province|c/Congo}} will result in the return value
```
{
placetypes = {"city", "and", "municipality"},
holonyms = {
{ placetype = "province", display_placename = "[[Kwango]] Province", unlinked_placename = "Kwango Province" },
{ placetype = "country", display_placename = "Congo", unlinked_placename = "Congo" },
},
holonyms_by_placetype = {
country = {"Congo"},
},
}```
Here, the `unlinked_placename` field has removed links from `display_placename`.
The value in each key/value pair of `holonyms_by_placetype` is likewise a list; e.g. the call {{tl|place|en|city|s/Kansas|and|s/Missouri}} will
return
```
{
placetypes = {"city"},
holonyms = {
{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
{ display_placename = "and", unlinked_placename = "and" },
{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
},
holonyms_by_placetype = {
state = {"Kansas", "Missouri"},
},
}
```
Note that in `get_cats()` (which runs after the display form has been generated), further changes to the holonym
structure are made to aid in categorization. For example, after `handle_category_implications()` and
`augment_holonyms_with_container()` are called, the above structure will look more like
```
{
placetypes = {"city"},
holonyms = {
{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
{ placetype = "country", unlinked_placename = "United States" },
{ display_placename = "and", unlinked_placename = "and" },
{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
{ placetype = "country", unlinked_placename = "United States" },
},
holonyms_by_placetype = {
state = {"Kansas", "Missouri"},
country = {"United States"}
},
}
```
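Since `holonyms_by_placetype` is redundant to `holonyms`, it can in principle be derived from it. The sketch below shows one way to do so; the function `group_by_placetype` is hypothetical, and it uses `unlinked_placename` directly, whereas the module also applies category aliases at this point.

```lua
-- Hypothetical derivation of `holonyms_by_placetype` from `holonyms`.
local function group_by_placetype(holonyms)
	local by_placetype = {}
	for _, holonym in ipairs(holonyms) do
		-- Raw-text holonyms have no placetype and are skipped.
		if holonym.placetype then
			local names = by_placetype[holonym.placetype]
			if not names then
				names = {}
				by_placetype[holonym.placetype] = names
			end
			-- Record each placename only once per placetype.
			local seen = false
			for _, name in ipairs(names) do
				if name == holonym.unlinked_placename then
					seen = true
					break
				end
			end
			if not seen then
				names[#names + 1] = holonym.unlinked_placename
			end
		end
	end
	return by_placetype
end

-- The augmented Kansas/Missouri holonyms from the example above.
local augmented = {
	{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
	{ placetype = "country", unlinked_placename = "United States" },
	{ display_placename = "and", unlinked_placename = "and" },
	{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
	{ placetype = "country", unlinked_placename = "United States" },
}
local grouped = group_by_placetype(augmented)
print(table.concat(grouped.state, ", "), table.concat(grouped.country, ", "))
```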
===Overall place specs===
The overall place spec parsed by `parse_overall_place_spec` has the following fields:
* `lang`: The language object (from {{para|1}}).
* `args`: The parsed arguments from the {{tl|place}} call.
* `directives`: List of form-of directives (starting with `@`) parsed from the numeric args beginning with {{para|2}}.
Each directive contains fields `directive` (the directive as specified by the user, e.g. `"former name of"`);
`terms` (list of term objects for the terms specified by the user); `conj` (conjunction specified by the user using
inline modifier `<conj:...>`, or {nil}); `spec` (the corresponding directive spec from `all_form_of_directives`);
`pretext` (the text to display directly before the directive); `posttext` (the text to display directly after the
directive; {nil} except for the last directive).
* `descs`: List of one or more place description objects parsed from the numeric args beginning with {{para|2}}, as
described above.
* `extra_info`: List of extra-info objects for extra info specified using arguments such as {{para|capital}},
{{para|modern}}, etc. Objects are in the order they should be displayed, and each object contains fields `spec` (the
spec for the type of extra info, taken from `export.extra_info_args`), `terms` (list of term objects for the terms
specified by the user); and `conj` (conjunction specified by the user using inline modifier `<conj:...>`, or {nil}).
===Category determination===
The algorithm to find the categories to which a given place belongs works off of a place description (which specifies
the entry placetype(s) and holonym(s); see above). If there are multiple place descriptions, each is processed
independently to generate categories. Likewise, if there are multiple entry placetypes in a given place description,
each is processed independently with all the holonyms of the description to generate categories. Furthermore, before
the category-generation algorithm runs, earlier steps have modified the holonyms of the place description (inserting
containing polities whenever possible; see the description above of `handle_category_implications()` and
`augment_holonyms_with_container()`).
Given a single entry placetype and a place description, the algorithm to generate categories processes holonyms from
left to right until it finds one that "matches" in that it produces one or more categories. At that point it attempts
to generate categories for all other holonyms in the place description of the same placetype. Normally, it then stops
processing holonyms, but if a holonym is marked using the `:also` modifier, the category generation process starts over
starting with that holonym (or the leftmost such remaining holonym, if there is more than one marked with `:also`).
This makes it possible, for example, to specify the description of a river that passes through two different types of
political divisions (e.g. Alberta and the Northwest Territories), or categorize a geographic region at both the
continent and country level, such as this:
<pre>
{{place|en|historical region|r/Eastern Europe|located in southeastern|c:also/Poland|*and western|c/Ukraine}}
</pre>
Here, `r/Eastern Europe` has a category implication that adds `cont/Europe` as a holonym directly after it, which
causes the page to be categorized into [[:Category:en:Geographic and cultural areas of Europe]]. The category generation
process would normally stop at this point, but the presence of `:also` causes it to restart with `c/Poland` and
generate the category [[:Category:en:Geographic and cultural areas of Poland]]. After doing this, it looks for other
holonyms of the same placetype as `c/Poland` (i.e. other countries), which causes it to process `c/Ukraine` and generate
the category [[:Category:en:Geographic and cultural areas of Ukraine]].
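The left-to-right loop with `:also` restarts can be sketched as follows. This is a deliberately simplified model rather than the module's code: `collect_categories` and the `categorize` callback are hypothetical, the callback stands in for the category handlers and returns at most one category, and only holonyms to the right of the matching one are checked for the same placetype.

```lua
-- Hypothetical model of the holonym loop. `categorize(holonym)` returns a
-- category name or nil.
local function collect_categories(holonyms, categorize)
	local cats, done = {}, {}
	local start = 1
	while start do
		-- Find the first holonym, from `start` rightwards, that matches.
		local matched
		for i = start, #holonyms do
			if not done[i] and categorize(holonyms[i]) then
				matched = i
				break
			end
		end
		if not matched then
			break
		end
		-- Categorize it and all later holonyms of the same placetype, and
		-- remember the leftmost later holonym marked with `:also`.
		local restart
		for j = matched, #holonyms do
			local h = holonyms[j]
			if not done[j] and h.placetype == holonyms[matched].placetype then
				local cat = categorize(h)
				if cat then
					cats[#cats + 1] = cat
				end
				done[j] = true
			elseif not done[j] and h.continue_cat_loop and not restart then
				restart = j
			end
		end
		start = restart
	end
	return cats
end

-- Holonyms of the Eastern Europe example, after `cont/Europe` has been added.
local region_holonyms = {
	{ placetype = "region", unlinked_placename = "Eastern Europe" },
	{ placetype = "continent", unlinked_placename = "Europe" },
	{ unlinked_placename = "located in southeastern" },
	{ placetype = "country", unlinked_placename = "Poland", continue_cat_loop = true },
	{ unlinked_placename = "and western" },
	{ placetype = "country", unlinked_placename = "Ukraine" },
}
local function area_categorize(holonym)
	if holonym.placetype == "continent" or holonym.placetype == "country" then
		return "Geographic and cultural areas of " .. holonym.unlinked_placename
	end
end
for _, cat in ipairs(collect_categories(region_holonyms, area_categorize)) do
	print(cat)
end
```

Without `continue_cat_loop` on the Poland holonym, the same call would stop after the Europe category.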
The category generation process works off of the `placetype_data` table, which specifies various properties for
placetypes, such as how to display a holonym of that placetype as well as how to categorize certain pages where the
{{tl|place}} call contains the specified placetype as an entry placetype. For example, the entry for `city-state` in
[[Module:place/placetypes]] might look like
```
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "Cities", "Countries", "Countries in +++", "National capitals"},
default = {"City-states", "Cities", "Countries", "National capitals"},
},
```
Here, the keys specify, respectively:
# If `city-state` occurs as an entry placetype, link it to the corresponding Wiktionary entry (that is what `true` means
in `link = true`).
# Use the specified `category_link` text for categories such as [[:Category:City-states]].
# City-states are "city-like", i.e. they have neighborhoods; this controls the handling of entry placetypes such as
`neighborhood`, `district`, `area`, etc.
# City-states should be treated as settlements for determining how to handle the placetype `former city-state` and for
categorizing the bare category [[:Category:City-states]] and language-specific equivalents such as
[[:Category:en:City-states]].
# When the entry placetype `city-state` occurs along with a continent holonym, categorize into the specified categories
under `continent/*`. Here, `+++` stands for the holonym in question.
# When the entry placetype `city-state` occurs in any other context, categorize into the specified categories under
`default`.
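The interplay of a wildcard key such as `continent/*`, the `+++` placeholder and the `default` key can be sketched as below. The function `cats_from_placetype_data` and the abbreviated `city_state_data` table are hypothetical, and the sketch leaves out the category handlers that the module consults before wildcard keys.

```lua
-- Hypothetical, abbreviated placetype entry used only for this sketch.
local city_state_data = {
	["continent/*"] = { "City-states", "Countries in +++" },
	default = { "City-states" },
}

local function cats_from_placetype_data(data, holonyms)
	for _, holonym in ipairs(holonyms) do
		-- A key like "continent/*" matches any holonym of that placetype.
		local specs = holonym.placetype and data[holonym.placetype .. "/*"]
		if specs then
			local cats = {}
			for _, spec in ipairs(specs) do
				-- "+++" stands for the placename of the matching holonym. A
				-- function is used as the replacement so that "%" in a
				-- placename is not treated specially.
				local cat = spec:gsub("%+%+%+", function()
					return holonym.unlinked_placename
				end)
				cats[#cats + 1] = cat
			end
			return cats
		end
	end
	-- No holonym matched: fall back to the `default` category specs, if any.
	return data.default or {}
end

local with_continent = cats_from_placetype_data(city_state_data, {
	{ placetype = "continent", unlinked_placename = "Asia" },
})
print(table.concat(with_continent, "; "))
```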
It's important to realize that the only categorization keys under a given placetype entry that are specified
explicitly in [[Module:place/placetypes]] are certain wildcard keys such as `continent/*` above (i.e. containing a slash
followed by `*`) and under the key `default`. All the remaining categorization happens through category handlers, based
on the information on known locations in [[Module:place/locations]]. For example, [[Module:place/locations]] has an
"England group" specified similarly to the following:
```
export.england_group = {
default_container = {key = "England", placetype = "constituent country"},
default_placetype = "county",
default_divs = {
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
"civil parishes",
},
default_british_spelling = true,
data = export.england_counties,
}
```
The `default_divs` key here specifies the divisions that exist for each of the counties listed under the `data` key
(unless the key overrides them). Here, the entry `{type = "boroughs", cat_as = {"districts", "boroughs"}}` directs the
category handler `political_division_cat_handler` in [[Module:place/placetypes]] (which is one of two category handlers
that run for all entry placetypes, along with `generic_place_cat_handler`) to categorize boroughs specified under any of
the counties listed under `data` as both districts and boroughs.
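As a sketch of how a `default_divs` entry might be normalized before use, the hypothetical function `division_cat_types` below returns the division type together with the list of placetypes to categorize it as. It assumes that a bare string entry is categorized as itself and that `cat_as` may be either a string or a list, as in the example above; how the module actually normalizes these entries is not shown here.

```lua
-- Hypothetical normalization of one entry from `default_divs`.
local function division_cat_types(div)
	if type(div) == "string" then
		-- A bare string such as "districts" is categorized under its own name.
		return div, { div }
	end
	local cat_as = div.cat_as or div.type
	if type(cat_as) == "string" then
		cat_as = { cat_as }
	end
	return div.type, cat_as
end

local default_divs = {
	"districts",
	{ type = "local government districts", cat_as = "districts" },
	{ type = "boroughs", cat_as = { "districts", "boroughs" } },
	"civil parishes",
}
for _, div in ipairs(default_divs) do
	local div_type, cat_as = division_cat_types(div)
	print(div_type .. " -> " .. table.concat(cat_as, ", "))
end
```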
Now, the categorization process proceeds as follows, given an entry placetype and place description, which specifies a
set of holonyms (the code to do this is in `get_placetype_cats()`):
# First, look up the entry placetype and any equivalent placetypes in `placetype_data`, which is defined in
[[Module:place/placetypes]]. Note that the entry in `placetype_data` that specifies the placetype information that is
used to determine the category or categories may not directly correspond to the entry placetype as specified in the
place description. For example, if the entry placetype is `small town`, the placetype whose data is fetched will be
`town` since `small` is a recognized qualifier and there is no entry in `placetype_data` for `small town`. As another
example, if the entry placetype is `administrative capital`, the code will first look up `administrative capital` and
then look up `capital city`, which is where the category handler is found, because `administrative capital` specifies
`capital city` as its fallback.
# Then, iterate over holonyms from left to right, as described above. For each holonym, we proceed as follows:
## First, call `political_division_cat_handler` to check if the entry placetype and holonym match a division in the
`locations` data in [[Module:place/locations]], as in the example above. Note that when doing this, holonyms are
canonicalized so that e.g. `co/Bedfordshire` gets mapped to `county/Bedfordshire` (because there is an entry in
`placetype_aliases` in [[Module:place/placetypes]] that maps `co` to `county`) and `c/USA` gets mapped to
`country/United States` (because there is an entry in the location data for the list of countries that maps
`country/USA` to `country/United States` for both display and categorization purposes). This category handler, as
with all such handlers, is passed the entry placetype and holonym being processed, but is also passed the entire
place description, so it can look at other specified holonyms (particularly those that follow). It either returns
{nil} or a list of category specs (which are the actual categories minus the preceding language code).
## If `political_division_cat_handler` doesn't generate any categories, check if there is a category handler defined
using the `cat_handler` key for the entry placetype. If so, call it to generate the categories (if any).
## If the category handler returns {nil}, or there is no category handler, look for a ''wildcard key'' of the format
e.g. `country/*`, which matches any holonym of placetype `country`. If found, the value is a list of category specs,
which are processed as above.
## If we get this far without generating any categories, move to the next holonym.
## If we do generate any categories, process all other holonyms of the same placetype. For example, if the user says
{{tl|place|en|city|s/Kansas|and|s/Missouri}}, when we get to the holonym `s/Kansas`, we generate the category
[[:Category:en:Cities in Kansas, USA]]. This causes us to look for other holonyms of the same placetype `state`,
and process them accordingly, generating a category [[:Category:en:Cities in Missouri, USA]] as well. The same thing
happens in an invocation like {{tl|place|pl|river|c/Poland,Ukraine,Belarus}}.
# Once we generate categories for a holonym and any other holonyms of the same placetype, we normally stop processing
holonyms. But if a holonym has the `:also` modifier, we restart the left-to-right loop at that holonym. For example,
in the invocation {{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr/Northwest Territories}},
we will generate a category [[:Category:en:Rivers in Alberta, Canada]] as well as
[[:Category:en:Rivers in British Columbia, Canada]] (because British Columbia is of the same placetype as Alberta);
but no category will be generated for the Northwest Territories, which is of a different placetype. To fix this, write
{{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr:also/Northwest Territories}}. The use
of `:also` will cause holonym processing to resume at `Northwest Territories` after `Alberta` and `British Columbia` are processed, leading to
an additional category [[:Category:en:Rivers in the Northwest Territories, Canada]]. (The presence of `the` in this
last category is because `Northwest Territories` is a known location with a spec indicating that it should be preceded
by `the`; it has nothing to do with the raw text `and the` in the invocation.)
# Finally, if we process all holonyms and don't end up producing any categories, we check the entry placetype's data for
a `default` key. If found, it lists category specs, which are processed to generate categories. This is used, for
example, in the placetype `city-state`, as described above.
# It should be noted that the above process runs independently for each combination of entry placetype and place
description. Thus, for example, an invocation {{tl|place|en|city/and/county|s/Kansas,Missouri|c/USA}} will generate
categories for both cities and counties in both Kansas and Missouri.
# Two additional sources of categories are ''bare location'' categories and ''generic place'' categories. These
categories are added by appropriate calls in the outer function `get_cats`, which iterates over placetypes and place
descriptions, calling `get_placetype_cats` on each combination.
## Bare location categories are categories like [[:Category:Arizona, USA]] that are related-to categories containing
terms related to the specified location. The bare location code, for example, adds the term [[Arizona]], and its
equivalents in other languages, to [[:Category:Arizona, USA]]. When looking for terms to consider, it checks the
pagename, the glosses specified using {{para|t}}, and the terms specified using {{para|modern}}, {{para|short}} and
{{para|full}}. It looks to see if any of these parameters match any known locations, but only adds them to a bare
location category if (a) the specified entry placetype matches, so that for example Russian `[[Джорджия]]` goes into
[[:Category:Georgia, USA]] while `[[Грузия]]` goes into [[:Category:Georgia]] (the country), even though both have a
gloss `Georgia`; and (b) there are no conflicting holonyms, so that for example the Old English term [[Munucceaster]]
if defined similarly to {{tl|place|ang|city|in modern|cc/England|t=Newcastle}} won't get added to
[[:Category:Newcastle, New South Wales]] (even though it is also a city) because the latter city is known to be in
Australia, which conflicts with the country `United Kingdom` (added internally to the Old English place description
through the holonym augmentation process, based on the holonym `cc/England`).
## Generic place categories are categories like [[:Category:Places in Kansas, USA]] and [[:Category:Places in England]]
that contain places of arbitrary placetype. These are added through a special category handler that operates like
other category handlers but is run for all placetypes, rather than only for the specified one(s).
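The first step of the process above (finding which `placetype_data` entries to consult for a given entry placetype) can be sketched roughly as follows. Everything in this block is illustrative: the function `placetypes_to_try`, the miniature `placetype_data` and `qualifiers` tables and the key name `fallback` are assumptions made for the example, not the module's actual definitions.

```lua
-- Hypothetical miniature data; the real tables are in [[Module:place/placetypes]].
local placetype_data = {
	town = {},
	["capital city"] = {},
	["administrative capital"] = { fallback = "capital city" },
}
local qualifiers = { small = true }

local function placetypes_to_try(entry_placetype)
	local result = {}
	local placetype = entry_placetype
	-- Strip recognized leading qualifiers until a known placetype is found.
	while not placetype_data[placetype] do
		local qualifier, rest = placetype:match("^(%S+)%s+(.+)$")
		if not qualifier or not qualifiers[qualifier] then
			return result
		end
		placetype = rest
	end
	-- Then follow the chain of fallbacks, guarding against cycles.
	local seen = {}
	while placetype and placetype_data[placetype] and not seen[placetype] do
		seen[placetype] = true
		result[#result + 1] = placetype
		placetype = placetype_data[placetype].fallback
	end
	return result
end

print(table.concat(placetypes_to_try("small town"), ", "))
print(table.concat(placetypes_to_try("administrative capital"), ", "))
```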
]==]
--[=[
TODO/FIXME:
1. [DONE] Neighborhoods should categorize at the city level. Categories like [[:Category:Places in Los Angeles]] exist
but not [[:Category:Neighborhoods in Los Angeles]]; we can refactor the code in generic_cat_handler() to support this
use case.
2. Display handlers should be smarter. For example, 'co/Travis' as a holonym should display as 'Travis County' in the
United States, but (I think) display handlers don't currently have the full context of holonyms passed in to allow
this to happen.
3. Connected to this, we have various display handlers that add the name of the holonym after or (sometimes) before the
placename if it's not already there. An example is the county_display_handler() in [[Module:place/placetypes]], which
adds "County" before Ireland and Northern Ireland counties and after Taiwan and Romania counties. This should be
integrated into the polity group for these respective polities through a setting rather than requiring a separate
handler that has special casing for various polities.
4. Placetypes for toponyms should also have display handlers rather than just fixed text. This should allow us to
dispense with the need for special types for "fpref" = "French prefecture" (which displays as "prefecture" but links
to the appropriate Wikipedia article on French prefectures, which are completely different from the more general
concept of prefecture). Similarly for "Polish colony" and "Welsh community". ("Israeli settlement" should probably
stay as-is because it displays as "Israeli settlement" not just "settlement".)
5. [DONE] Currently, categories for e.g. states and territories of Australia go into
[[:Category:States and territories of Australia]] but terms for states and territories of Australia go into
(respectively) [[:Category:States of Australia]] and [[:Category:Territories of Australia]]. We should fix this;
maybe this is as easy as setting cat_as in the respective divs definitions.
6. Probably cat_as should support raw categories as well as category types; raw categories would be indicated by being
prefixed with "Category:".
7. [MOSTLY DONE] Update documentation.
8. [DONE] Rename remaining political division categories to include name of country in them.
9. [DONE] Add Pakistan provinces and territories.
10. [DONE] Add a polity group for continents and continent-level regions instead of special-casing. This should make it
possible e.g. to have Jerusalem as a city under "Asia".
11. [DONE] Add better handling of cities that are their own states, like Mexico City.
12. [DONE] Breadcrumb for e.g. [[Category:Aguascalientes, Mexico]] is "Aguascalientes, Mexico" instead of just
"Aguascalientes".
13. [DONE] Unify aliasing system; cities have a completely different mechanism (alias_of) vs. polities/subpolities
(which use `placename_cat_aliases` and `placename_display_aliases` in [[Module:place/placetypes]]).
14. [DONE] More generally, cities should be unified into the polity grouping system to the extent possible; this would
allow for divs of cities (see #17 below).
15. [DONE] We have `no_containing_polity_cat` set for Lebanon, Malta and Saudi Arabia to prevent country-level
implications from being added due to generically-named divisions like "North Governorate", "Central Region" and
"Eastern Province" but (a) this setting seems to do multiple things and should be split, (b) it should be possible
to set this at the division level instead of the country level.
16. Split out the data from the handlers so we can use loadData() on the data because it's becoming very big.
17. [DONE] Cities like Tokyo have special wards; "prefecture-level cities" like Wuhan (which aren't really cities but we
treat them as such) have districts, subdistricts, etc. We need to support divs for cities and even named divisions
of cities (such as we already have for boroughs of New York City).
18. [DONE] It should be allowed to set 'true' to any qualifier (which links it) and have it work correctly; qualifier lookup
in [[Module:place]] needs to remove links first.
19. [DONE] Categories 'Historical polities' and 'Historical political subdivisions' should be renamed 'Former ...' since
"historic(al)" is ambiguous (cf. "historic counties" in England which are not former, but still have a legal
definition).
20. [PARTLY DONE; SUPPORT IS THERE BUT FORMER PROVINCES NOT YET CATEGORIZED] It should be possible to categorize former
subpolities of certain polities; cf. [[:Category:ja:Provinces of Japan]], which contains former provinces.
21. [DONE] In subpolity_keydesc(), we need to generate the correct indefinite article and have a huge hack to check
specifically for "union territory", which is the only placetype that shows up in this function where the default
indefinite article generating function fails. To fix this properly, we need to separate out the non-category
placetype data from `cat_data` in [[Module:place/placetypes]] and move it to [[Module:place/locations]], because we
don't have access to the data in [[Module:place/placetypes]], and that data indicates the correct article for
placetypes like "union territory".
22. [DONE] Simplify the specs in `cat_data`, eliminating the distinction between "inner" and "outer" matching. There
should not be two levels, just one. For example, in "district", instead of
["country/Portugal"] = {
["itself"] = {"Districts and autonomous regions of +++"},
}
we should just have
["country/Portugal"] = {"Districts and autonomous regions of +++"},
And in "dependent territory", instead of
["default"] = {
["itself"] = {true},
["country"] = {true},
},
we should just have
["itself"] = {true},
["country/*"] = {true},
It appears the only remaining spec that can't be easily converted in this fashion is for "subdistrict":
["country/Indonesia"] = {
["municipality"] = {true},
},
This seems to be specifically for Jakarta and doesn't seem to work anyway, as the two entries in
[[:Category:en:Subdistricts of Jakarta]] and the one entry in [[:Category:id:Subdistricts of Jakarta]] are manually
categorized.
23. [DONE] Consolidate the remaining stuff in [[Module:category tree/topic cat/data/Earth]] into
[[Module:category tree/topic cat/data/Places]].
24. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that
are in different polities from the specified containing polity/polities of the city, but doesn't do the same for
larger-level divisions. Likewise for the `city_type_cat_handler`. There are some sufficiently generically-named
divisions that this issue can occur; for example, [[Koforidua]], the capital city of Eastern Region, Ghana, is
incorrectly categorized under [[:Category:en:Cities in Eastern Region, Malta]] and
[[:Category:en:Places in Eastern Region, Malta]]. Note that the function `augment_holonyms_with_container`
''DOES'' do such checks, so we should be able to refactor the code out of that function and use it elsewhere.
25. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that
are in different polities from the specified containing polity/polities of the city; but how smart is it? It will
successfully avoid categorizing a neighborhood in e.g. [[Columbus]], [[Georgia]] that doesn't explicitly mention the
US (only `s/Georgia`) into [[:Category:en:Places in Columbus]], which is for Columbus, Ohio, but will it do the same
for a hypothetical neighborhood of Columbus in say Merseyside, England? This should be investigated. It will
probably work for a hypothetical Columbus in [[Canada]] because `augment_holonyms_with_container` would
auto-add Canada as an additional holonym once say `p/Ontario` is mentioned, but I think there's a setting preventing
this augmentation from happening for the UK. (This relates to FIXME #15. `no_containing_polity_cat` is set on
England, Scotland, etc. to prevent the toponyms from being added to [[:Category:en:Places in the United Kingdom]],
but this same setting is used to prevent augmentation, which it should not be; there should be different settings.)
26. [DONE] The `generic_cat_handler` (or more specifically `find_holonym_keys_for_categorization`) checks for city
holonyms by looking specifically for holonym type `city`. But some cities (particularly those in China) can be
specified using different holonym types, e.g. `prefecture-level city`, `subprovincial city`, etc. We should allow
these when appropriate (which means the cities in China need to have a `placetype` set that indicates their
regional-level status as well as just `city`). I'm not sure if cities support specifying a custom `placetype` at the
moment; this relates to FIXME #14 above concerning unifying cities and political divisions internally.
27. [DONE] The bare category handler (`get_bare_categories` in [[Module:place/placetypes]]) is not smart enough to avoid
overcategorizing cities or other divisions that are of the right placetype but in the wrong containing polity. For
example, Asturian [[Llión]] "León (city in Spain)" gets put in [[:Category:ast:León]] even though the latter is
supposed to refer to a city in Mexico. We can borrow the check-containing-polity code from `generic_cat_handler`.
28. [DONE] Redo handling of singular and plural to respect overrides specified in placetype_data. Check more carefully
for things that may not singularize correctly, e.g. 'passes' -> 'passe'? Definitely 'headquarters' and variants.
29. [DONE] Combine placetype_equivs and other placetype data into `placetype_data`. Figure out if we need the
distinction between `placetype_equivs` and `fallback`.
30. `has_neighborhoods` may need to be a function that can look at the containing holonyms to determine whether the
entity in question is city-like.
31. [DONE] Bare placenames as they appear in holonyms (e.g. `Riau Islands`) instead of category keys (e.g.
`the Riau Islands, Indonesia`) should appear in the polity data tables. As a first pass, the word "the" should not
appear but should instead be a property of the polity.
32. [DONE] `capital_city_cat_handler` should use `get_holonyms_to_check()`.
33. [PARTLY DONE] The code to generate and parse the correct preposition ("in" or "of") is very convoluted, and the
actual preposition used is specified in various locations with various defaults, sometimes hardcoded. This should be
simplified. It is made more difficult by the fact that the in/of distinction occurs in several places:
(a) when generating the {{place}} text in old-style descriptions where the preposition isn't explicitly given, which
uses the `preposition` setting in placetype_data, defaulting to "in";
(b) when generating categories based on explicit category specs in placetype_data (which are gradually being
deprecated), which likewise uses the `preposition` setting in placetype_data, defaulting to "in";
(c) when generating categories based on political_division_cat_handler, originating in the `divs` placetypes for
specific known locations in [[Module:place/locations]], which uses the `prep` setting embedded in the `divs`
specifications, defaulting to "of";
(d) when generating categories based on category handlers specified using the `cat_handler` property of entries in
placetype_data, which tend to hardcode "in" or "of" depending on the specific category handler;
(e) when generating category descriptions in [[Module:category tree/topic/Places]] for `divs` categories generated
in (c), which (correctly) uses the same `prep` setting embedded in the `divs` settings that is used when
generating the categories themselves;
(f) when generating category descriptions for categories generated in (b) and (d) above, which relies on the
`generic_before_non_cities` and `generic_before_cities` settings in placetype_data, which need to match the
corresponding prepositions hardcoded in the category generation handlers. Instead of the hardcoding, the
category generation handler should respect the `generic_before_*` settings.
34. [[Krakow]] defined as {{place|en|A <<city>> on the [[Vistula]] River, the <<capital>> of the <<voi/Lesser Poland Voivodeship>> in southern <<c/Poland>>}}
categorizes under [[:Category:Voivodeship capitals]] when it should probably instead be under
[[:Category:Voivodeship capitals of Poland]]. Possibly this is because the various voivodeships haven't yet been
entered as known locations, but this should happen regardless of that.
35. {{tcl}} bugs:
a. [DONE] Lowercase initial letter in new-style {{place}} descriptions in {{tcl}}. Maybe we can have a setting
tcl_nolc=1 to prevent this from happening.
b. [DONE] tcl= and probably new-style {{place}} descriptions in general should recognize ;; to separate distinct {{place}}
descriptions, and similarly ;;and as the equivalent of regular `;and`, etc.
c. [DONE] The value supplied in `modern=` should be displayed in {{tcl}} descriptions regardless of the setting that
normally disables this, so that e.g. the foreign-language equivalent of [[British Honduras]] doesn't just say
it's a former British colony in Central America but specifically identifies it as modern Belize. If the user
gives place_modern= in {{tcl}}, that should override the modern= value and still be displayed.
d. [DONE] The page supplied to {{tcl}} should be used for generating bare categories even if t= is supplied and
overrides the English term displayed.
e. [DONE] If text follows {{place}} and begins with a semicolon, the semicolon isn't copied into {{tcl}}.
36. County boroughs used as holonyms currently display 'borough county borough' because there's an affix setting for
'county borough' and a fallback display handler for 'borough'. We need to rethink this; maybe merge the affix
setting and display handlers.
37. Implement known-location groups and specs in a more standardly object-oriented way using metatables.
38. Implement caching of known location lookup in the holonym. This may have to be keyed by placetype, but we can have a
special field for when the lookup placetype is the same as the user-specified placetype of the holonym. Use this
known location in place of looking up known locations and store the appropriate known location there in
`augment_holonyms_with_container()` instead of calling `key_to_placename`.
39. Bug fixes with 'the':
(a) [DONE] [[Kazaň]] defined as {{place|cs|caplc|rep:Pref/Tatarstan|c/Russia|t1=Kazan}} displays as
"Republic of the Tatarstan".
(b) [[Valday]] defined as {{place|en|town/administrative center|dist:Suf/Valdaysky|obl/Novgorod|c/Russia}}
displays as "a town, the administrative center of the Valdaysky District". Changing to `dist:suf/Valdaysky`
displays as "... of Valdaysky district".
40. [DONE] Bug fix with 'the': [[Verkhoyansk]] defined as {{place|en|town|rep/Sakha|c/Russia}} displays as "a town in
the Sakha".
41. [DONE] [[Category:Cities in Asia]] has [[Category:Cities in Eurasia]] as a parent, which in turn has
[[Category:Cities in the Earth]] as a parent. Continents should not have the second parent like this.
42. [DONE] When checking `british_spelling`, it should check all containers as well; otherwise it's too hard to keep
this in sync across cities, administrative divisions and countries.
43. [DONE] `skip_polity_parent_type` should be renamed to container_parent_type or similar.
44. There should be a flag to allow e.g. departments of France that are currently categorized as departments of their
region to also be categorized as departments of France.
45. [DONE] Aliases are causing iterate_matching_holonym_location() to fail, e.g. if [[براق]] "Prague" is specified as
{{place|acw|capital city|c/Czechia|t1=Prague}}, this fails to add a bare category [[Category:acw:Prague]] because
the code in iterate_matching_holonym_location() isn't resolving aliases when comparing the known container
'Czech Republic'. Probably we want to build an alias table to speed up these sorts of lookups.
46. [DONE; DUE TO TYPO IN HANDLER] The district cat handler is failing to work right, e.g. in [[Saint-Gaudérique]]
defined as {{place|fr|district|city/Perpignan|in|dept/Pyrénées-Orientales|r/Occitania|c/France|t=Saint-Gaudérique}},
only the 'Places in ...' categories are getting triggered.
47. Suburbs of a given city aren't generally in the city and may not even be in the same country or country division,
so they should not categorize as "Places in ..." based on the city and specified country and division. Same goes
for "enclave" (within somewhere) and "exclave".
48. When converting display aliases, we should automatically convert full placenames to full placenames and elliptical
placenames to elliptical placenames, instead of always producing one or the other depending on the value of
`display_as_full`.
49. `@obsolete form of` and `@archaic form of` should automatically trigger nocat=1.
50. The handler that adds bare categories should pick up values in <eq:...>.
]=]
--[==[ var:
List specifying the allowed form-of directives, used for former names, official names, abbreviations, etc. of places.
The key is the form-of directive and the value is an object with the following properties:
* `text`: The actual text displayed before the terms. If the value is `+`, the key is used as the text. If the value is
a function, it is passed a single argument, the overall place spec (see comment at top of file) and should return
the text to be displayed.
* `type_prefix`: The prefix used to generate the placetype for looking up the appropriate category or categories in the
placetype data structure. Can be omitted if there are no categories associated with the directive.
* `conjunction`: The conjunction used to join multiple terms, defaulting to `and`.
* `cat`: Additional category or categories to add the term to, whenever this particular directive is used. Normally the
value is a topic-style category minus the langcode prefix, but if prefixed with `cln:`, it is a langname-style
category. For example, the value `"Abbreviations"` would correspond to a category [[:Category:en:Abbreviations]]
(assuming the language of the {{tl|place}} call is English), while the value `"cln:abbreviations"` corresponds to a
category [[:Category:English abbreviations]]. Use a list of such specs for multiple categories.
* `default_foreign`: If specified, the default language of terms given along with this directive is the language in
{{para|1}}; otherwise it is English.
]==]
export.all_form_of_directives = {
["former name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"},
["fmr of"] = {alias_of = "former name of"},
["ancient name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"},
["official name of"] = {text = "+", type_prefix = "OFFICIAL_NAME_OF"},
["former official name of"] = {text = "+", type_prefix = "FORMER_OFFICIAL_NAME_OF"},
["long form of"] = {text = "+", type_prefix = "LONG_FORM_OF"},
["former long form of"] = {text = "+", type_prefix = "FORMER_LONG_FORM_OF"},
["nickname for"] = {text = "+", type_prefix = "NICKNAME_FOR"},
["official nickname for"] = {text = "+", type_prefix = "OFFICIAL_NICKNAME_FOR"},
["former nickname for"] = {text = "+", type_prefix = "FORMER_NICKNAME_FOR"},
["derogatory name for"] = {text = "[[Appendix:Glossary#derogatory|derogatory]] name for", type_prefix = "DEROGATORY_NAME_FOR"},
["synonym of"] = {text = "+"},
["syn of"] = {alias_of = "synonym of"},
["abbreviation of"] = {text = "[[Appendix:Glossary#abbreviation|abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:abbreviations",
default_foreign = true},
["abbr of"] = {alias_of = "abbreviation of"},
["abbrev of"] = {alias_of = "abbreviation of"},
["initialism of"] = {text = "[[Appendix:Glossary#initialism|initialism]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:initialisms",
default_foreign = true},
["init of"] = {alias_of = "initialism of"},
["acronym of"] = {text = "[[Appendix:Glossary#acronym|acronym]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:acronyms",
default_foreign = true},
["syllabic abbreviation of"] = {text = "[[Appendix:Glossary#syllabic abbreviation|syllabic abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:syllabic abbreviations",
default_foreign = true},
["sylabbr of"] = {alias_of = "syllabic abbreviation of"},
["sylabbrev of"] = {alias_of = "syllabic abbreviation of"},
["ellipsis of"] = {text = "[[Appendix:Glossary#ellipsis|ellipsis]] of", type_prefix = "ELLIPSIS_OF", cat = "cln:ellipses",
default_foreign = true},
["ellip of"] = {alias_of = "ellipsis of"},
["clipping of"] = {text = "[[Appendix:Glossary#clipping|clipping]] of", type_prefix = "CLIPPING_OF", cat = "cln:clippings",
default_foreign = true},
["clip of"] = {alias_of = "clipping of"},
["alternative form of"] = {text = "+", default_foreign = true},
["alt form"] = {alias_of = "alternative form of"},
["alternative spelling of"] = {text = "+", default_foreign = true},
["alt spell"] = {alias_of = "alternative spelling of"},
["alt sp"] = {alias_of = "alternative spelling of"},
["dated form of"] = {text = "[[Appendix:Glossary#dated|dated]] form of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms",
default_foreign = true},
["dated form"] = {alias_of = "dated form of"},
["dated spelling of"] = {text = "[[Appendix:Glossary#dated|dated]] spelling of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms",
default_foreign = true},
["dated spell"] = {alias_of = "dated spelling of"},
["dated sp"] = {alias_of = "dated spelling of"},
["archaic form of"] = {text = "[[Appendix:Glossary#archaic|archaic]] form of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms",
default_foreign = true},
["arch form"] = {alias_of = "archaic form of"},
["archaic spelling of"] = {text = "[[Appendix:Glossary#archaic|archaic]] spelling of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms",
default_foreign = true},
["arch spell"] = {alias_of = "archaic spelling of"},
["arch sp"] = {alias_of = "archaic spelling of"},
["obsolete form of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] form of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms",
default_foreign = true},
["obs form"] = {alias_of = "obsolete form of"},
["obsolete spelling of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] spelling of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms",
default_foreign = true},
["obs spell"] = {alias_of = "obsolete spelling of"},
["obs sp"] = {alias_of = "obsolete spelling of"},
}
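--[=[
The alias entries above (those with `alias_of`) point at canonical entries that carry the actual properties. The
following is a minimal, self-contained sketch of that lookup, not the module's own code: the table `directives` is a
cut-down copy of the real one, and `resolve_directive` is a hypothetical helper introduced only for illustration.

```lua
-- Hypothetical miniature of the directive table: two canonical entries and two aliases.
local directives = {
	["former name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"},
	["fmr of"] = {alias_of = "former name of"},
	["synonym of"] = {text = "+"},
	["syn of"] = {alias_of = "synonym of"},
}

-- Return the canonical directive name, its spec and its display text, following at most one alias hop.
-- (This sketch ignores the case where `text` is a function.)
local function resolve_directive(name)
	local spec = directives[name]
	if not spec then
		return nil
	end
	if spec.alias_of then
		name = spec.alias_of
		spec = directives[name]
		-- An alias must point at an existing entry that is not itself an alias.
		assert(spec and not spec.alias_of, "bad alias target: " .. name)
	end
	-- A text of "+" means that the directive name itself is displayed.
	local text = spec.text == "+" and name or spec.text
	return name, spec, text
end
```

With this sketch, `resolve_directive("fmr of")` yields the canonical name "former name of" together with the spec
stored under that key.
]=]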
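-- Return the text displayed for the `seat` extra-info argument, chosen according to the first placetype of the first
-- place description (e.g. "county seat" for a county), falling back to plain "seat".
local function get_seat_text(overall_place_spec)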
local placetype = overall_place_spec.descs[1].placetypes[1]
if placetype == "county" or placetype == "counties" then
return "county seat"
elseif placetype == "parish" or placetype == "parishes" then
return "parish seat"
elseif placetype == "borough" or placetype == "boroughs" then
return "borough seat"
else
return "seat"
end
end
--[==[ var:
List specifying the allowed arguments containing extra information that is sometimes added to a definition, such as the
capital, largest city, modern name, official name, etc., along with associated properties; displayed in the order given.
Each element is an object with the following properties:
* `arg`: The argument name.
* `text`: The actual text displayed before the terms. If the value is `+`, the argument name is used as the text. If the
value is a function, it is passed a single argument, the overall place spec (see the comment at the top of the file)
and should return the text to be displayed.
* `conjunction`: The conjunction used to join multiple terms, defaulting to `and`.
* `display_even_when_dropped`: Display this piece of extra info even when it would normally be dropped (e.g. in
{{tl|tcl}} when the language is other than English).
* `match_sentence_style`: If true, the text will be capitalized and preceded by a period when ''sentence style'' is
in effect (essentially, when the language is English and there is no translation specified using {{para|t}} or
similar parameter); otherwise, the text will be displayed as-is and preceded by a semicolon. If false, the semicolon
style will always be used.
* `auto_plural`: If true, pluralize the text when there is more than one term.
* `with_colon`: If true, follow the text with a colon. (This colon cannot easily be included in the text itself because
if pluralized, the pluralized text goes before the colon.)
]==]
export.extra_info_args = {
{arg = "modern", text = "+", conjunction = "หรือ", display_even_when_dropped = true},
{arg = "now", text = "now,", conjunction = "หรือ", display_even_when_dropped = true},
{arg = "full", text = "in full,", conjunction = "หรือ", display_even_when_dropped = true},
{arg = "short", text = "short form", conjunction = "หรือ"},
{arg = "abbr", text = "abbreviation", conjunction = "หรือ"},
{arg = "former", text = "formerly,"},
{arg = "official", text = "ชื่อทางการ", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "capital", text = "เมืองหลวง", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "largest city", text = "นครใหญ่สุด", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "caplc", text = "เมืองหลวงและนครใหญ่สุด", match_sentence_style = true, auto_plural = false,
with_colon = true},
{arg = "seat", text = get_seat_text, match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "shire town", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "headquarters", text = "+", match_sentence_style = true, auto_plural = false, with_colon = true},
{arg = "center", text = "administrative center", match_sentence_style = true, auto_plural = false, with_colon = true},
{arg = "centre", text = "administrative centre", match_sentence_style = true, auto_plural = false, with_colon = true},
}
export.extra_info_arg_map = {}
for _, spec in ipairs(export.extra_info_args) do
export.extra_info_arg_map[spec.arg] = spec
end
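--[=[
The list above is kept in display order, while the map built from it allows lookup by argument name. A minimal sketch
of how the `text`, `auto_plural` and `with_colon` properties could combine into a label follows. The entries are
hypothetical English stand-ins, `label_for` is an invented helper, and the pluralization is deliberately naive; the
module's actual formatting code is not shown in this section.

```lua
-- Hypothetical two-entry list in display order.
local extra_info_args = {
	{arg = "modern", text = "+"},
	{arg = "capital", text = "capital", auto_plural = true, with_colon = true},
}

-- Index the same spec objects by argument name.
local extra_info_arg_map = {}
for _, spec in ipairs(extra_info_args) do
	extra_info_arg_map[spec.arg] = spec
end

-- Build the label shown before the terms of the given argument.
local function label_for(argname, num_terms)
	local spec = extra_info_arg_map[argname]
	-- A text of "+" means that the argument name itself is used.
	local text = spec.text == "+" and spec.arg or spec.text
	if spec.auto_plural and num_terms > 1 then
		text = text .. "s" -- naive pluralization, enough for this sketch
	end
	if spec.with_colon then
		-- The colon is appended after pluralizing, which is why it is not part of `text`.
		text = text .. ":"
	end
	return text
end
```
]=]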
----------- Wikicode utility functions
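-- Return a link to `text` in the language with code `langcode`, similar to {{l|langcode|text}}; `id` is an optional
-- sense ID. If `langcode` is nil, return `text` unchanged.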
local function link(text, langcode, id)
if not langcode then
return text
end
return m_links.full_link(
{term = text, lang = require(languages_module).getByCode(langcode, true, "allow etym"), id = id},
nil, "allow self link"
)
end
---------- Basic utility functions
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
-- Uppercase the first letter of each space-separated word in `text`.
local function ucfirst_all(text)
if text:find(" ") then
local parts = split(text, " ", true)
for i, part in ipairs(parts) do
parts[i] = m_strutils.ucfirst(part)
end
return concat(parts, " ")
else
return m_strutils.ucfirst(text)
end
end
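--[=[
For reference, the word-by-word capitalization can be sketched without any module dependencies. This version is
ASCII-only and is an assumption-laden stand-in: the names `ucfirst_ascii` and `ucfirst_all_ascii` are hypothetical,
and the real function relies on the Unicode-aware `ucfirst` from the string utilities module.

```lua
-- Uppercase the first character of `word` if it is an ASCII lowercase letter.
local function ucfirst_ascii(word)
	-- The parentheses drop the substitution count returned by gsub.
	return (word:gsub("^%l", string.upper))
end

-- Apply ucfirst_ascii to every run of non-space characters, leaving the spaces untouched.
local function ucfirst_all_ascii(text)
	return (text:gsub("[^ ]+", ucfirst_ascii))
end
```
]=]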
-- Lowercase `text` according to the rules of the wiki's content language.
local function lc(text)
return mw.getContentLanguage():lc(text)
end
---------- Argument parsing functions and utilities
-- Split an argument on comma, but not comma followed by whitespace.
local function split_on_comma(val)
if val:find(",") then
return require(parse_interface_module).split_on_comma(val)
else
return {val}
end
end
-- Split an argument on slash, but not slash occurring inside of HTML tags like </span> or <br />.
local function split_on_slash(arg)
if arg:find("<") then
local m_parse_utilities = require(parse_utilities_module)
-- We implement this by parsing balanced segment runs involving <...>, and splitting on slash in the remainder.
-- The result is a list of lists, so we have to rejoin the inner lists by concatenating.
local segments = m_parse_utilities.parse_balanced_segment_run(arg, "<", ">")
local slash_separated_groups = m_parse_utilities.split_alternating_runs(segments, "/")
for i, group in ipairs(slash_separated_groups) do
slash_separated_groups[i] = concat(group)
end
return slash_separated_groups
else
return split(arg, "/", true)
end
end
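--[=[
The idea of splitting on slashes only outside of `<...>` can be illustrated with a simple depth counter. This is a
simplified sketch under the assumption that angle brackets are balanced; `split_on_slash_outside_tags` is a
hypothetical name, and the real function delegates to the parse utilities module instead.

```lua
-- Split `arg` on "/" characters that are not enclosed in <...>.
local function split_on_slash_outside_tags(arg)
	local parts, current, depth = {}, {}, 0
	-- Iterate byte by byte; multibyte UTF-8 characters are reassembled unchanged by table.concat.
	for ch in arg:gmatch(".") do
		if ch == "<" then
			depth = depth + 1
		elseif ch == ">" and depth > 0 then
			depth = depth - 1
		end
		if ch == "/" and depth == 0 then
			-- Top-level slash: finish the current part and start a new one.
			parts[#parts + 1] = table.concat(current)
			current = {}
		else
			current[#current + 1] = ch
		end
	end
	parts[#parts + 1] = table.concat(current)
	return parts
end
```

For an input such as "city/A<br />B" the slash inside the tag is kept, so the result has two parts rather than three.
]=]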
-- Implement "implications", i.e. where the presence of a given holonym causes additional holonym(s) to be added.
-- Implications apply only to categorization. There used to be support for "general implications" that applied to both
-- display and categorization, but there ended up not being any such implications, so we've removed the support. It is
-- a bad idea in any case to have such implications; the user might purposely leave out a higher-level polity to avoid
-- redundancy in several successive definitions, and we wouldn't want to override that. Note that in practice the
-- mechanism implemented by this function is used specifically for non-administrative geographic regions such as
-- Eastern Europe and the West Bank; there is a similar mechanism for administrative regions handled by
-- `augment_holonyms_with_containing_polity` in [[Module:place/placetypes]].
--
-- `place_descriptions` is a list of place descriptions (see top of file, collectively describing the data passed to
-- {{place}}). `implication_data` is the data used to implement the implications, i.e. a table indexed by holonym
-- placetype, each value of which is a table indexed by holonym placename, each value of which is a list of
-- "PLACETYPE/PLACENAME" holonyms to be added to the list of holonyms, directly after the holonym that triggers them.
local function handle_category_implications(place_descriptions, implication_data)
for i, desc in ipairs(place_descriptions) do
if desc.holonyms then
local new_holonyms = {}
for _, holonym in ipairs(desc.holonyms) do
insert(new_holonyms, holonym)
local imp_data = m_placetypes.get_equiv_placetype_prop(holonym.placetype, function(pt)
local implication = implication_data[pt] and implication_data[pt][holonym.unlinked_placename]
if implication then
return implication
end
end)
if imp_data then
for _, holonym_to_add in ipairs(imp_data) do
local split_holonym = split_on_slash(holonym_to_add)
if #split_holonym ~= 2 then
internal_error("Invalid holonym in implications: %s", holonym_to_add)
end
local holonym_placetype, holonym_placename = unpack(split_holonym, 1, 2)
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to set
-- display_placename.
placetype = holonym_placetype, unlinked_placename = holonym_placename
}
insert(new_holonyms, new_holonym)
m_placetypes.key_holonym_into_place_desc(desc, new_holonym)
end
end
end
desc.holonyms = new_holonyms
end
end
end
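--[=[
The two-level shape of `implication_data` (placetype, then placename, then a list of "PLACETYPE/PLACENAME" strings)
can be sketched as follows. The data entry and the helper `add_implied_holonyms` are hypothetical and exist only for
illustration; the sketch also omits the equivalent-placetype lookup and the keying of new holonyms into the place
description that the real function performs.

```lua
-- Hypothetical implication data: a region that implies a continent.
local implication_data = {
	["region"] = {
		["Eastern Europe"] = {"continent/Europe"},
	},
}

-- Return a new holonym list in which each implied holonym follows the holonym that triggered it.
local function add_implied_holonyms(holonyms)
	local result = {}
	for _, holonym in ipairs(holonyms) do
		result[#result + 1] = holonym
		local by_type = implication_data[holonym.placetype]
		local implied = by_type and by_type[holonym.unlinked_placename]
		if implied then
			for _, to_add in ipairs(implied) do
				-- Split at the first slash into placetype and placename.
				local placetype, placename = to_add:match("^([^/]+)/(.+)$")
				assert(placetype, "invalid holonym in implications: " .. to_add)
				result[#result + 1] = {placetype = placetype, unlinked_placename = placename}
			end
		end
	end
	return result
end
```
]=]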
-- Split a holonym (e.g. "continent/Europe" or "country/en:Italy" or "in southern" or "r:suf/O'Higgins" or
-- "c/Austria,Germany,Czech Republic") into its components. Return a list of holonym objects (see top of file). Note
-- that if there isn't a slash in the holonym (e.g. "in southern"), the `placetype` field of the holonym will be nil.
-- Placetype aliases (e.g. "r" for "region") and placename aliases (e.g. "US" or "USA" for "United States") will be
-- expanded.
local function split_holonym(raw)
local no_display, combined_holonym = raw:match("^(!)(.*)$")
no_display = not not no_display
combined_holonym = combined_holonym or raw
local suppress_comma, combined_holonym_without_comma = combined_holonym:match("^(%*)(.*)$")
suppress_comma = not not suppress_comma
combined_holonym = combined_holonym_without_comma or combined_holonym
local holonym_parts = split_on_slash(combined_holonym)
if #holonym_parts == 1 then
-- `unlinked_placename` should not be used.
return {{display_placename = combined_holonym, no_display = no_display, suppress_comma = suppress_comma}}
end
-- Rejoin further slashes in case of slash in holonym placename, e.g. Admaston/Bromley.
local placetype = holonym_parts[1]
local placename = concat(holonym_parts, "/", 2)
-- Check for modifiers after the holonym placetype.
local split_holonym_placetype = split(placetype, ":", true)
placetype = split_holonym_placetype[1]
local affix_type
local saw_also
local saw_the
for i = 2, #split_holonym_placetype do
local modifier = split_holonym_placetype[i]
if modifier == "also" then
if saw_also then
error(("Modifier ':also' occurs twice in holonym '%s'"):format(combined_holonym))
end
saw_also = true
elseif modifier == "the" then
if saw_the then
error(("Modifier ':the' occurs twice in holonym '%s'"):format(combined_holonym))
end
saw_the = true
elseif modifier == "pref" or modifier == "Pref" or modifier == "suf" or modifier == "Suf" or
modifier == "noaff" then
if affix_type then
error(("Affix-type modifier ':%s' occurs twice in holonym '%s'"):format(modifier, combined_holonym))
end
affix_type = modifier
else
error(("Unrecognized holonym placetype modifier '%s', should be one of " ..
"'pref', 'Pref', 'suf', 'Suf', 'noaff', 'also' or 'the'"):format(modifier))
end
end
placetype = m_placetypes.resolve_placetype_aliases(placetype)
local holonyms = split_on_comma(placename)
local pluralize_affix = #holonyms > 1
local affix_holonym_index = (affix_type == "pref" or affix_type == "Pref") and 1 or affix_type == "noaff" and 0 or
#holonyms
for i, placename in ipairs(holonyms) do
-- Check for langcode before the holonym placename, but don't get tripped up by Wikipedia links, which begin
-- "[[w:...]]" or "[[wikipedia:]]".
local langcode, placename_without_langcode = rmatch(placename, "^([^%[%]]-):(.*)$")
if langcode then
placename = placename_without_langcode
end
placename = m_placetypes.resolve_placename_display_aliases(placetype, placename)
holonyms[i] = {
placetype = placetype,
display_placename = placename,
unlinked_placename = m_placetypes.remove_links_and_html(placename),
langcode = langcode,
affix_type = i == affix_holonym_index and affix_type or nil,
pluralize_affix = i == affix_holonym_index and pluralize_affix,
suppress_affix = i ~= affix_holonym_index,
no_display = no_display,
suppress_comma = suppress_comma,
continue_cat_loop = saw_also,
force_the = i == 1 and saw_the,
}
end
return holonyms
end
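--[=[
The surface syntax handled above is: an optional leading "!" (do not display), an optional "*" (suppress the comma),
then "PLACETYPE[:MODIFIER...]/PLACENAME". The following self-contained sketch parses just that outer syntax. The name
`parse_holonym_prefix` is hypothetical, and the sketch deliberately skips alias resolution, comma-separated placenames,
language prefixes, modifier validation and the HTML-aware slash splitting done by the real function.

```lua
-- Parse the leading flags, placetype, modifiers and placename of a raw holonym string.
local function parse_holonym_prefix(raw)
	local no_display = raw:sub(1, 1) == "!"
	if no_display then
		raw = raw:sub(2)
	end
	local suppress_comma = raw:sub(1, 1) == "*"
	if suppress_comma then
		raw = raw:sub(2)
	end
	-- Split at the first slash only, so that later slashes stay in the placename.
	local placetype_part, placename = raw:match("^([^/]*)/(.*)$")
	if not placetype_part then
		-- No slash: raw text such as "in southern", which has no placetype.
		return {display_placename = raw, no_display = no_display, suppress_comma = suppress_comma}
	end
	-- The first colon-separated piece is the placetype; the rest are modifiers.
	local placetype
	local modifiers = {}
	for piece in (placetype_part .. ":"):gmatch("([^:]*):") do
		if not placetype then
			placetype = piece
		else
			modifiers[#modifiers + 1] = piece
		end
	end
	return {
		placetype = placetype, placename = placename, modifiers = modifiers,
		no_display = no_display, suppress_comma = suppress_comma,
	}
end
```
]=]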
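-- Return the inline modifier specs allowed on terms: the `link`, `q`, `l` and `ref` modifier groups, the `eq` modifier
-- and the overall `conj` modifier. The result is memoized.
local get_param_mods = memoize(function()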
local m_param_utils = require(parameter_utilities_module)
return m_param_utils.construct_param_mods {
{group = {"link", "q", "l", "ref"}},
{param = "eq"},
-- FIXME: Finish [[Module:format utilities]].
--{param = "conj", set = require(format_utilities_module).allowed_conjs_for_join_segments, overall = true},
{param = "conj", set = {["and"] = true, ["or"] = true, ["and/or"] = true, ["และ"] = true, ["หรือ"] = true, ["และ/หรือ"] = true}, overall = true},
}
end)
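-- Parse `term`, a comma-separated list of terms with optional inline modifiers and language prefixes, taken from the
-- parameter named `paramname`. Terms without an explicit language prefix are assigned `default_lang`.
local function parse_term_with_inline_modifiers(term, paramname, default_lang)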
-- FIXME: Finish changes to [[Module:parameter utilities]] and [[Module:parse utilities]] that support continuations
-- and new-format generate_obj().
--local function generate_obj(data)
-- local m_param_utils = require(parameter_utilities_module)
-- data.parse_lang_prefix = true
-- data.special_continuations = m_param_utils.default_special_continuations
-- data.default_lang = default_lang
-- return m_param_utils.generate_obj_maybe_parsing_lang_prefix(data)
--end
local function generate_obj(raw_term, parse_err)
local obj = require(parameter_utilities_module).generate_obj_maybe_parsing_lang_prefix {
term = raw_term,
parse_err = parse_err,
parse_lang_prefix = true,
}
obj.lang = obj.lang or default_lang
return obj
end
return require(parse_interface_module).parse_inline_modifiers(term, {
paramname = paramname,
param_mods = get_param_mods(),
generate_obj = generate_obj,
-- FIXME: See above.
--generate_obj_new_format = true,
splitchar = ",",
outer_container = {},
})
end
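-- Parse a form-of directive argument of the form `@DIRECTIVE:TERMS`, resolving directive aliases. `lang` is the language
-- of the {{place}} call; `form_of_overridden_args`, if given, maps canonical directives to replacement values and
-- directives. Return a structure with fields `directive` (the canonical directive), `terms`, `conj` and `spec`.
local function parse_form_of_directive(arg, lang, form_of_overridden_args)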
local form_of_directive, raw_terms = arg:match("^@([a-z -]+):(.*)$")
if not form_of_directive then
error("Misformatted @-directive: " .. dump(arg))
end
if not export.all_form_of_directives[form_of_directive] then
local known_directives = {}
for k, _ in pairs(export.all_form_of_directives) do
insert(known_directives, '"' .. k .. '"')
end
table.sort(known_directives)
error(("Unrecognized form-of directive %s in @-directive %s; recognized directives are %s"):format(
dump(form_of_directive), dump(arg), concat(known_directives, ", ")))
end
local spec = export.all_form_of_directives[form_of_directive]
local canonical_directive = form_of_directive
if spec.alias_of then
canonical_directive = spec.alias_of
spec = export.all_form_of_directives[canonical_directive]
if not spec then
internal_error("Form-of directive alias %s points to %s, which is not a directive",
"@" .. form_of_directive, canonical_directive)
elseif spec.alias_of then
internal_error("Form-of directive alias %s points to %s, which is also an alias",
"@" .. form_of_directive, canonical_directive)
end
end
local default_foreign = spec.default_foreign
local directive_param = "@" .. form_of_directive
if form_of_overridden_args and form_of_overridden_args[canonical_directive] then
raw_terms = form_of_overridden_args[canonical_directive].new_value
local new_directive = form_of_overridden_args[canonical_directive].new_directive
local new_spec = export.all_form_of_directives[new_directive]
if not new_spec then
error(("Internal error: [[Module:transclude]] passed in unrecognized replacement directive '@%s'"):
format(new_directive))
end
if new_spec.alias_of then
error(("Internal error: [[Module:transclude]] passed in replacement directive alias '@%s', " ..
"should be canonical"):format(new_directive))
end
if new_directive ~= canonical_directive then
directive_param = directive_param .. (" (replaced with @%s)"):format(new_directive)
canonical_directive = new_directive
spec = new_spec
end
default_foreign = true
end
local terms = parse_term_with_inline_modifiers(raw_terms, directive_param,
default_foreign and lang or enlang)
return {
directive = canonical_directive,
terms = terms.terms,
conj = terms.conj,
spec = spec,
}
end
-- Parse an argument containing extra information that is sometimes added to a definition, such as the capital, largest
-- city, modern name, official name, etc. `args` is the value from the parsed argument structure and can be either nil,
-- a string or a list (depending on whether it was declared as a single parameter or a list). `spec` is the extra info
-- spec corresponding to the type of extra info. Each value in `args` can be a comma-separated list of terms with inline
-- modifiers attached. [FIXME: we should switch to always using the comma-separated format and disallow list parameters
-- such as |capital=, |capital2=, etc.] The return value is a structure containing fields `terms` (a list of term
-- objects, each of which is in the format expected by full_link() in [[Module:links]]), `conj` (an explicit
-- conjunction to join multiple terms, or nil if no explicit conjunction was given) and `spec` (the passed-in spec).
local function parse_extra_info_arg(args, spec, default_lang)
if not args then
return nil
end
if type(args) ~= "table" then
args = {args}
end
if not args[1] then
return nil
end
local terms = nil
local conj
for i, arg in ipairs(args) do
local this_terms = parse_term_with_inline_modifiers(arg, spec.arg .. (i == 1 and "" or i), default_lang)
local thisconj = this_terms.conj
if not conj then
conj = thisconj
elseif thisconj and conj ~= thisconj then
error(("Two different conjunctions '%s' and '%s' specified for |%s=; you only need to specify the " ..
"conjunction once"):format(conj, thisconj, spec.arg))
end
if not terms then
terms = this_terms.terms
else
m_table.extend(terms, this_terms.terms)
end
end
return {
spec = spec,
terms = terms,
conj = conj,
}
end
--[==[
Parse a "new-style" place description, with placetypes and holonyms surrounded by `<<...>>` amid otherwise raw text.
Return value is a place description object as documented at the top of the file. Exported for use by
[[Module:demonyms]].
]==]
function export.parse_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args)
local placetypes = {}
local segments = split(text, "<<(.-)>>")
local retval = {holonyms = {}, order = {}}
local form_of_directives_already_present = form_of_directives and not not form_of_directives[1]
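-- Illustration (a sketch, assuming `split` returns the captured `<<...>>` contents interleaved with the raw text,
-- which is what the odd/even test below relies on): for the text "a <<city>> in <<s/Arizona>>", the odd-numbered
-- segments are the raw strings "a ", " in " and "", and the even-numbered segments are "city" (treated as a
-- placetype, since it contains neither "@" nor "/") and "s/Arizona" (treated as a holonym, since it contains "/").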
for i, segment in ipairs(segments) do
if i % 2 == 1 then
insert(retval.order, {type = "raw", value = segment})
elseif segment:find("@") then
if not form_of_directives then
error(("Form-of directive '%s' not allowed in this context"):format(segment))
elseif form_of_directives_already_present then
error(("Saw form-of directive '%s' in new-style place desc followed by direct (separate-parameter) form-of directives; not allowed"):format(
segment))
elseif placetypes[1] or retval.holonyms[1] then
error(("Form-of directive '%s' must come first, before placetypes and holonyms"):format(segment))
else
local form_of_directive = parse_form_of_directive(segment, lang, form_of_overridden_args)
if not retval.order[1] or retval.order[1].type ~= "raw" or retval.order[2] then
internal_error("`retval.order` should have a single raw element: %s", retval.order)
end
form_of_directive.pretext = retval.order[1].value
retval.order[1] = nil
insert(form_of_directives, form_of_directive)
end
elseif segment:find("/") then
local holonyms = split_holonym(segment)
for j, holonym in ipairs(holonyms) do
if j > 1 then
if not holonym.no_display then
if j == #holonyms then
insert(retval.order, {type = "raw", value = " and "})
else
insert(retval.order, {type = "raw", value = ", "})
end
end
-- All but the first in a multi-holonym need an article. For the first one, the article is
-- specified in the raw text if needed. (Currently, needs_article is only used when displaying the
-- holonym, so it wouldn't matter when no_display is set, but we set it anyway in case we need it
-- for something else.)
holonym.needs_article = true
end
insert(retval.holonyms, holonym)
if not holonym.no_display then
insert(retval.order, {type = "holonym", value = #retval.holonyms})
end
m_placetypes.key_holonym_into_place_desc(retval, holonym)
end
else
local treat_as, display = segment:match("^(..-):(.+)$")
if treat_as then
segment = treat_as
else
display = segment
end
-- see if the placetype segment is just qualifiers
local only_qualifiers = true
local split_segments = split(segment, " ", true)
for _, split_segment in ipairs(split_segments) do
if m_placetypes.placetype_qualifiers[split_segment] == nil then
only_qualifiers = false
break
end
end
insert(placetypes, {placetype = segment, only_qualifiers = only_qualifiers})
if only_qualifiers then
insert(retval.order, {type = "qualifier", value = display})
else
insert(retval.order, {type = "placetype", value = display})
end
end
end
if not form_of_directives_already_present and form_of_directives and form_of_directives[1] then
form_of_directives[#form_of_directives].posttext = ""
end
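-- Illustration of the merge below (assuming "small" is listed in `placetype_qualifiers`): the segments <<small>>
-- and <<city>> are collected as {placetype = "small", only_qualifiers = true} and
-- {placetype = "city", only_qualifiers = false}, and the loop folds them into the single entry "small city".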
local final_placetypes = {}
for i, placetype in ipairs(placetypes) do
if i > 1 and placetypes[i - 1].only_qualifiers then
final_placetypes[#final_placetypes] = final_placetypes[#final_placetypes] .. " " .. placetypes[i].placetype
else
insert(final_placetypes, placetypes[i].placetype)
end
end
retval.placetypes = final_placetypes
return retval
end
--[==[
Parse one or more "new-style" place descriptions, with placetypes and holonyms surrounded by `<<...>>` amid otherwise
raw text. Multiple descriptions are separated by two semicolons in a row. Return value is a list of place description
objects as documented at the top of the file.
]==]
local function parse_conjoined_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args)
local separate_specs = split(text, ";(;[^ ]*)")
local descs = {}
for i = 1, #separate_specs do
if i % 2 == 1 then
insert(descs, export.parse_new_style_place_desc(separate_specs[i], lang, form_of_directives,
form_of_overridden_args))
form_of_directives = nil
else
descs[#descs].separator = separate_specs[i]
end
end
return descs
end
--[=[
Process numeric and "extra info" arguments into an overall place spec, as described at the top of the file. `data` is an
object with the following fields:
* `args`: The parsed arguments of {{tl|place}}.
* `from_tcl`: True if we're being invoked from {{tl|tcl}}.
* `extra_info_overridden_set`, `form_of_overridden_args`: Same as the corresponding fields in the `data` object passed
to `export.format`.
]=]
local function parse_overall_place_spec(data)
local args, from_tcl, extra_info_overridden_set, form_of_overridden_args =
data.args, data.from_tcl, data.extra_info_overridden_set, data.form_of_overridden_args
local descs = {}
local this_desc
-- Index of separate (semicolon-separated) place descriptions within `descs`.
local desc_index = 1
-- Index of separate holonyms within a place description. 0 means we've seen no holonyms and have yet to process
-- the placetypes that precede the holonyms. 1 means we've seen no holonyms but have already processed the
-- placetypes.
local holonym_index = 0
local in_place_desc = false
local form_of_directives = {}
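-- Sketch of how set_desc_joiner() below maps a separator onto the joiner text, read off its branches:
--   ";"    -> joiner "; " and include_following_article = true
--   ";;"   -> joiner " "
--   ";and" -> joiner " and " (the text after the leading semicolon starts with a letter, so it is space-padded)
--   ";,"   -> joiner ", " (text not starting with a letter gets only a trailing space)
-- The separators ";and" and ";," are illustrative inputs, not taken from a real entry.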
-- Set the joiner fields on `desc` according to `separator`. Use the `desc` parameter rather than the `this_desc`
-- upvalue so the function does not depend on the caller's current state.
local function set_desc_joiner(desc, separator)
if separator == ";" then
desc.joiner = "; "
desc.include_following_article = true
elseif separator == ";;" then
desc.joiner = " "
else
local joiner = separator:sub(2)
if rfind(joiner, "^%a") then
desc.joiner = " " .. joiner .. " "
else
desc.joiner = joiner .. " "
end
end
end
for _, arg in ipairs(args[2]) do
if arg:find("^@") then
if not (desc_index == 1 and holonym_index == 0) then
error("@-directives cannot follow place descriptions")
end
local form_of_directive = parse_form_of_directive(arg, args[1], form_of_overridden_args)
if form_of_directives[1] then
form_of_directive.pretext = ", "
else
form_of_directive.pretext = ""
end
insert(form_of_directives, form_of_directive)
elseif arg == ";" or arg:find("^;[^ ]") then
if not this_desc then
error("Saw semicolon joiner without preceding place description")
end
set_desc_joiner(this_desc, arg)
desc_index = desc_index + 1
holonym_index = 0
in_place_desc = false
else
if arg:find("<<") then
if in_place_desc then
error("New-style place description must come first or following a separator (semicolon or similar), not directly following another description")
end
in_place_desc = true
local this_descs = parse_conjoined_new_style_place_desc(arg, args[1], form_of_directives,
form_of_overridden_args)
for j, desc in ipairs(this_descs) do
this_desc = desc
if holonym_index > 0 then
desc_index = desc_index + 1
holonym_index = 0
end
if j < #this_descs then
set_desc_joiner(this_desc, this_desc.separator)
end
descs[desc_index] = this_desc
last_was_new_style = true
holonym_index = #this_desc.holonyms + 1
end
else
-- Old-style arguments can directly follow a new-style argument; they become additional holonyms
-- tacked onto the end of the holonym list, and are displayed old-style except that there is no
-- prefix before the first one following the new-style argument.
in_place_desc = true
if holonym_index == 0 then
local entry_placetypes = split_on_slash(arg)
this_desc = {placetypes = entry_placetypes, holonyms = {}}
descs[desc_index] = this_desc
holonym_index = holonym_index + 1
else
local holonyms = split_holonym(arg)
for j, holonym in ipairs(holonyms) do
if j > 1 then
-- All but the first in a multi-holonym need an article. Not for the first one because e.g.
-- {{place|en|city|s/Arizona|c/United States}} should not display as "a city in Arizona, the
-- United States". The overall first holonym in the place description gets an article if
-- needed regardless of our setting here.
holonym.needs_article = true
-- Insert "และ" before the last holonym.
if j == #holonyms then
this_desc.holonyms[holonym_index] = {
-- Use the no_display value from the first holonym; it should be the same for all
-- holonyms. `unlinked_placename` should not be used.
display_placename = "และ", no_display = holonyms[1].no_display
}
holonym_index = holonym_index + 1
end
end
this_desc.holonyms[holonym_index] = holonym
m_placetypes.key_holonym_into_place_desc(this_desc, this_desc.holonyms[holonym_index])
holonym_index = holonym_index + 1
end
end
end
end
end
if form_of_directives[1] and not form_of_directives[#form_of_directives].posttext then
form_of_directives[#form_of_directives].posttext =
(args.def and args.def ~= "-" or not args.def and descs[1]) and ": " or ""
end
-- Tracking code. This does nothing but add tracking for seen placetypes and qualifiers. The place will be linked to
-- [[Wiktionary:Tracking/place/entry-placetype/PLACETYPE]] for all entry placetypes seen; in addition, if PLACETYPE
-- has qualifiers (e.g. 'small city'), there will be links for the bare placetype minus qualifiers and separately
-- for the qualifiers themselves:
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/BARE_PLACETYPE]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/QUALIFIER]]
-- Note that if there are multiple qualifiers, there will be links for each possible split. For example, for
-- 'small maritime city', there will be the following links:
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/small maritime city]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/maritime city]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/city]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/small]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/maritime]]
-- Finally, there are also links for holonym placetypes, e.g. if the holonym 'c/Italy' occurs, there will be the
-- following link:
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/holonym-placetype/country]]
for _, desc in ipairs(descs) do
for _, entry_placetype in ipairs(desc.placetypes) do
local splits = m_placetypes.split_qualifiers_from_placetype(entry_placetype, "no canon qualifiers")
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
track("entry-placetype/" .. bare_placetype)
if this_qualifier then
track("entry-qualifier/" .. this_qualifier)
end
end
end
for _, holonym in ipairs(desc.holonyms) do
if holonym.placetype then
track("holonym-placetype/" .. holonym.placetype)
end
end
end
local extra_info = {}
for _, extra_info_spec in ipairs(export.extra_info_args) do
local extra_info_terms = parse_extra_info_arg(args[extra_info_spec.arg], extra_info_spec,
-- If called from {{tcl}} and extra info argument was set by {{tcl}}, interpret the argument
-- according to the language in 1=; otherwise interpret as English. To override this, prefix
-- with the appropriate language.
from_tcl and extra_info_overridden_set and extra_info_overridden_set[extra_info_spec.arg] and args[1] or
enlang)
if extra_info_terms then
insert(extra_info, extra_info_terms)
end
end
return {
lang = args[1],
args = args,
directives = form_of_directives,
descs = descs,
extra_info = extra_info,
}
end
-------- Definition-generating functions
-- Return a string with the wikilinks to the English translations of the word.
local function get_translations(transl, ids)
local ret = {}
for i, t in ipairs(transl) do
local arg_transls = split_on_comma(t)
local arg_ids = ids[i]
if arg_ids then
arg_ids = split_on_comma(arg_ids)
if #arg_transls ~= #arg_ids then
error(("Saw %s translation%s in t%s=%s but %s ID%s in tid%s=%s"):format(
#arg_transls, #arg_transls > 1 and "s" or "", i == 1 and "" or i, t, #arg_ids,
#arg_ids > 1 and "s" or "", i == 1 and "" or i, ids[i]))
end
end
for j, arg_transl in ipairs(arg_transls) do
insert(ret, link(arg_transl, "en", arg_ids and arg_ids[j] or nil))
end
end
return concat(ret, ", ")
end
-- Return the article (currently always `"the"`) to be prepended to the given placename, or nil. `decorated_placename`
-- is the placename as specified by the user along with any affix added to it. `placename` is the raw unlinked
-- placename, defaulting to the unlinked version of `decorated_placename` if not given. `placetypes` is a placetype or
-- list of placetypes for the placename. `suppress_holonym_use_the_check` suppresses checking the placetypes for
-- `holonym_use_the`.
local function get_placename_article(decorated_placename, placetypes, placename, suppress_holonym_use_the_check)
local unlinked_decorated_placename = m_placetypes.remove_links_and_html(decorated_placename)
if unlinked_decorated_placename:find("^the ") then
return nil
end
placename = placename or unlinked_decorated_placename
if type(placetypes) == "string" then
placetypes = {placetypes}
end
for _, placetype in ipairs(placetypes) do
local art = m_placetypes.get_equiv_placetype_prop(placetype, function(pt)
local art = m_placetypes.placename_article[pt] and m_placetypes.placename_article[pt][placename]
if art then
return art
end
end)
if art then
return art
end
end
-- Get equivalent placetypes of the specified placetype so that e.g.
-- {{place|en|@official name of:Bahamas|island country|r/Caribbean}} puts 'the' before Bahamas ("Bahamas" is just
-- specified as a country but "island country" falls back to "country").
local all_equiv_placetypes = {}
for _, placetype in ipairs(placetypes) do
local this_equiv_placetypes = m_placetypes.get_placetype_equivs(placetype)
for _, this_equiv_placetype in ipairs(this_equiv_placetypes) do
insert(all_equiv_placetypes, this_equiv_placetype.placetype)
end
end
-- Look for a known location. We should be using find_matching_holonym_location() but that function doesn't
-- currently work without alias resolution. Instead we check if any matching location has `the = true` set.
-- In practice there aren't any cases where a given placename matches two locations, only one of which has
-- `the = true` set.
for group, key, spec in m_placetypes.iterate_matching_location {
placetypes = all_equiv_placetypes,
placename = placename,
alias_resolution = "none",
} do
-- `iterate_matching_location` doesn't initialize the spec if alias resolution is turned off, so check both
-- the spec and group. Be careful in case `the = false` is explicitly given by the spec.
if spec.the ~= nil then
if spec.the then
return "the"
end
elseif group.default_the then
return "the"
end
end
if not suppress_holonym_use_the_check then
-- See if the placetype requests an article to be placed before the placename. This occurs e.g. with 'sea'. But
-- if the user specifies e.g. "sea:pref/Cortez", we'll wrongly get "the sea of the Cortez", so in that case we
-- need to ignore the holonym article specified along with the placetype.
for _, placetype in ipairs(placetypes) do
local holonym_use_the = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt) return placetype_data[pt] and placetype_data[pt].holonym_use_the end)
if holonym_use_the then
return "the"
end
end
end
local universal_res = m_placetypes.placename_the_re["*"]
for _, re in ipairs(universal_res) do
if unlinked_decorated_placename:find(re) then
return "the"
end
end
for _, placetype in ipairs(placetypes) do
local matched = m_placetypes.get_equiv_placetype_prop(placetype, function(pt)
local res = m_placetypes.placename_the_re[pt]
if not res then
return nil
end
for _, re in ipairs(res) do
if unlinked_decorated_placename:find(re) then
return true
end
end
return nil
end)
if matched then
return "the"
end
end
return nil
end
-- Return the appropriate article, if one is needed, for `decorated_placename` (the user-specified placename with any
-- affix added), where the underlying holonym object that generated `decorated_placename` can be found at
-- `holonym_index` in the holonyms in `place_desc`. Returns nil if the holonym has no placetype or needs no article.
local function get_holonym_article(decorated_placename, place_desc, holonym_index)
local holonym = place_desc.holonyms[holonym_index]
local holonym_placetype = holonym.placetype
if not holonym_placetype then
return nil
end
return get_placename_article(decorated_placename, holonym_placetype, holonym.unlinked_placename,
not not holonym.affix_type)
end
-- Convert a holonym into display format. This adds wikilinks to holonyms and passes them through any display handlers,
-- which may (e.g.) add the placetype to the holonym. If `needs_article` is true, prepend the article `"the"` if the
-- holonym requires it (e.g. if the holonym is `United States`). `needs_article` is set to true when we are processing
-- the first specified holonym in an old-style place description (i.e. the holonym directly following the entry
-- placetype, with no raw-text holonym in between).
--
-- Examples:
-- ({placetype = "country", display_placename = "United States", unlinked_placename = "United States"}, true) returns
-- the template-expanded equivalent of "the {{l|en|United States}}".
-- ({placetype = "region", display_placename = "O'Higgins", unlinked_placename = "O'Higgins", affix_type = "suf"}, false)
-- returns the template-expanded equivalent of "{{l|en|O'Higgins}} region".
-- ({display_placename = "in the southern"}, false) returns "in the southern" (without wikilinking because .placetype
-- and .langcode are both nil).
local function format_holonym(place_desc, holonym_index, needs_article)
local holonym = place_desc.holonyms[holonym_index]
if holonym.no_display then
return ""
end
local orig_needs_article = needs_article
needs_article = needs_article or holonym.needs_article or holonym.force_the
local output = holonym.display_placename
local placetype = holonym.placetype
local affix_type_pt_data, affix_type, affix_is_prefix, affix, prefix, suffix, no_affix_strings
local pt_equiv_for_affix_type, already_seen_affix, need_affix
-- Implement display handlers.
local display_handler = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt) return placetype_data[pt] and placetype_data[pt].display_handler end)
if display_handler then
output = display_handler(placetype, output)
end
if not holonym.suppress_affix then
-- Implement adding an affix (prefix or suffix) based on the holonym's placetype. The affix will be
-- added either if the placetype's placetype_data spec says so (by setting 'affix_type'), or if the
-- user explicitly called for this (e.g. by using 'r:suf/O'Higgins'). Before adding the affix,
-- however, we check to see if the affix is already present (e.g. the placetype is "district"
-- and the placename is "Mission District"). The placetype can override the affix to add (by setting
-- `prefix`, `suffix` or `affix`) and/or override the strings used for checking if the affix is already
-- present (by setting 'no_affix_strings', which defaults to the affix explicitly given through `prefix`,
-- `suffix` or `affix` if any are given). `prefix` and `suffix` take precedence over `affix` if both are
-- set, but only when the appropriate type of affix is requested.
-- Search through equivalent placetypes for a setting of `affix_type`, `affix`, `prefix` or `suffix`. If we
-- find any, use them. If `affix_type` is given, it is overridden by the user's explicitly specified affix
-- type. If either an `affix_type` is found or the user explicitly specified an affix type, the affix is
-- displayed according to the following:
-- 1. If `prefix`, `suffix` or `affix` is given by the placetype or equivalent placetypes, use it (e.g.
-- placetype `administrative region` requests suffix "region" but doesn't set affix type; if the user
-- explicitly specifies `administrative region` as the placetype for a holonym and specifies a suffixal
-- affix type, use "region"). In this search, we stop looking if we find an explicit `affix_type`
-- setting; if this is found without an associated affix setting, the assumption is the associated
-- placetype was intended as the affix, not some explicit affix setting associated with a fallback
-- placetype.
-- 2. Otherwise, if the user explicitly requested an affix type, use the actual placetype (principle of
-- least surprise).
-- 3. Finally, fall back to the placetype associated with an explicit `affix_type` setting (which will
-- always exist if we get this far).
affix_type_pt_data, pt_equiv_for_affix_type = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt)
local cdpt = placetype_data[pt]
return cdpt and cdpt.affix_type and cdpt or nil
end
)
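-- Declare these as locals; they are only used inside this block and would otherwise leak as globals.
local affix_pt_data, pt_equiv_for_affix = m_placetypes.get_equiv_placetype_prop(placetype,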
function(pt)
local cdpt = placetype_data[pt]
return cdpt and (cdpt.affix_type or cdpt.affix or cdpt.prefix or cdpt.suffix) and cdpt or nil
end
)
if affix_type_pt_data then
affix_type = affix_type_pt_data.affix_type
need_affix = true
end
if affix_pt_data then
prefix = affix_pt_data.prefix or affix_pt_data.affix
suffix = affix_pt_data.suffix or affix_pt_data.affix
need_affix = true
end
no_affix_strings = affix_pt_data and affix_pt_data.no_affix_strings or
affix_type_pt_data and affix_type_pt_data.no_affix_strings
if holonym.affix_type and placetype then
affix_type = holonym.affix_type
prefix = prefix or placetype
suffix = suffix or placetype
need_affix = true
end
if need_affix then
-- At this point the affix_type has been determined and can't change any more, so we can figure out
-- whether we need the calculated prefix or suffix.
affix_is_prefix = affix_type == "pref" or affix_type == "Pref"
if affix_is_prefix then
affix = prefix
else
affix = suffix
end
if not affix then
if not pt_equiv_for_affix_type then
internal_error("Something wrong, `pt_equiv_for_affix_type` not set processing holonym: %s",
holonym)
end
affix = pt_equiv_for_affix_type.placetype
if not affix then
internal_error("Something wrong, no affix could be located in `pt_equiv_for_affix_type` for " ..
"holonym %s: %s", holonym, pt_equiv_for_affix_type)
end
end
no_affix_strings = no_affix_strings or lc(affix)
if holonym.pluralize_affix then
affix = m_placetypes.pluralize_placetype(affix)
end
already_seen_affix = m_placetypes.check_already_seen_string(output, no_affix_strings)
end
end
output = link(output, holonym.langcode or placetype and "en" or nil)
if need_affix and not affix_is_prefix and not already_seen_affix then
output = output .. " " .. (affix_type == "Suf" and ucfirst_all(affix) or affix)
end
if needs_article then
local article = holonym.force_the and "the" or get_holonym_article(output, place_desc, holonym_index)
if article then
output = article .. " " .. output
end
end
if affix_is_prefix and not already_seen_affix then
output = (affix_type == "Pref" and ucfirst_all(affix) or affix) .. " of " .. output
if orig_needs_article then
-- Put the article before the added affix if we're the first holonym in the place description. This is
-- distinct from the article added above for the holonym itself; cf. "c:pref/United States,Canada" ->
-- "the countries of the United States and Canada". We need to use the value of `needs_article` passed
-- in from the function, which indicates whether we're processing the first holonym.
output = "the " .. output
end
end
return output
end
-- Format a holonym for display, taking into account the entry's placetype (specifically, the last placetype if there
-- are more than one, excluding conjunctions and parenthetical items); the holonym's index among the holonyms in the
-- template (which specifies what the previous holonym is and whether it is the first holonym); and the full place
-- description (which helps resolve ambiguities in holonyms when looking up known locations). This may involve putting a
-- preposition ("in" or "of") before the formatted holonym, particularly if it is the first one, and may involve
-- prepending a comma. If `holonym_no_prefix` is specified, nothing except a space is put before the holonym; used
-- when formatting mixed new/old-style descriptions.
local function format_holonym_in_context(entry_placetype, place_desc, holonym_index, holonym_no_prefix)
local desc = ""
-- If holonym.placetype is nil, the holonym is just raw text, e.g. 'in southern'.
if holonym_no_prefix then
desc = " "
else
local holonym = place_desc.holonyms[holonym_index]
if not holonym.no_display then
-- First compute the initial delimiter.
if holonym_index == 1 then
if holonym.placetype then
desc = desc .. " " .. m_placetypes.get_placetype_entry_preposition(entry_placetype) .. " "
elseif not holonym.display_placename:find("^,") then
desc = desc .. " "
end
else
local prev_holonym = place_desc.holonyms[holonym_index - 1]
if prev_holonym.placetype and not holonym.suppress_comma then
local dname = holonym.display_placename
if dname ~= "and" and dname ~= "in" and dname ~= "and the" and dname ~= "in the" and dname ~= "และ" and dname ~= "ใน" then
desc = desc .. ","
end
end
if holonym.placetype or not holonym.display_placename:find("^,") then
desc = desc .. " "
end
end
end
end
return desc .. format_holonym(place_desc, holonym_index, not holonym_no_prefix and holonym_index == 1)
end
-- Return the linked description of a placetype. This splits off any qualifiers and displays them separately.
local function get_placetype_description(placetype)
local splits = m_placetypes.split_qualifiers_from_placetype(placetype)
local prefix = ""
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
if this_qualifier then
prefix = (prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier) .. " "
else
prefix = ""
end
local display_form = m_placetypes.get_placetype_display_form(bare_placetype)
if display_form then
return prefix .. display_form
end
placetype = bare_placetype
end
return prefix .. placetype
end
-- Return the linked description of a qualifier (which may be multiple words).
local function get_qualifier_description(qualifier)
local splits = m_placetypes.split_qualifiers_from_placetype(qualifier .. " foo")
local split = splits[#splits]
local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
return prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
end
-- Format a set of form-of directive terms.
local function format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl)
local formatted_terms = {}
local placetypes
if not overall_place_spec.descs[2] then
placetypes = overall_place_spec.descs[1].placetypes
else
placetypes = {}
for _, desc in ipairs(overall_place_spec.descs) do
m_table.extend(placetypes, desc.placetypes)
end
end
for _, termobj in ipairs(directive_terms.terms) do
local placename_article
if not termobj.alt and termobj.term and not termobj.term:find("%[%[") then
placename_article = get_placename_article(termobj.term, placetypes)
end
local linked_term = m_links.full_link(termobj, "term", nil, "show qualifiers")
linked_term = "<span class='form-of-definition-link'>" .. linked_term .. "</span>"
if termobj.eq then
linked_term = linked_term .. " (= " .. m_links.full_link {term = termobj.eq, lang = enlang} .. ")"
end
if placename_article then
linked_term = placename_article .. " " .. linked_term
end
insert(formatted_terms, linked_term)
end
local spec = directive_terms.spec
local text = spec.text
if type(text) == "function" then
text = text(overall_place_spec)
end
if text == "+" then
text = directive_terms.directive
end
if ucfirst then
text = m_strutils.ucfirst(text)
end
if not from_tcl then
local tracking_prefix = "form-of/" .. directive_terms.directive
track(tracking_prefix)
local langcode = overall_place_spec.lang:getCode()
local full_langcode = overall_place_spec.lang:getFullCode()
track(tracking_prefix .. "/" .. langcode)
if full_langcode ~= langcode then
track(tracking_prefix .. "/" .. full_langcode)
end
if full_langcode ~= "en" then
track(tracking_prefix .. "/non-english")
end
end
return (require(form_of_module).format_form_of {
text = text,
lemmas = m_table.serialCommaJoin(formatted_terms, {conj = directive_terms.conj or spec.conjunction or "และ"}),
lemma_classes = false,
-- text_classes = "place-text",
})
end
-- Format a set of extra-info terms for extra information that is sometimes added to a definition, such as the capital,
-- largest city, modern name, official name, etc. `overall_place_spec` is the overall parsed {{tl|place}} spec (see
-- comment at top of file); `extra_info_terms` is the terms spec for this type of extra-info (as returned by
-- `parse_extra_info_arg`); and `sentence_style` indicates whether we're generating a sentence-style definition (as
-- suitable for an English-language term without a translation specified using t=).
local function format_extra_info(overall_place_spec, extra_info_terms, sentence_style)
local formatted_terms = {}
for _, termobj in ipairs(extra_info_terms.terms) do
insert(formatted_terms, m_links.full_link(termobj, nil, nil, "show qualifiers"))
end
local spec = extra_info_terms.spec
local text = spec.text
if type(text) == "function" then
text = text(overall_place_spec)
end
if text == "+" then
text = spec.arg
end
if spec.auto_plural and formatted_terms[2] then
text = pluralize(text)
end
if spec.with_colon then
text = text .. ":"
end
if sentence_style and spec.match_sentence_style then
text = ". " .. m_strutils.ucfirst(text)
else
text = "; " .. text
end
-- FIXME: Use joinSegments when available.
-- return text .. " " ..
-- m_table.joinSegments(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "และ"})
return text .. " " ..
m_table.serialCommaJoin(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "และ"})
end
-- Format an old-style place description (with separate arguments for the placetype and each holonym) for display and
-- return the resulting string.
local function format_old_style_place_desc_for_display(args, place_desc, desc_index, with_article, ucfirst)
-- The placetype used to determine whether "in" or "of" follows is the last placetype if there are
-- multiple slash-separated placetypes, but ignoring "และ", "หรือ" and parenthesized notes
-- such as "(one of 254)".
local entry_placetype = nil
local placetypes = place_desc.placetypes
local function is_and_or(item)
return item == "และ" or item == "หรือ"
end
local parts = {}
local function ins(txt)
insert(parts, txt)
end
local function ins_space()
if #parts > 0 then
ins(" ")
end
end
local and_or_pos
for i, placetype in ipairs(placetypes) do
if is_and_or(placetype) then
and_or_pos = i
-- no break here; we want the last in case of more than one
end
end
local remaining_placetype_index
if and_or_pos then
track("multiple-placetypes-with-and")
if and_or_pos == #placetypes then
error("Conjunctions 'and' and 'or' cannot occur last in a set of slash-separated placetypes: " ..
concat(placetypes, "/"))
end
local items = {}
for i = 1, and_or_pos + 1 do
local pt = placetypes[i]
if is_and_or(pt) then
-- skip
elseif i > 1 and pt:find("^%(") then
-- append placetypes beginning with a paren to previous item
items[#items] = items[#items] .. " " .. pt
else
entry_placetype = pt
insert(items, get_placetype_description(pt))
end
end
ins(m_table.serialCommaJoin(items, {conj = placetypes[and_or_pos]}))
remaining_placetype_index = and_or_pos + 2
else
remaining_placetype_index = 1
end
for i = remaining_placetype_index, #placetypes do
local pt = placetypes[i]
-- Check for and, or and placetypes beginning with a paren (so that things like
-- "{{place|en|county/(one of 254)|s/Texas}}" work).
if m_placetypes.placetype_is_ignorable(pt) then
ins_space()
ins(pt)
else
entry_placetype = pt
-- Join multiple placetypes with comma unless placetypes are already
-- joined with "และ". We allow "the" to precede the second placetype
-- if they're not joined with "และ" (so we get "city and county seat of ..."
-- but "city, the county seat of ...").
if i > 1 then
ins(", ")
local article = m_placetypes.get_placetype_article(pt)
if article ~= "the" and i > remaining_placetype_index then
-- Track cases where we are comma-separating multiple placetypes without the second one starting
-- with "the", as they may be mistakes. The occurrence of "the" is usually intentional, e.g.
-- {{place|zh|municipality/state capital|s/Rio de Janeiro|c/Brazil|t1=Rio de Janeiro}}
-- for the city of [[Rio de Janeiro]], which displays as "a municipality, the state capital of ...".
track("multiple-placetypes-without-and-or-the")
end
if article then
ins(article)
ins(" ")
end
end
ins(get_placetype_description(pt))
end
end
if place_desc.holonyms then
for holonym_index, _ in ipairs(place_desc.holonyms) do
ins(format_holonym_in_context(entry_placetype, place_desc, holonym_index))
end
end
local gloss = concat(parts)
if with_article then
local article
if desc_index == 1 then
article = args.a
else
if not place_desc.holonyms then
-- there isn't a following holonym; the place type given might be raw text as well, so don't add
-- an article.
with_article = false
else
local saw_placetype_holonym = false
for _, holonym in ipairs(place_desc.holonyms) do
if holonym.placetype then
saw_placetype_holonym = true
break
end
end
if not saw_placetype_holonym then
-- following holonym(s) is/are just raw text; the place type given might be raw text as well,
-- so don't add an article.
with_article = false
end
end
if with_article then
track("second-or-higher-description-with-added-article")
else
track("second-or-higher-description-suppressed-article")
end
end
if with_article then
article = article or m_placetypes.get_placetype_article(place_desc.placetypes[1], ucfirst)
if article then
gloss = article .. " " .. gloss
elseif ucfirst then
gloss = m_strutils.ucfirst(gloss)
end
end
end
return gloss
end
--[==[
Get the full gloss (English description) of a new-style place description. New-style place descriptions are
specified with a single string containing raw text interspersed with placetypes and holonyms surrounded by `<<...>>`.
Exported for use by [[Module:demonyms]].
]==]
function export.format_new_style_place_desc_for_display(args, place_desc, with_article)
local parts = {}
local function ins(txt)
insert(parts, txt)
end
if with_article and args.a then
ins(args.a .. " ")
end
local max_holonym = 0
for _, order in ipairs(place_desc.order) do
local segment_type, segment = order.type, order.value
if segment_type == "raw" then
ins(segment)
elseif segment_type == "placetype" then
ins(get_placetype_description(segment))
elseif segment_type == "qualifier" then
ins(get_qualifier_description(segment))
elseif segment_type == "holonym" then
ins(format_holonym(place_desc, segment, false))
if segment > max_holonym then
max_holonym = segment
end
else
internal_error("Unrecognized segment type %s", segment_type)
end
end
if place_desc.holonyms and max_holonym < #place_desc.holonyms then
local holonym_no_prefix = true
for holonym_index = max_holonym + 1, #place_desc.holonyms do
ins(format_holonym_in_context(nil, place_desc, holonym_index, holonym_no_prefix))
holonym_no_prefix = false
end
end
return concat(parts)
end
-- Return a string with the gloss (the description of the place itself, as opposed to translations). If `ucfirst` is
-- given, the gloss's first letter is made upper case. If `sentence_style` is given, the "extra info" (modern name,
-- capital, largest city, etc.) is displayed as separated sentences; otherwise, it is displayed separated from the main
-- definition by semicolons.
local function get_display_form(data)
local overall_place_spec, ucfirst, sentence_style, drop_extra_info, extra_info_overridden_set, from_tcl =
data.overall_place_spec, data.ucfirst, data.sentence_style, data.drop_extra_info,
data.extra_info_overridden_set, data.from_tcl
local args = overall_place_spec.args
local parts = {}
local function ins(txt)
table.insert(parts, txt)
end
if overall_place_spec.directives and overall_place_spec.directives[1] then
for i, directive_terms in ipairs(overall_place_spec.directives) do
ins(directive_terms.pretext)
if directive_terms.pretext ~= "" then
ucfirst = false
end
if not args.def or args.def == "-" then
ins(format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl))
ucfirst = false
if i == #overall_place_spec.directives and directive_terms.posttext then
ins(directive_terms.posttext)
end
end
end
end
if args.def == "-" then
return concat(parts)
end
if args.def then
if args.def:find("<<") then
local def_desc = export.parse_new_style_place_desc(args.def, args[1])
ins(export.format_new_style_place_desc_for_display({}, def_desc, false))
else
ins(args.def)
end
else
local include_article = true
for n, desc in ipairs(overall_place_spec.descs) do
if desc.order then
ins(export.format_new_style_place_desc_for_display(args, desc, n == 1))
else
ins(format_old_style_place_desc_for_display(args, desc, n, include_article, ucfirst))
end
if desc.joiner then
ins(desc.joiner)
end
include_article = desc.include_following_article
ucfirst = false
end
end
local addl = args.addl
if addl then
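-- FIXME: `posttext` is never read afterwards in this function; this assignment looks like leftover code.
posttext = posttext or ""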
if addl:find("^[;:]") then
ins(addl)
elseif addl:find("^_") then
ins(" " .. addl:sub(2))
else
ins(", " .. addl)
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
-- Include a given extra info term either when
-- (1) drop_extra_info not set (it's set by {{tcl}}), or
-- (2) the extra info term is marked as "display even when dropped" (e.g. modern= or full=, to help understand
-- the term's sense), or
-- (3) the term was overridden by a `place_*=` setting in {{tcl}}.
if not drop_extra_info or extra_info_terms.spec.display_even_when_dropped or
extra_info_overridden_set and extra_info_overridden_set[extra_info_terms.spec.arg] then
ins(format_extra_info(overall_place_spec, extra_info_terms, sentence_style))
end
end
return concat(parts)
end
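--[=[
The handling of {{para|addl}} in `get_display_form` can be sketched in isolation as follows. `attach_addl` is a
hypothetical name used only for illustration; the logic mirrors the three branches above (leading `;` or `:`, leading
`_`, and everything else).

```lua
-- Decide how additional text is attached to the definition, based on its first character.
local function attach_addl(addl)
	if addl:find("^[;:]") then
		-- Already starts with its own punctuation; attach as-is.
		return addl
	elseif addl:find("^_") then
		-- A leading underscore requests a plain space instead of a comma.
		return " " .. addl:sub(2)
	else
		-- Default: separate from the definition with a comma.
		return ", " .. addl
	end
end
```
]=]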
-- Return the definition line.
local function get_def(data)
local overall_place_spec, from_tcl, drop_extra_info, extra_info_overridden_set, translation_follows =
data.overall_place_spec, data.from_tcl, data.drop_extra_info, data.extra_info_overridden_set,
data.translation_follows
local args = overall_place_spec.args
local sentence_style = overall_place_spec.lang:getCode() == "en"
local ucfirst = sentence_style and not args.nocap
if #args.t > 0 then
local gloss = get_display_form {
overall_place_spec = overall_place_spec,
ucfirst = false,
sentence_style = false,
drop_extra_info = drop_extra_info,
extra_info_overridden_set = extra_info_overridden_set,
from_tcl = from_tcl,
}
if from_tcl and not args.tcl_nolc then
gloss = m_strutils.lcfirst(gloss)
end
if translation_follows then
return (gloss == "" and "" or gloss .. ": ") .. get_translations(args.t, args.tid)
else
return get_translations(args.t, args.tid) .. (gloss == "" and "" or " (" .. gloss .. ")")
end
else
return get_display_form {
overall_place_spec = overall_place_spec,
ucfirst = ucfirst,
sentence_style = sentence_style,
drop_extra_info = drop_extra_info,
extra_info_overridden_set = extra_info_overridden_set,
from_tcl = from_tcl,
}
end
end
---------- Functions for the category wikicode
-- The code in this section finds the categories to which a given place belongs. See comment at top of file.
--[=[
Find the appropriate category specs for a given place description and placetype. For example, for the template
invocation {{tl|place|en|city/and/county|s/Pennsylvania|c/US}}, which results in the place description
```
{
placetypes = {"city", "และ", "county"},
holonyms = {
{placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"},
{placetype = "country", display_placename = "United States", unlinked_placename = "United States"},
},
holonyms_by_placetype = {
state = {"Pennsylvania"},
country = {"United States"},
},
}
```
the call
```
find_placetype_cat_specs {
entry_placetype = "city",
place_desc = {
placetypes = {"city", "และ", "county"},
holonyms = {
{placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"},
{placetype = "country", display_placename = "United States", unlinked_placename = "United States"},
},
holonyms_by_placetype = {
state = {"Pennsylvania"},
country = {"United States"},
},
},
}
```
might produce the return value
```
{
entry_placetype = "city",
cat_specs = {"Cities in Pennsylvania, USA"},
triggering_holonym = {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"},
triggering_holonym_index = 1,
}
```
See the comment at the top of the section for a description of category specs and the overall algorithm.
On entry, `data` is an object with the following fields:
* `entry_placetype`: the entry placetype (or equivalent) used to look up the category data in placetype_data,
which must have already been resolved to a placetype with an entry in `placetype_data`;
* `place_desc`: the full place description as documented at the top of the file (used only for its holonyms);
* `first_holonym_index`: the index of the first holonym to consider when iterating through the holonyms (used to
implement the `:also` holonym placetype modifier);
* `overriding_holonym`: an optional overriding holonym to use, in place of iterating through the holonyms (used to
implement categorizing other holonyms of the same type as the triggering holonym, so that e.g.
{{tl|place|en|river|s/Kansas,Nebraska}}, or equivalently {{tl|place|en|river|s/Kansas|and|s/Nebraska}}, works);
* `from_demonym`: we are called from {{tl|demonym-noun}} or {{tl|demonym-adj}} instead of {{tl|place}}, and should
generate categories appropriate to those templates.
* `form_of_directive`: A form-of directive prefix such as `FORMER_NAME_OF`. If specified, use that type prefix to
generate categories appropriate to the form-of directive (in addition to the regular categories generated for the
{{tl|place}} invocation, which happens in a separate call).
The return value is {nil} if no category specs could be located, otherwise an object with the following fields:
* `entry_placetype`: the placetype that should be used to construct categories when `true` is one of the returned
category specs (normally the same as the `entry_placetype` passed in, but will be different when a "fallback" key
exists and is used);
* `cat_specs`: list of category specs as described above;
* `triggering_holonym`: the triggering holonym (see the comment at the top of the section), or nil if there was no
triggering holonym;
* `triggering_holonym_index`: the index of the triggering holonym in the list of holonyms in `place_desc`, or nil if
an overriding holonym was passed in or there was no triggering holonym.
]=]
local function find_placetype_cat_specs(data)
local entry_placetype, place_desc, first_holonym_index, overriding_holonym, from_demonym =
data.entry_placetype, data.place_desc, data.first_holonym_index, data.overriding_holonym, data.from_demonym
local form_of_directive = data.form_of_directive
local function fetch_cat_specs(holonym_to_match, index, no_fallback)
local holonym_placetype = holonym_to_match.placetype
if not holonym_placetype then
-- raw text in place of holonym
return nil
end
local holonym_placename = holonym_to_match.unlinked_placename
if not holonym_placename then
internal_error("Missing unlinked_placename in holonym (index %s): %s", index, holonym_to_match)
end
local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype,
function(equiv_entry_pt)
return m_placetypes.get_equiv_placetype_prop(holonym_placetype,
function(equiv_holonym_pt) return m_placetypes.political_division_cat_handler {
entry_placetype = equiv_entry_pt,
holonym_placetype = equiv_holonym_pt,
holonym_placename = holonym_placename,
holonym_index = index,
place_desc = place_desc,
from_demonym = from_demonym,
} end)
end,
{no_fallback = no_fallback, form_of_directive = form_of_directive}
)
if cat_specs and cat_specs[1] then
return cat_specs, equiv_entry_placetype_and_qualifier.placetype
end
local cat_handler, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype,
function(equiv_entry_pt)
local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt]
if entry_placetype_data and entry_placetype_data.cat_handler then
return entry_placetype_data.cat_handler
end
end,
{no_fallback = no_fallback, form_of_directive = form_of_directive}
)
if cat_handler then
local cat_specs = m_placetypes.get_equiv_placetype_prop(holonym_placetype,
function(equiv_holonym_pt) return cat_handler {
entry_placetype = equiv_entry_placetype_and_qualifier.placetype,
holonym_placetype = equiv_holonym_pt,
holonym_placename = holonym_placename,
holonym_index = index,
place_desc = place_desc,
from_demonym = from_demonym,
} end)
if cat_specs and cat_specs[1] then
return cat_specs, equiv_entry_placetype_and_qualifier.placetype
end
end
if not no_fallback then
local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype,
function(equiv_entry_pt)
local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt]
if entry_placetype_data then
return m_placetypes.get_equiv_placetype_prop(holonym_placetype,
function(equiv_holonym_pt)
return entry_placetype_data[equiv_holonym_pt .. "/*"]
end)
end
end,
{form_of_directive = form_of_directive}
)
if cat_specs and cat_specs[1] then
return cat_specs, equiv_entry_placetype_and_qualifier.placetype
end
end
return nil
end
if overriding_holonym then
-- FIXME, change the algorithm to eliminate overriding_holonym
local cat_specs, fetched_entry_placetype = fetch_cat_specs(overriding_holonym, nil)
if cat_specs and cat_specs[1] then
return {
entry_placetype = fetched_entry_placetype,
cat_specs = cat_specs,
triggering_holonym = overriding_holonym,
-- no triggering_holonym_index
}
end
else
-- We loop twice over holonyms, the first time setting `no_fallback` so that we process only category specs for
-- the specifically given entry placetype (possibly with preceding qualifiers). The reason for this is to
-- correctly handle cases like [[Poblacion IX]]:
-- {{place|en|barangay|mun/Roxas|p/Capiz|c/Philippines}}.
-- "barangay" falls back to "neighborhood", and without the `no_fallback` loop, the neighborhood cat handler run
-- on the mun/Roxas holonym will take precedence over the barangay-specific setting for p/Capiz because we
-- check, for each holonym in turn, first for a matching spec through political_division_cat_handler, then a cat
-- handler, then a wildcard spec like country/*. During the first no-fallback loop, we disable checking for
-- wildcard specs because it seems a fallback matching exactly or through a cat handler on an earlier holonym
-- would be better than a wildcard match for the exact entry placetype at a later holonym. (FIXME: But I don't
-- know for sure; maybe we should check wildcard holonyms on the exact entry placetype first, or contrariwise
-- maybe we should check only exact-match holonyms through political_division_cat_handler on the exact entry
-- placetype first, not even checking other cat handlers.)
for i, holonym in ipairs(place_desc.holonyms) do
if first_holonym_index and i < first_holonym_index then
-- continue
else
local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i, "no_fallback")
if cat_specs and cat_specs[1] then
return {
entry_placetype = fetched_entry_placetype,
cat_specs = cat_specs,
triggering_holonym = holonym,
triggering_holonym_index = i,
}
end
end
end
for i, holonym in ipairs(place_desc.holonyms) do
if first_holonym_index and i < first_holonym_index then
-- continue
else
local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i)
if cat_specs and cat_specs[1] then
return {
entry_placetype = fetched_entry_placetype,
cat_specs = cat_specs,
triggering_holonym = holonym,
triggering_holonym_index = i,
}
end
end
end
end
return nil
end
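--[=[
A minimal sketch of the two-pass holonym search described above, assuming a caller-supplied `lookup` callback in place
of `fetch_cat_specs`. `first_match_two_pass` and `lookup` are hypothetical names used only for illustration. The first
pass runs with `no_fallback` set, so an exact match on a later holonym takes precedence over a fallback match on an
earlier one.

```lua
-- Try every holonym without fallbacks first; only if nothing matches, retry with fallbacks allowed.
local function first_match_two_pass(holonyms, lookup, first_index)
	first_index = first_index or 1
	for _, no_fallback in ipairs({true, false}) do
		for i = first_index, #holonyms do
			local result = lookup(holonyms[i], no_fallback)
			if result then
				return result, i
			end
		end
	end
	return nil
end
```

With a single pass that allowed fallbacks, the first holonym with any match (including a fallback match) would be
returned instead.
]=]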
-- Turn a list of category specs (see comment at section top) into the corresponding categories (minus the language
-- code prefix). The function is given the following arguments:
-- (1) `place_desc`: the place description (see comment at top of file), used when matching the triggering holonym
-- against a known location;
-- (2) `cat_data`: an object in the format returned by find_placetype_cat_specs(), with the fields `cat_specs` (the
-- category specs), `entry_placetype` (the entry placetype used to fetch the entry in `placetype_data`),
-- `triggering_holonym` (the holonym object used to fetch the category specs, or nil if there is no triggering
-- holonym) and `triggering_holonym_index` (its index in `place_desc.holonyms`, or nil).
-- The return value is constructed as described in the top-of-section comment.
local function cat_specs_to_categories(place_desc, cat_data)
local all_cats = {}
local cat_specs, entry_placetype, triggering_holonym, triggering_holonym_index =
cat_data.cat_specs, cat_data.entry_placetype, cat_data.triggering_holonym, cat_data.triggering_holonym_index
if triggering_holonym then
for _, cat_spec in ipairs(cat_specs) do
local cat
if cat_spec == true then
cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst") .. " " ..
m_placetypes.get_placetype_entry_preposition(entry_placetype) .. " +++"
else
cat = cat_spec
end
if cat:find("%+%+%+") then
local group, key, spec, container_trail = m_placetypes.find_matching_holonym_location {
holonym_placetype = triggering_holonym.placetype,
holonym_placename = triggering_holonym.unlinked_placename,
holonym_index = triggering_holonym_index,
place_desc = place_desc,
}
if group then
cat = cat:gsub("%+%+%+", m_strutils.replacement_escape(m_placetypes.get_prefixed_key(key, spec)))
insert(all_cats, cat)
else
mw.log(("Unable to insert category for cat spec '%s' because holonym '%s/%s' did not match a " ..
"known location"):format(cat, triggering_holonym.placetype, triggering_holonym.unlinked_placename))
track("cant-match-holonym-for-category-spec")
end
else
insert(all_cats, cat)
end
end
else
for _, cat_spec in ipairs(cat_specs) do
local cat
if cat_spec == true then
cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst")
else
cat = cat_spec
if cat:find("%+%+%+") then
internal_error("Category %s contains +++ but there is no holonym to substitute", cat)
end
end
insert(all_cats, cat)
end
end
return all_cats
end
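--[=[
The `+++` substitution used above can be sketched with plain Lua string functions. `replacement_escape` here is a
local stand-in written for this example (an assumption about what the helper of the same name in the string utilities
module does), and `fill_placeholder` is a hypothetical name. The escape step matters because `%` has a special meaning
in the replacement string of `gsub`.

```lua
-- Escape "%" so that the string is safe to use as a gsub replacement.
local function replacement_escape(s)
	return (s:gsub("%%", "%%%%"))
end

-- Replace the "+++" placeholder in a category spec with a location name.
local function fill_placeholder(cat_spec, location)
	if not cat_spec:find("%+%+%+") then
		return cat_spec
	end
	-- The parentheses drop gsub's second return value (the substitution count).
	return (cat_spec:gsub("%+%+%+", replacement_escape(location)))
end
```
]=]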
-- Return the categories (without initial lang code) that should be added to the entry, given the place description
-- (which specifies the entry placetype(s) and holonym(s); see top of file) and a particular entry placetype (e.g.
-- "city"). Note that only the holonyms from the place description are looked at, not the entry placetypes in the place
-- description.
local function get_placetype_cats(place_desc, entry_placetype, from_demonym, form_of_directive)
local cats = {}
local first_holonym_index = 1
while first_holonym_index <= #place_desc.holonyms do
-- Find the category specs (see top-of-file comment) corresponding to the holonym(s) in the place description.
local cat_data = find_placetype_cat_specs {
entry_placetype = entry_placetype,
place_desc = place_desc,
first_holonym_index = first_holonym_index,
from_demonym = from_demonym,
form_of_directive = form_of_directive,
}
-- Check if no category spec could be found.
if not cat_data then
break
end
local triggering_holonym = cat_data.triggering_holonym
if not triggering_holonym then
internal_error("find_placetype_cat_specs should have returned a triggering holonym: %s", cat_data)
end
-- Generate categories for the category specs found.
extend(cats, cat_specs_to_categories(place_desc, cat_data))
-- Also generate categories for other holonyms of the same placetype, so that e.g.
-- {{place|en|city|s/Kansas|and|s/Missouri|c/USA}} generates both [[:Category:en:Cities in Kansas, USA]] and
-- [[:Category:en:Cities in Missouri, USA]].
first_holonym_index = cat_data.triggering_holonym_index
-- Loop over non-fallback equivalent placetypes to the triggering holonym's placetype, in case it is
-- non-canonical (e.g. `cities/San Francisco`). This matches the loop over equivalent places in
-- key_holonym_into_place_desc().
local equiv_triggering_placetypes = m_placetypes.get_placetype_equivs(triggering_holonym.placetype,
{no_fallback = true})
for _, equiv in ipairs(equiv_triggering_placetypes) do
local other_holonyms_of_same_type = place_desc.holonyms_by_placetype[equiv.placetype]
if other_holonyms_of_same_type then
for _, other_placename_of_same_type in ipairs(other_holonyms_of_same_type) do
if other_placename_of_same_type ~= triggering_holonym.unlinked_placename then
local overriding_holonym = {
placetype = triggering_holonym.placetype,
unlinked_placename = other_placename_of_same_type,
}
local other_cat_data = find_placetype_cat_specs {
entry_placetype = entry_placetype,
place_desc = place_desc,
overriding_holonym = overriding_holonym,
from_demonym = from_demonym,
form_of_directive = form_of_directive,
}
if other_cat_data then
extend(cats, cat_specs_to_categories(place_desc, other_cat_data))
end
end
end
end
end
-- If there are any later-specified holonyms that had the modifier :also, try to produce categories for them
-- as well.
first_holonym_index = first_holonym_index + 1
while first_holonym_index <= #place_desc.holonyms do
if place_desc.holonyms[first_holonym_index].continue_cat_loop then
break
end
first_holonym_index = first_holonym_index + 1
end
end
if cats[1] then
return cats
end
local entry_pt_default, equiv_entry_placetype_and_qualifier =
m_placetypes.get_equiv_placetype_prop(entry_placetype, function(pt)
return m_placetypes.placetype_data[pt] and m_placetypes.placetype_data[pt].default
end,
{form_of_directive = form_of_directive})
if entry_pt_default then
return cat_specs_to_categories(place_desc, {
cat_specs = entry_pt_default,
entry_placetype = equiv_entry_placetype_and_qualifier.placetype,
-- no triggering holonym
})
end
return {}
end
--[==[
Iterate through each type of place and return a list of the categories that need to be added to the entry. The returned
categories need to be formatted using `format_cats`, as they can be either topic-style categories (by default) or
langname-style categories (if prefixed with `cln:`). The function is passed the overall place spec, which contains all
the parsed info on the {{tl|place}} call (see comment at top of file), the parsed arguments (needed for arguments
not parsed by `parse_overall_place_spec` and used primarily to add "bare categories" corresponding to toponyms for known
locations), and `from_demonym`, which is true if we're being called from {{tl|demonym-noun}} or {{tl|demonym-adj}} (in
this case, we only want certain categories added, specifically bare categories corresponding to the specified
holonym(s)).
]==]
function export.get_cats(args, overall_place_spec, from_demonym)
local cats = {}
local place_descriptions = overall_place_spec.descs
handle_category_implications(place_descriptions, m_placetypes.cat_implications)
m_placetypes.augment_holonyms_with_container(place_descriptions)
if overall_place_spec.directives then -- not necessarily when called from [[Module:demonym]]
for _, directive_terms in ipairs(overall_place_spec.directives) do
local spec_cats = directive_terms.spec.cat
if spec_cats then
if type(spec_cats) == "string" then
spec_cats = {spec_cats}
end
for _, spec_cat in ipairs(spec_cats) do
insert(cats, spec_cat)
end
end
if directive_terms.spec.type_prefix then
for _, place_desc in ipairs(place_descriptions) do
for _, placetype in ipairs(place_desc.placetypes) do
if not m_placetypes.placetype_is_ignorable(placetype) then
extend(cats, get_placetype_cats(place_desc, placetype, from_demonym,
directive_terms.spec.type_prefix))
end
end
end
end
end
end
if not from_demonym then
local bare_categories = m_placetypes.get_bare_categories(args, overall_place_spec)
extend(cats, bare_categories)
end
for _, place_desc in ipairs(place_descriptions) do
if not from_demonym then
for _, placetype in ipairs(place_desc.placetypes) do
if not m_placetypes.placetype_is_ignorable(placetype) then
extend(cats, get_placetype_cats(place_desc, placetype))
end
end
end
-- Also add generic place categories for the holonyms listed (e.g. a category like
-- [[Category:Places in Merseyside, England]]). This is handled through the special placetype "*".
extend(cats, get_placetype_cats(place_desc, "*", from_demonym))
end
if args.cat then -- not necessarily when called from [[Module:demonym]]
for _, cat in ipairs(args.cat) do
local split_cats = split_on_comma(cat)
extend(cats, split_cats)
end
end
return cats
end
-- Return the formatted category links for a list of categories, given the language object, the list of category
-- names and an optional sort key.
local function format_cats(lang, cats, sort_key)
local full_cats = {}
local langcode = lang:getFullCode()
for _, cat in ipairs(cats) do
-- 'cln' corresponds to {{cln}}, which generates lang-name categories like [[:Category:English abbreviations]]
-- (as opposed to topic categories like [[:Category:en:Abbreviations of states of the United States]]).
local cln_cat = cat:match("^cln:(.*)$")
if cln_cat then
insert(full_cats, lang:getFullName() .. " " .. cln_cat)
else
insert(full_cats, langcode .. ":" .. cat)
end
end
return require(utilities_module).format_categories(full_cats, lang, sort_key, nil,
force_cat or m_placetypes.get_force_cat())
end
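--[=[
A minimal sketch of the category-name qualification done in `format_cats`, assuming the language code and name are
passed as plain strings rather than taken from a language object. `qualify_category` is a hypothetical name used only
for illustration.

```lua
-- Turn a bare category into a langname-style or topic-style category name.
local function qualify_category(cat, langcode, langname)
	local cln_cat = cat:match("^cln:(.*)$")
	if cln_cat then
		-- Langname-style, e.g. "English abbreviations".
		return langname .. " " .. cln_cat
	end
	-- Topic-style, e.g. "en:Cities in Kansas, USA".
	return langcode .. ":" .. cat
end
```
]=]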
----------- Main entry point
--[==[
Implementation of {{tl|place}}. Meant to be callable from another module (specifically, [[Module:transclude]]). The
single argument `data` is an object with the following fields:
* `template_args`: Raw arguments specified by {{tl|place}}, possibly modified by {{tl|tcl}}.
* `from_tcl`: True if we're being invoked from {{tl|tcl}}.
* `drop_extra_info`: True if we should drop most of the "extra info" specified using extra info arguments (capital,
largest city, etc.). Usually true when invoked from {{tl|tcl}}. Note that some extra info is still displayed even
when `drop_extra_info` is set in order to establish the context (e.g. {{para|full}} and {{para|modern}}), and any
extra info overridden at the {{tl|tcl}} level is displayed regardless.
* `extra_info_overridden_set`: Set of booleans specifying, for each extra info arg, whether it was overridden at the
{{tl|tcl}} level. This means, for example, that the values are interpreted according to the language in {{para|1}}
instead of always defaulting to English, as is the case when {{tl|place}} is called directly.
* `form_of_overridden_args`: Set of objects of the form `{new_directive = ``directive``, new_value = ``value``}` for
overriding a given form-of directive (the key) with new directive ``directive`` and new unparsed value ``value``.
Both the key and the replacing directive should be canonical. ``value`` will be parsed in the same way as a regular
form-of directive except that all specified terms are interpreted in the language specified in {{para|1}}, never in
English. This is present so that {{tl|tcl}} can be used on abbreviations like [[GDR]] and [[FYROM]], whose
equivalents in a foreign language have language-specific expansions but where the rest of the call should stay the
same.
* `translation_follows`: If true, any translation specified using t= should follow the definition, after a colon,
rather than preceding, with the definition in parens.
]==]
function export.format(data)
local template_args = data.template_args
local list_param = {list = true}
local boolean_param = {type = "boolean"}
local params = {
[1] = {required = true, type = "language", default = "und"},
[2] = {required = true, list = true},
["t"] = list_param,
["tid"] = {list = true, allow_holes = true},
["cat"] = list_param,
["nocat"] = boolean_param,
["nocap"] = boolean_param,
["sort"] = true,
["pagename"] = true, -- for testing or documentation purposes
["a"] = true,
["addl"] = true,
["def"] = true,
-- params that are only used when transcluding using {{tcl}}/{{transclude}}, to transmit information to {{tcl}}.
["tcl"] = true,
["tcl_t"] = list_param,
["tcl_tid"] = list_param,
["tcl_nolb"] = true,
["tcl_nolc"] = boolean_param,
["tcl_noextratext"] = boolean_param,
}
-- add "extra info" parameters
for _, extra_arg_spec in ipairs(export.extra_info_args) do
params[extra_arg_spec.arg] = list_param
end
-- FIXME, once we've flushed out any uses, delete the following clause. That will cause def= to be ignored.
if template_args.def == "" then
error("Cannot currently pass def= as an empty parameter; use def=- if you want to suppress the definition display")
end
local args = require("Module:parameters").process(template_args, params)
if args.a then
track("a")
if args.a:find("^[Aa]n?$") or args.a:find("^[Tt]he$") then
track("a/article")
else
error("a= can only be used to specify a definite or indefinite article (and preferably use |nocap=1 instead to get the initial letter lowercase); see especially the documentation on the [[Template:place#Mixed format|mixed format]], which can be used to add arbitrary text before the placetype")
end
end
data.args = args
local overall_place_spec = parse_overall_place_spec(data)
data.overall_place_spec = overall_place_spec
return get_def(data) .. (
args.nocat and "" or format_cats(args[1], export.get_cats(args, overall_place_spec), args.sort))
end
--[==[
Actual entry point of {{tl|place}}.
]==]
function export.show(frame)
return export.format {
template_args = frame:getParent().args,
}
end
return export
7oy34n5np1xetyvtopk0kg7ib5dxhdr
ᥖᥣᥭ
0
289654
5720822
1651642
2026-04-21T08:19:40Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720822
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-swe-pro|*taːjᴬ²}}, จาก{{inh|tdd|tai-pro|*p.taːjᴬ}}; ร่วมเชื้อสายกับ{{cog|th|ตาย}}, {{cog|nod|ᨲᩣ᩠ᨿ}}, {{cog|lo|ຕາຍ}}, {{cog|khb|ᦎᦻ}}, {{cog|blt|ꪔꪱꪥ}}, {{cog|shn|တၢႆ}}, {{cog|kht|တၢဲႈ}}, {{cog|phk|တႝ}}, {{cog|aho|𑜄𑜩}}, {{cog|za|dai}}
=== การออกเสียง ===
* {{IPA|tdd|/taːj˧˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[ตาย]]
mg0eapmp57gwvgvvanek0y93vxdzuom
ᥐᥨᥝ
0
296722
5720734
1651996
2026-04-21T05:46:45Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720734
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== การออกเสียง ===
* {{IPA|tdd|/ko˧˧/}}
=== รากศัพท์ 1 ===
ร่วมเชื้อสายกับ{{cog|th|กลัว}}, {{cog|lo|ກົວ}}, {{cog|tts|กัว}}, {{cog|nod|ᨠᩖ᩠ᩅᩫ}}, {{cog|kkh|ᨠ᩠ᩅᩫ}}, {{cog|khb|ᦷᦂ}}, {{cog|shn|ၵူဝ်}}, {{cog|blt|ꪀꪺ}}, {{cog|aho|𑜀𑜥}}
==== คำกริยา ====
{{tdd-verb}}
# {{lb|tdd|สกรรม}} [[กลัว]]
=== รากศัพท์ 2 ===
{{inh+|tdd|tai-pro|*koːᴬ}}; ร่วมเชื้อสายกับ{{cog|th|กอ}}, {{cog|lo|ກໍ}}, {{cog|tts|กอ}}, {{cog|khb|ᦂᦸ}}, {{cog|shn|ၵေႃ}}, {{cog|blt|ꪀꪷ}}, {{cog|aho|𑜀𑜦𑜡}}, {{cog|za|go}}
==== คำนาม ====
{{tdd-noun}}
# [[กอ]] {{gloss|กลุ่มพืช}}
{{topics|tdd|ความกลัว}}
63chloxyp7jt444qjyukvwicsgru1od
ᥐᥝᥲ
0
300442
5720737
1652138
2026-04-21T06:21:46Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720737
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*kɤwꟲ}}, จาก{{der|tdd|ltc|-}} {{ltc-l|九}}, จาก{{der|tdd|och|-}} {{och-l|九}}, จาก{{der|tdd|sit-pro|*d/s-kəw}}; ร่วมเชื้อสายกับ{{cog|th|เก้า}}, {{cog|nod|ᨠᩮᩢ᩶ᩣ}}, {{cog|lo|ເກົ້າ}}, {{cog|khb|ᦂᧁᧉ}}, {{cog|blt|ꪹꪀ꫁ꪱ}}, {{cog|shn|ၵဝ်ႈ}}, {{cog|aho|𑜀𑜧}}, {{cog|pcc|guz}}, {{cog|za|gouj}}, {{cog|skb|กู̂}}
=== การออกเสียง ===
* {{IPA|tdd|/kaw˧˩/}}
=== เลข ===
{{tdd-num}}
# [[เก้า]]
qb1ym41jysogtb6b7nsin76ujkpzuor
ᥟᥧᥱ
0
301006
5720736
1513076
2026-04-21T06:07:28Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720736
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|shn|ဢူႇ}}, {{cog|nod|ᩋᩪ᩵}}, {{cog|tts|อู่}}, {{cog|lo|ອູ່}}, {{cog|kkh|ᩋᩪ᩵}}, {{cog|khb|ᦀᦴᧈ}}, {{cog|blt|ꪮꪴ꪿}}, {{cog|za|uq|tr=อู่}}, {{cog|zzj|wq|tr=อื่อ|t=อู่}}
=== การออกเสียง ===
* {{IPA|tdd|/ʔu˩˩/}}
=== คำนาม ===
{{tdd-noun}}
# [[เปล]], [[อู่]] {{gloss|เปล}}
p2f4ldhxubgusqgk78q4b0cpogvvc74
ᥕᥣᥒ
0
303770
5720743
4614495
2026-04-21T06:42:03Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720743
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-swe-pro|*jaːŋᴮ²}}; ร่วมเชื้อสายกับ{{cog|th|ย่าง}}, {{cog|tts|ญ่าง}} หรือ {{m|tts|ย่าง}}, {{cog|lo|ຍ່າງ}}, {{cog|nyw|ญ่าง}}, {{cog|khb|ᦍᦱᧂᧈ}}, {{cog|blt|ꪑ꪿ꪱꪉ}}, {{cog|shn|ယၢင်ႈ}}, {{cog|zzj|yangz}}, {{cog|za|yangz}}
=== การออกเสียง ===
* {{IPA|tdd|/jaːŋ˧˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[เดิน]]
76fm6aom7613neane6fjnga72s9js58
ᥕᥣᥒᥲ
0
303772
5720742
1652270
2026-04-21T06:38:53Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720742
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-swe-pro|*ˀjaːŋꟲ¹}}, จาก{{inh|tdd|tai-pro|*ˀjɯəŋꟲ}}; ร่วมเชื้อสายกับ{{cog|th|ย่าง}}, {{cog|lo|ຢ້າງ}}, {{cog|khb|ᦊᦱᧂᧉ}}, {{cog|shn|ယၢင်ႈ}}
=== การออกเสียง ===
* {{IPA|tdd|/jaːŋ˧˩/}}
=== คำกริยา ===
{{tdd-verb}}
# [[ย่าง]], [[ปิ้ง]]
8k0oqlj2sso2z5y1ftmllastjn5m5a2
ဝေင်ꩻ
0
311609
5720708
1581456
2026-04-21T01:59:10Z
OctraBot
3198
/* ภาษากะเหรี่ยงปะโอ */
5720708
wikitext
text/x-wiki
== ภาษากะเหรี่ยงปะโอ ==
=== คำนาม ===
{{head|blk|คำนาม}}
# [[นคร]], [[เมืองใหญ่]], [[เวียง]]
1pb90fnn57evf5mgezznrm9ldnv04b9
राजधानी
0
314314
5720696
1605653
2026-04-21T01:38:56Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่\}\} +|นคร}})
5720696
wikitext
text/x-wiki
== ภาษากงกัณ ==
=== รากศัพท์ ===
{{lbor|kok|sa|राजधानी}}, จาก{{com|sa|राज|धानी|gloss2=บ้าน, เรือน|nocat=1}}
=== คำนาม ===
{{kok-pos|n|razdhani|ರಾಜ್ಧಾನಿ}}
# [[เมืองหลวง]], [[เมืองเอก]]
{{topics|kok|นคร}}
== ภาษาเนปาล ==
=== รากศัพท์ ===
{{lbor|ne|sa|राजधानी}}, จาก{{com|sa|राज|धानी|gloss2=บ้าน, เรือน|nocat=1}}
=== การออกเสียง ===
* {{ne-IPA|rājdhānī}}
=== คำนาม ===
{{ne-noun}}
# [[เมืองหลวง]], [[เมืองเอก]]
{{topics|ne|นคร}}
== ภาษาฮินดี ==
=== รากศัพท์ ===
{{lbor|hi|sa|राजधानी}}, จาก{{com|sa|राज|धानी|gloss2=บ้าน, เรือน|nocat=1}}
=== การออกเสียง ===
* {{hi-IPA}}
=== คำนาม ===
{{hi-noun|g=f}}
# [[เมืองหลวง]], [[เมืองเอก]]
#: {{syn|hi|दारुलहुकूमत}}
==== การผันรูป ====
{{hi-ndecl|<F>}}
{{topics|hi|นคร}}
d8hgdimotuve1y6zlqunuslq209wfvr
หมวดหมู่:th:นครในไทย
14
316295
5720693
1909146
2026-04-21T01:34:30Z
OctraBot
3198
OctraBot ย้ายหน้า [[หมวดหมู่:th:เมืองใหญ่ในไทย]] ไปยัง [[หมวดหมู่:th:นครในไทย]] โดยไม่สร้างหน้าเปลี่ยนทางตามมา
1610099
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
แม่แบบ:langname-lite
10
324689
5720746
1626212
2026-04-21T06:48:49Z
OctraBot
3198
5720746
wikitext
text/x-wiki
<includeonly>{{#switch:{{str left|{{{1<noinclude>|en</noinclude>}}}|1}}
|a={{#switch:{{{1|}}}
|aa=Afar
|aag=Ambrak
|aak=Ankave
|aan=Anambé
|aau=Abau
|aav={{langname-lite/familycode|Austroasiatic|{{{is family|}}}|{{{allow family|}}}}}
|aav-pro=Proto-Austroasiatic
|aav-khs-pro=Proto-Khasian
|aaz=Amarasi
|ab=Abkhaz
|abc=Ambala Ayta
|abe=Abenaki
|abp=Abenlen Ayta
|abs=Ambonese Malay
|abx=Inabaknon
|ace=Acehnese
|acv=Achumawi
|acw=Hijazi Arabic
|acy=Cypriot Arabic
|acz=Acheron
|ada=Adangme
|adl=Galo
|adw=Amondawa
|ady=West Circassian
|adz=Adzera
|ae=Avestan
|aeb=Tunisian Arabic
|aek=Haeke
|aem=Arem
|aey=Amele
|af=Afrikaans
|afa-pro=Proto-Afroasiatic
|afb=Gulf Arabic
|agn=Agutaynen
|agv=Remontado Agta
|aho=Ahom
|aht=Ahtna
|aii=Assyrian Neo-Aramaic
|ain=Ainu
|aio=Aiton
|ajg=Aja (West Africa)
|aji=Ajië
|ajp=South Levantine Arabic
|ak=Akan
|akk=Akkadian
|akl=Aklanon
|akr=Araki
|alc=Kawésqar
|ale=Aleut
|ali=Amaimon
|alj=Alangan
|als={{langname-lite/etymcode|Tosk Albanian|Albanian|{{{allow etym|}}}}}
|alt=Southern Altai
|alu='Are'are
|alv-gbe-pro=Proto-Gbe
|ami=Amis
|amm=Ama
|ams=Southern Amami Ōshima
|amu=Guerrero Amuzgo
|an=Aragonese
|ane=Xârâcùù
|ang=Old English
|anm=Anāl
|anq=Jarawa
|anw=Anaang
|aoa=Angolar
|aot=Atong (India)
|apc=North Levantine Arabic
|apl=Lipan
|apt=Apatani
|apw=Western Apache
|aqd=Ampari Dogon
|aqg=Arigidi
|ar=Arabic
|arc=Aramaic
|ark=Arikapú
|arn=Mapudungun
|arx=Aruá
|ary=Moroccan Arabic
|arz=Egyptian Arabic
|as=Assamese
|asb=Assiniboine
|ask=Ashkun
|ast=Asturian
|atc=Atsahuaca
|atd=Ata Manobo
|ath-pro=Proto-Athabaskan
|att=Pamplona Atta
|atz=Arta
|aui=Anuki
|avu=Avokaya
|awb=Awa (New Guinea)
|awg=Anguthimri
|awt=Araweté
|awx=Awara
|ay=Aymara
|ayl=Libyan Arabic
|az=Azerbaijani
|azc-nah-pro=Proto-Nahuan
|azc-pro=Proto-Uto-Aztecan
|azd=Eastern Durango Nahuatl
|azg=San Pedro Amuzgos Amuzgo
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|b={{#switch:{{{1|}}}
|ba=Bashkir
|ban=Balinese
|bar=Bavarian
|bat={{langname-lite/familycode|Baltic|{{{is family|}}}|{{{allow family|}}}}}
|bat-pro={{langname-lite/etymcode|Proto-Baltic|Proto-Balto-Slavic|{{{allow etym|}}}}}
|bay=Batuley
|bbb=Barai
|bbd=Bau
|bbn=Uneapa
|bbr=Girawa
|bca=Central Bai
|bch=Bariai
|bcl=Central Bikol
|bdq=Bahnar
|be=Belarusian
|bej=Beja
|bem=Bemba
|ber-pro=Proto-Berber
|beu=Blagar
|bew=Betawi
|bew-kot={{langname-lite/etymcode|Betawi Kota|Betawi|{{{allow etym|}}}}}
|bfa=Bari
|bfs=Southern Bai
|bft=Balti
|bg=Bulgarian
|bgs=Tagabawa
|bgt=Bughotu
|bhg=Binandere
|bi=Bislama
|bji=Burji
|bjn=Banjarese
|bkd=Binukid
|bkl=Berik
|bks=Masbate Sorsogon
|bla=Blackfoot
|ble=Balanta-Kentohe
|bll=Biloxi
|bln=Southern Catanduanes Bikol
|blr=Blang
|blt=Tai Dam
|blx=Mag-Indi Ayta
|bm=Bambara
|bmh=Kein
|bmi=Bagirmi
|bmr=Muinane
|bmu=Somba-Siawari
|bmx=Baimak
|bn=Bengali
|bnn=Bunun
|bno=Asi
|bnq=Bantik
|bnt-lal=Lala (South Africa)
|bnt-phu=Phuthi
|bnt-pro=Proto-Bantu
|bo=Tibetan
|bor=Borôro
|bpg=Bonggo
|bpi=Bagupi
|bps=Sarangani Blaan
|bqb=Bagusa
|bqc=Boko
|bqp=Busa
|br=Breton
|brg=Baure
|brh=Brahui
|brx=Bodo (India)
|bsa=Abinomn
|bsh=Kamkata-viri
|bsk=Burushaski
|bsq=Bassa
|btn=Ratagnon
|bto=Rinconada Bikol
|btw=Butuanon
|bug=Buginese
|byn=Blin
|byt=Berti
|bzj=Belizean Creole
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|c={{#switch:{{{1|}}}
|ca=Catalan
|caa=Ch'orti'
|cab=Garifuna
|cal=Carolinian
|car=Kari'na
|cav=Cavineña
|cba-nut=Nutabe
|cbi=Chachi
|cbk=Chavacano
|ccs-pro=Proto-Kartvelian
|cdc-cbm-pro=Proto-Central Chadic
|cdc-pro=Proto-Chadic
|cdm=Chepang
|cdo=Eastern Min
|ce=Chechen
|ceb=Cebuano
|cel={{langname-lite/familycode|Celtic|{{{is family|}}}|{{{allow family|}}}}}
|cel-bry-pro=Proto-Brythonic
|cel-gau=Gaulish
|cel-pro=Proto-Celtic
|cgc=Kagayanen
|ch=Chamorro
|chb=Chibcha
|chg=Chagatai
|chk=Chuukese
|chl=Cahuilla
|chn=Chinook Jargon
|cho=Choctaw
|chp=Chipewyan
|chy=Cheyenne
|cia=Cia-Cia
|cic=Chickasaw
|cim=Cimbrian
|cja=Western Cham
|cjm=Eastern Cham
|cjo=Pajonal Ashéninka
|cjs=Shor
|ckb=Central Kurdish
|ckv=Kavalan
|clc=Chilcotin
|clw=Chulym
|cmc-pro=Proto-Chamic
|cmn=Mandarin
|cmn-ear={{langname-lite/etymcode|Early Mandarin|Mandarin|{{{allow etym|}}}}}
|cng=Northern Qiang
|cnk=Khumi Chin
|cnx=Middle Cornish
|co=Corsican
|cof=Tsafiki
|com=Comanche
|con=Cofán
|coo=Comox
|cps=Capiznon
|crg=Michif
|crh=Crimean Tatar
|cro=Crow
|crs=Seychellois Creole
|crw=Chrau
|cs=Czech
|csb=Kashubian
|ctd=Tedim Chin
|cts=Northern Catanduanes Bikol
|cu=Old Church Slavonic
|cus-pro=Proto-Cushitic
|cv=Chuvash
|cy=Welsh
|cyo=Cuyunon
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|d={{#switch:{{{1|}}}
|da=Danish
|dag=Dagbani
|dak=Dakota
|dcr=Negerhollands
|de=German
|dgc=Casiguran Dumagat Agta
|dgr=Dogrib
|dhv=Drehu
|dif=Dieri
|din=Dinka
|dis=Dimasa
|dje=Zarma
|djk=Aukan
|dlm=Dalmatian
|dmn-dam=Dama (Sierra Leone)
|dng=Dungan
|dni=Lower Grand Valley Dani
|doz=Dorze
|dra-okn=Old Kannada
|dsb=Lower Sorbian
|dtp=Central Dusun
|duf=Dumbea
|dum=Middle Dutch
|duo=Dupaningan Agta
|duu=Drung
|dux=Duun
|dv=Dhivehi
|dz=Dzongkha
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|E={{#switch:{{{1|}}}
|EL.={{langname-lite/etymcode|Ecclesiastical Latin|Latin|{{{allow etym|}}}}}
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|e={{#switch:{{{1<noinclude>|en</noinclude>}}}
|ebk=Eastern Bontoc
|ee=Ewe
|eee=E
|efi=Efik
|egl=Emilian
|egy=Egyptian
|el=Greek
|emb=Embaloh
|emi=Mussau-Emira
|en=English
|enm=Middle English
|eo=Esperanto
|es=Spanish
|esx-esk-pro=Proto-Eskimo
|esx-inu-pro=Proto-Inuit
|et=Estonian
|ett=Etruscan
|eu=Basque
|euq-pro=Proto-Basque
|evn=Evenki
|ext=Extremaduran
|eya=Eyak
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|f={{#switch:{{{1|}}}
|fa=Persian
|fab=Annobonese
|fad=Wagi
|fax=Fala
|fbl=West Miraya Bikol
|ff=Fula
|fi=Finnish
|fit=Meänkieli
|fiu-pro={{langname-lite/etymcode|Proto-Finno-Ugric|Proto-Uralic|{{{allow etym|}}}}}
|fj=Fijian
|fkv=Kven
|fmp=Fe'fe'
|fng=Fanagalo
|fo=Faroese
|foi=Foi
|fon=Fon
|fos=Siraya
|fr=French
|fr-CA={{langname-lite/etymcode|Canadian French|French|{{{allow etym|}}}}}
|frd=Fordata
|frk={{langname-lite/etymcode|Frankish|Proto-West Germanic|{{{allow etym|}}}}}
|frm=Middle French
|fro=Old French
|fro-nor={{langname-lite/etymcode|Old Northern French|Old French|{{{allow etym|}}}}}
|frp=Franco-Provençal
|frr=North Frisian
|fud=East Futuna
|fur=Friulian
|fut=Futuna-Aniwa
|fwa=Fwâi
|fy=West Frisian
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|g={{#switch:{{{1|}}}
|ga=Irish
|gaa=Ga
|gad=Gaddang
|gag=Gagauz
|gah=Alekano
|gal=Galoli
|gap=Gal
|gaw=Nobonob
|gbf=Gaikundi
|gce=Galice
|gcf=Antillean Creole
|gd=Scottish Gaelic
|gem={{langname-lite/familycode|Germanic|{{{is family|}}}|{{{allow family|}}}}}
|gem-pro=Proto-Germanic
|ges=Geser-Gorom
|gil=Gilbertese
|gim=Gimi (Papuan)
|gkm={{langname-lite/etymcode|Byzantine Greek|Ancient Greek|{{{allow etym|}}}}}
|gl=Galician
|gmh=Middle High German
|gml=Middle Low German
|gmq={{langname-lite/familycode|North Germanic|{{{is family|}}}|{{{allow family|}}}}}
|gmq-mno=Middle Norwegian
|gmq-oda=Old Danish
|gmq-osw=Old Swedish
|gmq-pro=Proto-Norse
|gmu=Gumalu
|gmw-cfr=Central Franconian
|gmw-ecg=East Central German
|gmw-jdt=Jersey Dutch
|gmw-pro=Proto-West Germanic
|gmw-stm=Sathmar Swabian
|gmy=Mycenaean Greek
|goh=Old High German
|gor=Gorontalo
|got=Gothic
|grc=Ancient Greek
|grh=Gbiri-Niragu
|grk-mar=Mariupol Greek
|grk-pro=Proto-Hellenic
|grt=Garo
|gsw=Alemannic German
|gtu=Aghu Tharrnggala
|gu=Gujarati
|gug=Paraguayan Guarani
|gul=Gullah
|gun=Mbya Guarani
|gur=Farefare
|guw=Gun
|gv=Manx
|gwi=Gwich'in
|gyb=Garus
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|h={{#switch:{{{1|}}}
|ha=Hausa
|haa=Hän
|hai=Haida
|hak=Hakka
|hal=Halang
|haw=Hawaiian
|hch=Huichol
|hdy=Hadiyya
|he=Hebrew
|hi=Hindi
|hid=Hidatsa
|hil=Hiligaynon
|hit=Hittite
|hmn-pro=Proto-Hmongic
|hmx-pro=Proto-Hmong-Mien
|ho=Hiri Motu
|hop=Hopi
|hro=Haroi
|hrx=Hunsrik
|hsb=Upper Sorbian
|ht=Haitian Creole
|hts=Hadza
|hu=Hungarian
|hup=Hupa
|huq=Tsat
|hur=Halkomelem
|huu=Murui Huitoto
|hvk=Haveke
|hwc=Hawaiian Creole
|hy=Armenian
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|i={{#switch:{{{1|}}}
|ia=Interlingua
|iba=Iban
|ibg=Ibanag
|ibl=Ibaloi
|id=Indonesian
|idb=Indo-Portuguese
|idi=Idi
|ie=Interlingue
|ifb=Batad Ifugao
|ifu=Mayoyao Ifugao
|ig=Igbo
|igl=Igala
|igo=Isebe
|ii=Nuosu
|iir-pro=Proto-Indo-Iranian
|ijj=Ede Ije
|ik=Inupiaq
|ilk=Ilongot
|ilo=Ilocano
|imn=Imonda
|inc-ash=Ashokan Prakrit
|inc-kho=Kholosi
|inc-oas=Early Assamese
|pra=Prakrit
|ine-bsl-pro=Proto-Balto-Slavic
|ine-pro=Proto-Indo-European
|ine-toc-pro=Proto-Tocharian
|ing=Deg Xinag
|inn=Isinai
|io=Ido
|iow=Chiwere
|ira-pro=Proto-Iranian
|iry=Iraya
|is=Icelandic
|isd=Isnag
|ish=Esan
|ist=Istriot
|it=Italian
|itc-ola={{langname-lite/etymcode|Old Latin|Latin|{{{allow etym|}}}}}
|itc-pro=Proto-Italic
|itl=Itelmen
|its=Itsekiri
|itv=Itawit
|iu=Inuktitut
|ium=Iu Mien
|ivb=Ibatan
|ivv=Ivatan
|izh=Ingrian
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|j={{#switch:{{{1|}}}
|ja=Japanese
|jam=Jamaican Creole
|jaz=Jawe
|jct=Krymchak
|jje=Jeju
|jkr=Koro (India)
|jpx-pro=Proto-Japonic
|jpx-ryu-pro=Proto-Ryukyuan
|jra=Jarai
|juc=Jurchen
|juh=Hone
|jv=Javanese
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|k={{#switch:{{{1|}}}
|ka=Georgian
|kaa=Karakalpak
|kab=Kabyle
|kac=Jingpho
|kak=Kayapa Kallahan
|kam=Kamba
|kar-pro=Proto-Karen
|kaw=Old Javanese
|kay=Kamayurá
|kbd=East Circassian
|kbk=Grass Koiari
|kbq=Kamano
|kcg=Tyap
|kdr=Karaim
|kea=Kabuverdianu
|kek=Q'eqchi
|ket=Ket
|kgp=Kaingang
|kha=Khasi
|khb=Lü
|khi-kun=ǃKung
|khl=Lusi
|kht=Khamti
|ki=Kikuyu
|kij=Kilivila
|kim=Tofa
|kiy=Kirikiri
|kjh=Khakas
|kju=Kashaya
|kk=Kazakh
|kky=Guugu Yimidhirr
|kl=Greenlandic
|klg=Tagakaulu Kalagan
|klq=Rumu
|kls=Kalasha
|klu=Klao
|klv=Maskelynes
|klw=Lindu
|km=Khmer
|kmb=Kimbundu
|kmc=Southern Kam
|kmf=Kare (New Guinea)
|kmk=Limos Kalinga
|kmr=Northern Kurdish
|knb=Lubuagan Kalinga
|kne=Kankanaey
|knf=Mankanya
|ko=Korean
|kok=Konkani
|kos=Kosraean
|koy=Koyukon
|kpg=Kapingamarangi
|kpm=Koho
|kpv=Komi-Zyrian
|kpw=Kobon
|kpx=Mountain Koiari
|kqf=Kakabai
|kqi=Koitabu
|kr=Kanuri
|kri=Krio
|krj=Kinaray-a
|krl=Karelian
|ks=Kashmiri
|ksd=Tolai
|ksi=Krisa
|ksk=Kansa
|ksw=S'gaw Karen
|ksx=Kedang
|ktb=Kambaata
|ktz=Juǀ'hoan
|kud=Auhelawa
|kum=Kumyk
|kus=Kusaal
|kuu=Upper Kuskokwim
|kw=Cornish
|kwa=Dâw
|kwe=Kwerba
|kwk=Kwak'wala
|kxd=Brunei Malay
|kxo=Kanoé
|kxs=Kangjia
|ky=Kyrgyz
|kzg=Kikai
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|L={{#switch:{{{1|}}}
|LL.={{langname-lite/etymcode|Late Latin|Latin|{{{allow etym|}}}}}
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|l={{#switch:{{{1|}}}
|la=Latin
|la-ecc={{langname-lite/etymcode|Ecclesiastical Latin|Latin|{{{allow etym|}}}}}
|la-lat={{langname-lite/etymcode|Late Latin|Latin|{{{allow etym|}}}}}
|la-med={{langname-lite/etymcode|Medieval Latin|Latin|{{{allow etym|}}}}}
|la-vul={{langname-lite/etymcode|Vulgar Latin|Latin|{{{allow etym|}}}}}
|lac=Lacandon
|lad=Ladino
|lay=Lama Bai
|lb=Luxembourgish
|lbk=Central Bontoc
|lbl=Libon Bikol
|lbn=Lamet
|lew=Ledo Kaili
|lg=Luganda
|lhu=Lahu
|li=Limburgish
|lic=Hlai
|lif=Limbu
|lij=Ligurian
|liv=Livonian
|lkt=Lakota
|lld=Ladin
|llu=Lau
|lml=Raga
|lmo=Lombard
|lmy=Laboya
|ln=Lingala
|lng={{langname-lite/etymcode|Lombardic|Old High German|{{{allow etym|}}}}}
|lo=Lao
|loc=Inonhan
|loj=Lou
|los=Loniu
|lou=Louisiana Creole
|lsi=Lashi
|lt=Lithuanian
|ltc=Middle Chinese
|ltg=Latgalian
|lud=Ludian
|luo=Luo
|lus=Mizo
|lut=Lushootseed
|lv=Latvian
|lwh=White Lachi
|lzz=Laz
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|M={{#switch:{{{1|}}}
|ML.={{langname-lite/etymcode|Medieval Latin|Latin|{{{allow etym|}}}}}
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|m={{#switch:{{{1|}}}
|mad=Madurese
|mag=Magahi
|mak=Makasar
|mam=Man
|map={{langname-lite/familycode|Austronesian|{{{is family|}}}|{{{allow family|}}}}}
|map-ata-pro=Proto-Atayalic
|map-pro=Proto-Austronesian
|maw=Mampruli
|maz=Central Mazahua
|mba=Higaonon
|mbb=Western Bukidnon Manobo
|mbd=Dibabawon Manobo
|mbi=Ilianen Manobo
|mbj=Nadëb
|mch=Ye'kwana
|mcz=Mawan
|mdf=Moksha
|mdh=Maguindanao
|mee=Mengen
|mel=Central Melanau
|men=Mende (Sierra Leone)
|meo=Kedah Malay
|mfe=Mauritian Creole
|mfh=Matal
|mg=Malagasy
|mga=Middle Irish
|mh=Marshallese
|mhn=Mòcheno
|mhr=Eastern Mari
|mhx=Lhao Vo
|mi=Māori
|mih=Chayuco Mixtec
|min=Minangkabau
|miq=Miskito
|mis-phi=Philistine
|mk=Macedonian
|mkh-ban-pro=Proto-Bahnaric
|mkh-pro=Proto-Mon-Khmer
|mkh-vie-pro=Proto-Vietic
|mkj=Mokilese
|mkt=Vamale
|ml=Malayalam
|mlp=Bargam
|mlu=To'abaita
|mmg=North Ambrym
|mmn=Mamanwa
|mmr=Western Xiangxi Miao
|mn=Mongolian
|mnc=Manchu
|mnd=Mondé
|mnk=Mandinka
|mnp=Northern Min
|mnw=Mon
|moa=Mwan
|mog=Mongondow
|moh=Mohawk
|mop=Mopan Maya
|mos=Moore
|mpg=Marba
|mps=Dadibi
|mqe=Matepi
|mpj=Martu Wangka
|mqs=West Makian
|mqv=Mosimo
|mqw=Murupi
|mr=Marathi
|mrc=Maricopa
|mrk=Hmwaveke
|mro=Mru
|mrw=Maranao
|ms=Malay
|ms-cla={{langname-lite/etymcode|Classical Malay|Malay|{{{allow etym|}}}}}
|ms-old={{langname-lite/etymcode|Old Malay|Malay|{{{allow etym|}}}}}
|msb=Masbatenyo
|msk=Mansaka
|msm=Agusan Manobo
|msn=Vurës
|msq=Caac
|mt=Maltese
|mtc=Munit
|mte=Alu
|mtq=Muong
|mtv=Asaro'o
|mul=Translingual
|muz=Mursi
|mva=Manam
|mvd=Mamboru
|mvi=Miyako
|mwl=Mirandese
|mww=White Hmong
|my=Burmese
|myv=Erzya
|mzp=Movima
|mzw=Deg
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|n={{#switch:{{{1|}}}
|na=Nauruan
|nag=Naga Pidgin
|nah=Nahuatl
|nai-tap=Tapachultec
|nak=Nakanai
|nan=Min Nan
|nan-hbl=Hokkien
|nap=Neapolitan
|naz=Coatepec Nahuatl
|nb=Norwegian Bokmål
|nbk=Nake
|nce=Yale
|ncf=Notsi
|ncg=Nisga'a
|nch=Central Huasteca Nahuatl
|nci=Classical Nahuatl
|ncj=Northern Puebla Nahuatl
|nd=Northern Ndebele
|nds=Low German
|nds-de=German Low German
|nds-nl=Dutch Low Saxon
|ne=Nepali
|nec=Nedebang
|nef=Nefamese
|nem=Nemi
|nev=Nyaheun
|nfl=Äiwoo
|ngf-pro=Proto-Trans-New Guinea
|nhe=Eastern Huasteca Nahuatl
|nhn=Central Nahuatl
|nht=Ometepec Nahuatl
|nhx=Mecayapan Nahuatl
|nia=Nias
|nic-pro=Proto-Niger-Congo
|nio=Nganasan
|niu=Niuean
|niv=Nivkh
|niz=Ningil
|njm=Angami
|njo=Ao
|njz=Nyishi
|nkp=Niuatoputapu
|nkr=Nukuoro
|nl=Dutch
|nlc=Nalca
|nlg=Gela
|nmb=Big Nambas
|nn=Norwegian Nynorsk
|no=Norwegian
|nod=Northern Thai
|nog=Nogai
|non=Old Norse
|non-oen={{langname-lite/etymcode|Old East Norse|Old Norse|{{{allow etym|}}}}}
|nr=Southern Ndebele
|nrf=Norman
|nrl=Ngarluma
|nrn=Norn
|nso=Northern Sotho
|ntp=Northern Tepehuan
|nua=Yuanga
|nuk=Nootka
|nup=Nupe
|nus=Nuer
|nut=Nùng
|nv=Navajo
|nxq=Naxi
|ny=Chichewa
|nys=Nyunga
|nza=Tigon Mbembe
|nzd=Nzadi
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|o={{#switch:{{{1|}}}
|obr=Old Burmese
|obt=Old Breton
|oc=Occitan
|och=Old Chinese
|oco=Old Cornish
|odt=Old Dutch
|ofs=Old Frisian
|oge=Old Georgian
|ohu=Old Hungarian
|oj=Ojibwe
|ojp=Old Japanese
|oka=Okanagan
|okm=Middle Korean
|okn=Okinoerabu
|oko=Old Korean
|okz=Old Khmer
|okz-ang={{langname-lite/etymcode|Angkorian Old Khmer|Old Khmer|{{{allow etym|}}}}}
|olo=Livvi
|om=Oromo
|oma=Omaha-Ponca
|omq-otp-pro=Proto-Oto-Pamean
|omq-pro=Proto-Oto-Manguean
|omx=Old Mon
|ono=Onondaga
|ood=O'odham
|oon=Önge
|opo=Opao
|oro=Orokolo
|orv=Old East Slavic
|os=Ossetian
|osc=Oscan
|osp=Old Spanish
|osx=Old Saxon
|ota=Ottoman Turkish
|ote=Mezquital Otomi
|otk=Old Turkic
|oto-otm-pro=Proto-Otomi
|oto-pro=Proto-Otomian
|otw=Ottawa
|oui=Old Uyghur
|ovd=Elfdalian
|owl=Old Welsh
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|p={{#switch:{{{1|}}}
|pa=Punjabi
|paa-nha-pro=Proto-North Halmahera
|pac=Pacoh
|pag=Pangasinan
|pal=Middle Persian
|pam=Kapampangan
|pap=Papiamentu
|pau=Palauan
|pbv=Pnar
|pcc=Bouyei
|pcm=Nigerian Pidgin
|pdc=Pennsylvania German
|pdt=Plautdietsch
|pdu=Kayan
|peh=Bonan
|peo=Old Persian
|phi-pro=Proto-Philippine
|phk=Phake
|phl=Palula
|phn=Phoenician
|pi=Pali
|pis=Pijin
|piz=Pije
|pjt=Pitjantjatjara
|pkc=Baekje
|pkp=Pukapukan
|pl=Polish
|ple=Palu'e
|plg=Pilagá
|pln=Palenquero
|plu=Palikur
|plv=Southwest Palawano
|plw=Brooke's Point Palawano
|ply=Bolyu
|pml=Sabir
|pms=Piedmontese
|pnr=Panim
|pns=Ponosakan
|pnw=Panyjima
|pon=Pohnpeian
|poo=Central Pomo
|pov=Guinea-Bissau Creole
|pox=Polabian
|poz-cet-pro=Proto-Central-Eastern Malayo-Polynesian
|poz-mcm-pro=Proto-Malayo-Chamic
|poz-mly-pro=Proto-Malayic
|poz-msa-pro=Proto-Malayo-Sumbawan
|poz-oce-pro=Proto-Oceanic
|poz-pep-pro=Proto-Eastern Polynesian
|poz-pnp-pro=Proto-Nuclear Polynesian
|poz-pol={{langname-lite/familycode|Polynesian|{{{is family|}}}|{{{allow family|}}}}}
|poz-pol-pro=Proto-Polynesian
|poz-pro=Proto-Malayo-Polynesian
|ppk=Uma
|ppl=Pipil
|ppu=Papora
|pqe-pro=Proto-Eastern Malayo-Polynesian
|prc=Parachi
|prg=Old Prussian
|pri=Paicî
|pro=Old Occitan
|ps=Pashto
|pt=Portuguese
|pwn=Paiwan
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|q={{#switch:{{{1|}}}
|qfa-kms-pro=Proto-Kam-Sui
|qfa-lic-pro=Proto-Hlai
|qfa-sub={{langname-lite/familycode|substrate|{{{is family|}}}|{{{allow family|}}}}}
|qfa-tak={{langname-lite/familycode|Kra-Dai|{{{is family|}}}|{{{allow family|}}}}}
|qfa-yen-pro=Proto-Yeniseian
|qsb-ibe={{langname-lite/etymcode|Paleo-Hispanic|Undetermined|{{{allow etym|}}}}}
|qu=Quechua
|qua=Quapaw
|quc=K'iche'
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|r={{#switch:{{{1|}}}
|rad=Rade
|rah=Rabha
|ran=Riantana
|rap=Rapa Nui
|raw=Rawang
|ray=Rapa
|rbl=East Miraya Bikol
|rel=Rendille
|rgn=Romagnol
|rhg=Rohingya
|ril=Riang
|rki=Rakhine
|rm=Romansh
|rme=Angloromani
|rmf=Kalo Finnish Romani
|rmg=Traveller Norwegian
|rmn=Balkan Romani
|rmo=Sinte Romani
|rmp=Rempi
|rmq=Caló
|rmt=Domari
|rmw=Welsh Romani
|rng=Ronga
|ro=Romanian
|roa={{langname-lite/familycode|Romance|{{{is family|}}}|{{{allow family|}}}}}
|roa-brg=Bourguignon
|roa-fcm=Franc-Comtois
|roa-gal=Gallo
|roa-leo=Leonese
|roa-oca=Old Catalan
|roa-ole=Old Leonese
|roa-opt=Old Galician-Portuguese
|roa-tar=Tarantino
|rog=Northern Roglai
|rol=Romblomanon
|rom=Romani
|roo=Rotokas
|rop=Australian Kriol
|rpt=Rapting
|rth=Ratahan
|rtm=Rotuman
|ru=Russian
|rue=Carpathian Rusyn
|rug=Roviana
|ruo=Istro-Romanian
|rup=Aromanian
|ruq=Megleno-Romanian
|rw=Rwanda-Rundi
|rwo=Rawa
|ryn=Northern Amami Ōshima
|rys=Yaeyama
|ryu=Okinawan
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|s={{#switch:{{{1|}}}
|sa=Sanskrit
|sah=Yakut
|sai-ayo=Ayomán
|sai-men=Menien
|sai-nje-pro=Proto-Northern Jê
|sai-tap=Tapayuna
|sat=Santali
|sav=Saafi-Saafi
|sbf=Shabo
|sbl=Botolan Sambal
|sc=Sardinian
|sce=Dongxiang
|scn=Sicilian
|sco=Scots
|sd=Sindhi
|sdc=Sassarese
|sdg=Savi
|sdn=Gallurese
|se=Northern Sami
|sea=Semai
|sed=Sedang
|sei=Seri
|sel=Selkup
|sem-pro=Proto-Semitic
|ses=Koyraboro Senni
|sg=Sango
|sga=Old Irish
|sgb=Mag-Anchi Ayta
|sgd=Surigaonon
|sgs=Samogitian
|sh=Serbo-Croatian
|shh=Shoshone
|shk=Shilluk
|shn=Shan
|si=Sinhalese
|sid=Sidamo
|sio-pro=Proto-Siouan
|sip=Sikkimese
|sit={{langname-lite/familycode|Sino-Tibetan|{{{is family|}}}|{{{allow family|}}}}}
|sit-jap=Japhug
|sit-pro=Proto-Sino-Tibetan
|sit-sit=Situ
|sit-tan-pro=Proto-Tani
|sjd=Kildin Sami
|sje=Pite Sami
|sjm=Mapun
|sjt=Ter Sami
|sju=Ume Sami
|sk=Slovak
|skb=Saek
|sky=Sikaiana
|sl=Slovene
|sla={{langname-lite/familycode|Slavic|{{{is family|}}}|{{{allow family|}}}}}
|sla-pro=Proto-Slavic
|slm=Pangutaran Sama
|slr=Salar
|slu=Selaru
|sm=Samoan
|sma=Southern Sami
|smi-pro=Proto-Samic
|smj=Lule Sami
|smk=Bolinao
|smn=Inari Sami
|smr=Simeulue
|sms=Skolt Sami
|sn=Shona
|snf=Noon
|snp=Siane
|snr=Sihan
|snu=Senggi
|so=Somali
|sog=Sogdian
|sou=Southern Thai
|sq=Albanian
|sqj-pro=Proto-Albanian
|squ=Squamish
|sra=Saruga
|srn=Sranan Tongo
|srq=Sirionó
|srr=Serer
|srv=Waray Sorsogon
|ss=Swazi
|ssf=Thao
|ssl=Western Sisaala
|ssq=So'a
|ssy=Saho
|stf=Seta
|stp=Southeastern Tepehuan
|stq=Saterland Frisian
|str=Saanich
|stw=Satawalese
|suq=Suri
|sux=Sumerian
|sv=Swedish
|sw=Swahili
|swb=Maore Comorian
|swg=Swabian
|swi=Sui
|swm=Samosa
|sxn=Sangir
|sxw=Saxwe Gbe
|syc=Classical Syriac
|szl=Silesian
|szy=Sakizaya
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|t={{#switch:{{{1|}}}
|ta=Tamil
|taa=Lower Tanana
|tad=Tause
|tai={{langname-lite/familycode|Tai|{{{is family|}}}|{{{allow family|}}}}}
|tai-pro=Proto-Tai
|tao=Yami
|tay=Atayal
|tbc=Takia
|tbl=Tboli
|tbp=Taworta
|tbq={{langname-lite/familycode|Tibeto-Burman|{{{is family|}}}|{{{allow family|}}}}}
|tbq-bdg-pro=Proto-Bodo-Garo
|tbq-blg=Bailang
|tbq-kuk-pro=Proto-Kuki-Chin
|tbq-lob-pro=Proto-Lolo-Burmese
|tbq-lol-pro=Proto-Loloish
|tbw=Aborlan Tagbanwa
|tby=Tabaru
|tcb=Tanacross
|tcs=Torres Strait Creole
|tdd=Tai Nüa
|tdy=Tadyawan
|te=Telugu
|tet=Tetum
|tew=Tewa
|tfn=Dena'ina
|tft=Ternate
|tg=Tajik
|th=Thai
|ti=Tigrinya
|tim=Timbe
|tio=Teop
|tiy=Tiruray
|tk=Turkmen
|tkl=Tokelauan
|tkw=Teanu
|tl=Tagalog
|tli=Tlingit
|tmh=Tuareg
|tmu=Iau
|tnq=Taíno
|to=Tongan
|tpf=Tarpia
|tpi=Tok Pisin
|tpn=Tupinambá
|tpw=Old Tupi
|tqo=Toaripi
|tqw=Tonkawa
|tr=Turkish
|trk={{langname-lite/familycode|Turkic|{{{is family|}}}|{{{allow family|}}}}}
|trk-cmn-pro={{langname-lite/etymcode|Proto-Common Turkic|Proto-Turkic|{{{allow etym|}}}}}
|trk-oat=Old Anatolian Turkish
|trk-pro=Proto-Turkic
|trv=Taroko
|ts=Tsonga
|tsg=Tausug
|tt=Tatar
|tts=Isan
|ttt=Tat
|tum=Tumbuka
|tuw-pro=Proto-Tungusic
|tuw-sol=Solon
|tvl=Tuvaluan
|tvn=Tavoyan
|tvo=Tidore
|txb=Tocharian B
|txg=Tangut
|ty=Tahitian
|typ=Kuku-Thaypan
|tys=Sapa
|tyv=Tuvan
|tyz=Tày
|tzj=Tz'utujil
|tzm=Central Atlas Tamazight
|tzo=Tzotzil
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|u={{#switch:{{{1|}}}
|uar=Tairuma
|ubl=Buhi'non Bikol
|uby=Ubykh
|ude=Udihe
|udi=Udi
|ug=Uyghur
|ugo=Gong
|uk=Ukrainian
|ulb=Olukumi
|ulk=Meriam
|umo=Umotína
|umu=Munsee
|und=Undetermined
|unm=Unami
|ur=Urdu
|urj-fin-pro=Proto-Finnic
|urj-pro=Proto-Uralic
|urk=Urak Lawoi'
|ush=Ushojo
|utu=Utu
|uur=Ura (Vanuatu)
|uz=Uzbek
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|V={{#switch:{{{1|}}}
|VL.={{langname-lite/etymcode|Vulgar Latin|Latin|{{{allow etym|}}}}}
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|v={{#switch:{{{1|}}}
|vai=Vai
|vam=Vanimo
|ve=Venda
|vec=Venetan
|vep=Veps
|vi=Vietnamese
|vil=Vilela
|vma=Martuthunira
|vo=Volapük
|vot=Votic
|vro=Võro
|vsn={{langname-lite/etymcode|Vedic Sanskrit|Sanskrit|{{{allow etym|}}}}}
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|w={{#switch:{{{1|}}}
|wa=Walloon
|wam=Massachusett
|war=Waray-Waray
|wba=Warao
|wbl=Wakhi
|wes=Cameroon Pidgin
|wim=Wik-Mungkan
|win=Winnebago
|wiv=Muduapa
|wlm=Middle Welsh
|wmc=Wamas
|wmw=Mwani
|wno=Wano
|wo=Wolof
|woe=Woleaian
|wrh=Wiradjuri
|wrs=Waris
|wsk=Waskia
|wuh=Wutunhua
|wul=Silimo
|wuu=Wu
|wya=Wyandot
|wym=Vilamovian
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|x={{#switch:{{{1|}}}
|xaa=Andalusian Arabic
|xag=Aghwan
|xbm=Middle Breton
|xbr=Kambera
|xcl=Old Armenian
|xdc=Dacian
|xeu=Keoru-Ahia
|xgn={{langname-lite/familycode|Mongolic|{{{is family|}}}|{{{allow family|}}}}}
|xgn-pro=Proto-Mongolic
|xh=Xhosa
|xib=Iberian
|xil=Illyrian
|xnb=Kanakanabu
|xno={{langname-lite/etymcode|Anglo-Norman|Old French|{{{allow etym|}}}}}
|xok=Xokleng
|xpm=Pumpokol
|xpo=Pochutec
|xpq=Mohegan-Pequot
|xpr=Parthian
|xqa=Karakhanid
|xrn=Arin
|xsb=Sambali
|xsl=South Slavey
|xsm=Kasem
|xsp=Silopi
|xss=Assan
|xsv=Sudovian
|xto=Tocharian A
|xug=Kunigami
|xve=Venetic
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|y={{#switch:{{{1|}}}
|yag=Yámana
|yai=Yaghnobi
|ybe=Western Yugur
|ycl=Lolopo
|ydk=Yoidik
|yee=Yimas
|yha=Baha
|yi=Yiddish
|yka=Yakan
|yle=Yele
|yll=Yil
|yly=Nyelâyu
|yo=Yoruba
|yog=Yogad
|yoi=Yonaguni
|yol=Yola
|yox=Yoron
|yrk-tun=Tundra Nenets
|yrl=Nheengatu
|yua=Yucatec Maya
|yue=Cantonese
|yuf=Havasupai-Walapai-Yavapai
|yuq=Yuqui
|yuy=East Yugur
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|z={{#switch:{{{1|}}}
|za=Zhuang
|zag=Zaghawa
|zai=Isthmus Zapotec
|zav=Yatzachi Zapotec
|zca=Coatecas Altas Zapotec
|zea=Zealandic
|zh=Chinese
|zhn=Nong Zhuang
|zia=Zia
|zkg=Goguryeo
|zko=Kott
|zkt=Khitan
|zle-mbe={{langname-lite/etymcode|Middle Belarusian|Old Ruthenian|{{{allow etym|}}}}}
|zle-ono=Old Novgorodian
|zle-ort=Old Ruthenian
|zls={{langname-lite/familycode|South Slavic|{{{is family|}}}|{{{allow family|}}}}}
|zlw-ocs=Old Czech
|zlw-opl=Old Polish
|zlw-slv=Slovincian
|zmo=Molo
|zne=Zande
|zom=Zou
|zpq=Zoogocho Zapotec
|ztn=Santa Catarina Albarradas Zapotec
|ztt=Tejalapan Zapotec
|zu=Zulu
|zza=Zazaki
|#default={{langname-lite/unknowncode|{{{1}}}}}}}
|#default={{langname-lite/unknowncode|{{{1}}}}}
}}</includeonly><noinclude>{{documentation}}[[Category:Lua-free templates]]</noinclude>
938bfiv6mjyq9pw9vtx23b7rrdo8uee
ᨷᩢ
0
331979
5720819
5713608
2026-04-21T07:57:45Z
Ai Ku Karng
17824
/* ภาษาเขิน */
5720819
wikitext
text/x-wiki
{{also/auto}}
== ภาษาเขิน ==
=== รากศัพท์ ===
{{inh+|kkh|tai-pro|*ɓawᴮ||ไม่}}; ร่วมเชื้อสายกับ{{cog|nod|ᨷᩮᩢ᩵ᩤ}}, {{cog|lo|ເບົ່າ}}, {{cog|khb|ᦢᧁᧈ}}, {{cog|blt|ꪹꪚ꪿ꪱ}}, {{cog|shn|မဝ်ႇ}}
=== การออกเสียง ===
* {{IPA|kkh|/baw˨˨/|a=เชียงตุง}}
* {{คำอ่านไทย|เบ่า}}
=== คำกริยาวิเศษณ์ ===
{{kkh-adv|-}}
# [[ไม่]], [[บ่]]
=== อ้างอิง ===
{{รายการอ้างอิง}}
* ᨩᩣ᩠ᨿᨪᩮᨩᩮ᩠ᨾ. (n.d.). ''ᩋᨽᩥᨵᩤᨶᩈᩢ᩠ᨷᩅᩰᩉᩣ᩠ᩁᨸᩖᩯᨽᩣᩈᩣᨡᩨ᩠ᨶ''.
== ภาษาไทลื้อ ==
=== คำนาม ===
{{khb-noun}}
# {{alternative form of|khb|ᦢᧁᧈ}}
jub11464sn4bzq6gvzdjhhbndz1dm1a
5720820
5720819
2026-04-21T07:58:45Z
Ai Ku Karng
17824
5720820
wikitext
text/x-wiki
{{also/auto}}
== ภาษาเขิน ==
=== รากศัพท์ ===
{{inh+|kkh|tai-pro|*ɓawᴮ||ไม่}}; ร่วมเชื้อสายกับ{{cog|nod|ᨷᩮᩢ᩵ᩤ}}, {{cog|lo|ເບົ່າ}}, {{cog|khb|ᦢᧁᧈ}}, {{cog|blt|ꪹꪚ꪿ꪱ}}, {{cog|shn|မဝ်ႇ}}
=== การออกเสียง ===
* {{IPA|kkh|/baw˨˨/|a=เชียงตุง}}
* {{คำอ่านไทย|เบ่า}}
=== คำกริยาวิเศษณ์ ===
{{kkh-adv|-}}
# [[ไม่]], [[บ่]]
=== อ้างอิง ===
{{รายการอ้างอิง}}
* ᨩᩣ᩠ᨿᨪᩮᨩᩮ᩠ᨾ. (n.d.). ''ᩋᨽᩥᨵᩤᨶᩈᩢ᩠ᨷᩅᩰᩉᩣ᩠ᩁᨸᩖᩯᨽᩣᩈᩣᨡᩨ᩠ᨶ''.
== ภาษาไทลื้อ ==
=== คำกริยาวิเศษณ์ ===
{{head|khb|คำกริยาวิเศษณ์}}
# {{alternative form of|khb|ᦢᧁᧈ}}
f4rq0lwjobk49ip0jwtsc2ij3d5m1yq
มอดูล:cop-sortkey
828
335759
5720774
1899755
2026-04-21T07:11:51Z
OctraBot
3198
แทนที่เนื้อหาด้วย "--[[หมวดหมู่:หน้าที่ถูกแจ้งลบ|เปลี่ยนชื่อ]]"
5720774
Scribunto
text/plain
--[[หมวดหมู่:หน้าที่ถูกแจ้งลบ|เปลี่ยนชื่อ]]
qizf2rdzmcbn6ioiwx591u5zqnh1lvx
มอดูล:list of languages, csv format
828
336684
5720770
1902914
2026-04-21T07:01:23Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars)
5720770
Scribunto
text/plain
local languages = require("Module:languages/data/all")
local families = require("Module:families/data")
-- based on Module:list_of_languages
local export = {}
local filters = {}
function export.show(frame)
local args = frame.args
local filter = filters[args[1]]
local ids = args["ids"]; if not ids or ids == "" then ids = false else ids = true end
local rows = {}
-- Get a list of all language codes
local codes = {}
for code, _ in pairs(languages) do
table.insert(codes, code)
end
-- Sort the list
table.sort(codes)
local sep = ";"
local minor_sep = ","
local function shallowcopy(array)
local new_array = {}
if type(array) == "string" then array = {array} end
for i, v in ipairs(array) do
new_array[i] = v
end
return new_array
end
-- Now go over each code, and create table rows for those that are selected
local column_names = {
"line", "code", "canonical name", "category", "type", "family code",
"family", "sortkey?", "autodetect?", "exceptional?", "script codes",
"other names", "standard characters"
}
for line, code in ipairs(codes) do
local data = languages[code]
local row = {}
local sc = data[4]
if type(sc) == "string" then sc = mw.text.split(sc, "%s*,%s*") end
-- data[1]: canonical name; data[3]: family code
table.insert(row, line)
table.insert(row, code)
table.insert(row, data[1])
table.insert(row, (data[1]:find("^ภาษา") and "" or "ภาษา") .. data[1])
table.insert(row, data.type or "")
table.insert(row, data[3] or "")
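-- Fall back to "" when there is no family code; inserting nil would drop the column and misalign the row
table.insert(row, data[3] and (families[data[3]] and families[data[3]][1] or error(data[3] .. " is not a valid family code (family of " .. code .. ")")) or "")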
table.insert(row, data.sort_key and "sortkey" or "")
table.insert(row, data.entry_name and "autodetect" or "")
table.insert(row, code:find("-") and "exceptional" or "")
table.insert(row, sc and table.concat(sc, minor_sep) or "")
table.insert(row, data.otherNames and table.concat(data.otherNames, minor_sep) or "")
table.insert(row, data.standard_chars and "standard characters" or "")
table.insert(rows, table.concat(row, sep))
end
return "<pre>\n" .. table.concat(column_names, sep) .. "\n" .. table.concat(rows, "\n") .. "</pre>"
end
return export
08e6ih7ibhxiygvp4avpglqw2klgbme
5720772
5720770
2026-04-21T07:04:24Z
OctraBot
3198
5720772
Scribunto
text/plain
local languages = require("Module:languages/data/all")
local families = require("Module:families/data")
-- based on Module:list_of_languages
local export = {}
local filters = {}
function export.show(frame)
local args = frame.args
local filter = filters[args[1]]
-- Treat a missing or empty |ids= parameter as false, anything else as true
local ids = args.ids ~= nil and args.ids ~= ""
local rows = {}
-- Get a list of all language codes
local codes = {}
for code, _ in pairs(languages) do
table.insert(codes, code)
end
-- Sort the list
table.sort(codes)
local sep = ";"
local minor_sep = ","
local function shallowcopy(array)
local new_array = {}
if type(array) == "string" then array = {array} end
for i, v in ipairs(array) do
new_array[i] = v
end
return new_array
end
-- Now go over each code, and create table rows for those that are selected
local column_names = {
"line", "code", "canonical name", "category", "type", "family code",
"family", "sortkey?", "autodetect?", "exceptional?", "script codes",
"other names", "standard characters"
}
for line, code in ipairs(codes) do
local data = languages[code]
local row = {}
local sc = data[4]
if type(sc) == "string" then sc = mw.text.split(sc, "%s*,%s*") end
-- data[1]: canonical name; data[3]: family code
table.insert(row, line)
table.insert(row, code)
table.insert(row, data[1])
table.insert(row, (data[1]:find("^ภาษา") and "" or "ภาษา") .. data[1])
table.insert(row, data.type or "")
table.insert(row, data[3] or "")
-- Fall back to "" when there is no family code, so that the row keeps all of its columns.
table.insert(row, data[3] and (families[data[3]] and families[data[3]][1] or error(data[3] .. " is not a valid family code (family of " .. code .. ")")) or "")
table.insert(row, data.sort_key and "sortkey" or "")
table.insert(row, data.entry_name and "autodetect" or "")
table.insert(row, code:find("-") and "exceptional" or "")
table.insert(row, sc and table.concat(sc, minor_sep) or "")
table.insert(row, data.other_names and table.concat(data.other_names, minor_sep) or "")
table.insert(row, data.standard_chars and "standard characters" or "")
table.insert(rows, table.concat(row, sep))
end
return "<pre>\n" .. table.concat(column_names, sep) .. "\n" .. table.concat(rows, "\n") .. "</pre>"
end
return export
oyytuonuvu6j0q9q77zzvthz9a5lnvs
หมวดหมู่:zh:นครในไทย
14
338947
5720705
1914805
2026-04-21T01:56:27Z
OctraBot
3198
OctraBot ย้ายหน้า [[หมวดหมู่:zh:เมืองใหญ่ในไทย]] ไปยัง [[หมวดหมู่:zh:นครในไทย]] โดยไม่สร้างหน้าเปลี่ยนทางตามมา
1914805
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
มอดูล:lt-common
828
1714667
5720786
4733484
2026-04-21T07:20:08Z
OctraBot
3198
5720786
Scribunto
text/plain
local export = {}
local m_str_utils = require("Module:string utilities")
local u = m_str_utils.char
local ugsub = m_str_utils.gsub
local ulower = m_str_utils.lower
local uupper = m_str_utils.upper
local ufind = m_str_utils.find
local ulen = m_str_utils.len
local ucodepoint = m_str_utils.codepoint
-- Keep native Unicode normalization functions (no replacement available)
local toNFC = mw.ustring.toNFC
local toNFD = mw.ustring.toNFD
-- =============================================================================
-- Unicode constants
-- =============================================================================
local GRAVE = u(0x0300) -- combining grave accent
local ACUTE = u(0x0301) -- combining acute accent
local TILDE = u(0x0303) -- combining tilde
local MACRON = u(0x0304) -- combining macron
local DOTABOVE = u(0x0307) -- combining dot above
local CARON = u(0x030C) -- combining caron
local OGONEK = u(0x0328) -- combining ogonek
local ANY_ACCENT = "[" .. GRAVE .. ACUTE .. TILDE .. "]"
-- Legacy aliases for backward compatibility
local grave = GRAVE
local acute = ACUTE
local tilde = TILDE
local macron = MACRON
local dotabove = DOTABOVE
local caron = CARON
local ogonek = OGONEK
local accents = ANY_ACCENT
-- =============================================================================
-- Internal helper functions
-- =============================================================================
local dotless_to_dotted = {
["ı"] = "i",
["ȷ"] = "j",
}
local function char_to_dotted_form(base, below)
return (dotless_to_dotted[base] or base) .. below
end
local function normalize_dotted_chars(text)
-- Remove any dots above, and convert dotless forms to dotted.
-- On entry, text must be in NFD form.
return ugsub(text, "([iıjȷ])(" .. ogonek .. "?)" .. dotabove, char_to_dotted_form)
end
local function char_to_accent_form(base, below)
-- Add a 'dot above' after the base.
if base == "i" or base == "j" then
return base .. below .. dotabove
end
-- Convert any dotless chars combining with accents to the dotted form,
-- so that they normalize properly. This shouldn't happen, but just in case.
return char_to_dotted_form(base, below)
end
local function stripped_text_form(text)
-- Remove accents.
text = ugsub(toNFD(text), accents .. "+", "")
-- Normalize dotless characters and dot-above diacritics.
return normalize_dotted_chars(text)
end
-- =============================================================================
-- Input validation
-- =============================================================================
-- Reject Private Use Area characters (U+E000–U+F8FF).
function export.reject_pua(s)
if not s then return end
for i = 1, ulen(s) do
local cp = ucodepoint(s, i)
if cp >= 0xE000 and cp <= 0xF8FF then
error(string.format(
"lt-common: private use area character U+%04X detected in \"%s\". " ..
"Please use a standard Unicode character instead.", cp, s))
end
end
end
-- =============================================================================
-- Input normalization
-- =============================================================================
-- Detect nonstandard encoding patterns in the input.
-- Returns: dotless_flag (found ı/ȷ), precomp_i_flag (found precomposed í/ì/ĩ)
function export.detect_nonstandard(s)
if not s then return false, false end
local nfd_s = toNFD(s)
local dotless_flag = ufind(nfd_s, "[ıȷ]") ~= nil
-- Search the raw input here: NFD decomposes í/ì/ĩ, so they can never match in nfd_s.
local precomp_i_flag = ufind(s, "[íìĩ]") ~= nil
return dotless_flag, precomp_i_flag
end
-- Normalize input to clean canonical NFC.
-- Handles dotless i/j (ı, ȷ) and stray dot-above combinations.
function export.canonicalize_input(s)
if not s then return s end
s = toNFD(s)
-- Remove stray dot-above after i/j (with or without ogonek)
s = ugsub(s, "([iıjȷ])(" .. OGONEK .. "?)" .. DOTABOVE, function(base, below)
base = (base == "ı") and "i" or (base == "ȷ") and "j" or base
return base .. below
end)
-- Convert any remaining dotless i/j to standard forms
s = ugsub(s, "ı", "i")
s = ugsub(s, "ȷ", "j")
return toNFC(s)
end
-- =============================================================================
-- Partial NFD conversion (stem_ac representation)
-- =============================================================================
-- Convert canonical NFC to partial NFD (stem_ac).
-- Applies full NFD, then recomposes non-accent diacritics.
-- Only grave/acute/tilde remain as combining characters.
function export.to_stem_ac(s)
if not s then return s end
s = toNFD(s)
-- Recompose ogonek vowels
s = ugsub(s, "a" .. OGONEK, "ą")
s = ugsub(s, "e" .. OGONEK, "ę")
s = ugsub(s, "i" .. OGONEK, "į")
s = ugsub(s, "u" .. OGONEK, "ų")
-- Recompose macron vowel
s = ugsub(s, "u" .. MACRON, "ū")
-- Recompose dot-above e
s = ugsub(s, "e" .. DOTABOVE, "ė")
-- Recompose caron consonants
s = ugsub(s, "c" .. CARON, "č")
s = ugsub(s, "s" .. CARON, "š")
s = ugsub(s, "z" .. CARON, "ž")
return s
end
-- =============================================================================
-- Accent manipulation
-- =============================================================================
-- Strip all accent marks (grave/acute/tilde) from partial NFD text.
function export.to_stem_bare(stem_ac)
if not stem_ac then return stem_ac end
return ugsub(stem_ac, ANY_ACCENT, "")
end
-- Check if partial NFD text contains any accent marks.
function export.has_accent(stem_ac)
return ufind(stem_ac, ANY_ACCENT) ~= nil
end
-- =============================================================================
-- Complete input pipeline
-- =============================================================================
-- Process raw user input through the complete normalization pipeline.
-- Returns: stem_bare, stem_ac, dotless_flag, precomp_flag
function export.process_input(raw)
if not raw then return raw, raw, false, false end
export.reject_pua(raw)
local dotless_flag, precomp_flag = export.detect_nonstandard(raw)
local canon = export.canonicalize_input(raw)
local stem_ac = export.to_stem_ac(canon)
local stem_bare = export.to_stem_bare(stem_ac)
return stem_bare, stem_ac, dotless_flag, precomp_flag
end
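-- Usage sketch (illustrative only, not part of this module): a hypothetical caller that loads the module as `m_lt`
-- and passes in some raw template argument `raw`. The nil check matters because process_input() returns nil stems
-- for nil input and has_accent() does not guard against nil.
--   local m_lt = require("Module:lt-common")
--   local stem_bare, stem_ac, dotless_flag, precomp_flag = m_lt.process_input(raw)
--   if stem_ac and m_lt.has_accent(stem_ac) then
--       -- stem_ac keeps grave/acute/tilde as combining characters; stem_bare has them removed
--   end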
-- =============================================================================
-- Display and text processing
-- =============================================================================
function export.makeDisplayText(text, lang, sc)
if not text then return text end
-- Normalize dotless characters and dot-above diacritics (while retaining accents).
text = normalize_dotted_chars(toNFD(text))
-- Add a 'dot above' between "i" or "j" and an accent.
text = ugsub(text, "([iıjȷ])(" .. ogonek .. "?)%f" .. accents, char_to_accent_form)
return toNFC(text)
end
-- Called from [[Module:languages]] since [[Module:lt-common]] is set as the stripDiacritics handler in
-- [[Module:languages/data/2]].
function export.stripDiacritics(text, lang, sc)
if not text then return text end
return toNFC(stripped_text_form(text))
end
local sortkey_substitutes = {
[ogonek] = u(0xF000),
[caron] = u(0xF001),
[macron] = u(0xF002),
[dotabove] = u(0xF003),
["y"] = "i" .. u(0xF004),
}
function export.makeSortKey(text, lang, sc)
if not text then return text end
-- Normalize to the stripped-text form and convert diacritics to Private Use
-- Area characters so they sort after all other characters.
text = stripped_text_form(ulower(text))
:gsub(".[\128-\191]*", sortkey_substitutes)
return toNFC(uupper(text))
end
return export
goom2dkjrwlrqdqdxdvrmutp68b77go
มอดูล:place/locations
828
2297279
5720697
5715284
2026-04-21T01:40:16Z
OctraBot
3198
5720697
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true to force category generation even on non-mainspace pages
local m_table = require("Module:table")
local string_utilities_module = "Module:string utilities"
local en_utilities_module = "Module:en-utilities"
local insert = table.insert
local concat = table.concat
local dump = mw.dumpObject
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
--[==[ intro:
This module contains data on all known locations, along with some lower-level code to process them (higher-level
known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using
mw.loadData().
===Location data===
'''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]],
especially the section `More about known locations`.'''
The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations
and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are
states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table''
that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and
defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data
table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given
location is generally described by three values: (a) the group metadata table for the group the location is part of; (b)
the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all
locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location
and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()`
function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the
arguments to many functions.
In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must
be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases
for a given location and the alias keys only need to be unique within a particular group data table, not across all
groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another
group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations,
canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in
New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias
key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for
canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the
location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys
are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have
per-group defaults, but only global defaults.
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it
must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare
category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding
bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys:
* Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories)
and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified
placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and
placenames, which is critical to understand when working with location data.) This also applies to constituent
countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such
as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena,
Ascension and Tristan da Cunha).
* Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative
divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or
ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if
different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above.
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and
Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`,
`Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name
in Spain, even though none of those cities are large enough to be included as known locations in this module. (The
cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.)
* Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent
territories, use a qualified key that contains the name of the country or constituent country in it, e.g.
`Normandy, ฝรั่งเศส` (a region), `Calvados, ฝรั่งเศส` (a department in the region of Normandy), `Herefordshire, England`
(a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, ฟินแลนด์` (a region),
`Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, ไอร์แลนด์` (a county) and
`New York, USA` (a state). As shown in these various examples, (a) first and second-level divisions are sometimes both
included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent
country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States
or United States of America); (c) the word `the` is not normally included in the key even if the location is normally
preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this),
except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates
an apparent redundancy, as with `Central Finland, ฟินแลนด์`; and (e) sometimes the placetype is included in the key, as
with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several
other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on
per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in
Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "State", "Oblast" etc. in their name
because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and
counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County"
preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article
naming scheme for a given administrative division is a strong clue as to how the division is normally referred to,
and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and
Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. North and South
Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.)
As mentioned above, associated with canonical keys in the group data table are location specs, which are objects
containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''.
Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that
differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an
uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This
copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table
into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a
given location property. (The initialization process also does more transformations in a few cases, noted below.) Note
that the default value of a given property is stored under a key in the group metadata table that is preceded by the
string `default_`; for example, the default value corresponding to the `placetype` property of a given location is
specified in the `default_placetype` key in the group metadata table.
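As a rough sketch of this defaulting (not the actual code: the group and location names below are hypothetical, and
`apply_group_defaults` is an illustrative stand-in for the relevant part of `initialize_spec()`), a group metadata
table and the copying of its `default_`-prefixed values into a location spec might look like this:
	local example_group = {
		default_placetype = "นคร",
		default_wp = "%l, %c",
		data = {
			["Example City"] = {}, -- inherits both defaults
			["Other City"] = {wp = "%l"}, -- overrides only `wp`
		},
	}
	local function apply_group_defaults(group, spec)
		for k, v in pairs(group) do
			-- Only keys of the form default_<property> supply defaults; `data` is skipped.
			local prop = k:match("^default_(.+)$")
			if prop and spec[prop] == nil then
				spec[prop] = v
			end
		end
		return spec
	end
	-- apply_group_defaults(example_group, example_group.data["Other City"])
	-- leaves `wp` as "%l" and fills in `placetype` from `default_placetype`.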
The following are the properties of the location spec.
* `placetype`: String specifying the placetype of the location (e.g. "ประเทศ", "รัฐ", "province"). This can also be a
table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but
the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any
of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the
group level, or an error occurs.
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the
immediate ''container'' (or containers) of the given location. A container is another location which this location is
considered to be directly part of, either politically or (above the country level) geographically. Some locations
belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and
Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]])
of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed
the ''container trail'', and some functions compute and return this trail as part of their operation. When a location
spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a
list of canonicalized container structures, each of which is of the form
`{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location
key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if
there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the
placetype from the container structure.) The list of canonicalized container structures is stored into the
`.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec
form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The
canonicalization process is described in more detail below under [[#Container spec canonicalization]].
* `divs`: List of recognized political divisions; e.g. for the Netherlands, a specification of the form
`divs = {"จังหวัด", "เทศบาล"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]]
and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be
found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as
just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to
all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the
same format as `divs`. This is intended to be used in the situation where some division types are shared among all
locations in the group and others differ from location to location. An example where this is used is the United
States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have
census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties`
and `county seats` are specified in the group-level `default_divs` because not all states have counties and county
seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have
additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have
municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property
associated with the division type), any division type specified on a sub-country-level location must also be specified
on all containers up through the country. For example, since French departments specify `communes` and
`municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for
France itself.
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category
pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is
normal for locations) that computes the location description can also be given. This is used, for example, for
Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the
keydesc is replaced with the default value of the location description, which specifies the location's placename,
placetype, and the corresponding values for each container in the container trail, generally up through (but not
beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct
the full description of various categories, such as bare location categories, whose description generally reads
`"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the
specified or auto-constructed location description.
* `fulldesc`: String overriding the full description for the bare location category (but not for any other category).
This is currently used only for the location `Earth`, at the very top of the tree (because the standard
`people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent
inhabitants). FIXME: This should be renamed `bare_category_fulldesc`.
* `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories
generated based on the immediate container(s). For example, `Hawaii, USA` specifies `Polynesia` as an additional
parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category)
as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an
additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on
the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the
bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME:
This should be renamed `bare_category_addl_parents`.
* `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent
to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how
to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the
elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the
default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named
e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is
Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase
`province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have
to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full
location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category
pages, are shown in the upper right of bare category pages.
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles
and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`.
It rarely needs to be specified because the category page and the article page almost always follow the same format.
* `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the
Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and
`wpcat` and defaults to `wpcat`, which is usually (but not always) correct.
* `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in
category names such as [[:Category:Cities in the Northern Territory, ออสเตรเลีย]] and in old-style place descriptions
when the location occurs as the first holonym, such as the city [[Darwin]] described using
{{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean
properties is {nil}, which amounts to the same as {false}.
* `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only
affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as
[[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set
only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail
for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. The
general principle used in setting this is that all countries in Europe, all dependent territories of any such country,
all former British colonies, and any dependent territories of these former colonies, are assumed to use British
spelling, while all other countries and associated dependent territories are assumed to use American spelling. This
can potentially be modified on a case-by-case basis.
* `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for
city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire,
Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and
(through a group-level setting) New York boroughs. In addition, it is set to `true` in `initialize_spec()` whenever
the group-level `default_placetype == "นคร"`, so that all cities get it set without explicitly needing to add a
group-level setting for this. Note that the condition `default_placetype == "นคร"` intentionally excludes Chinese
prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods,
but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to
categories like [[:Category:Rivers in Osaka, ญี่ปุ่น]] and [[:Category:Cities in Wuhan]] for holonyms that
are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like
[[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities;
(c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location.
(Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those
that can occur with non-cities have a `generic_before_non_cities` setting.)
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such
places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more
generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`.
* `overriding_bare_label_parents`: Document me!
* `bare_category_parent_type`: Document me!
* `no_container_cat`: Document me!
* `no_container_parent`: Document me!
* `no_generic_place_cat`: Document me!
* `no_check_holonym_mismatch`: Document me!
* `no_auto_augment_container`: Document me!
* `no_include_container_in_desc`: Document me!
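To make the `wp` formatting directives described above more concrete, here is a minimal sketch of how such a spec could
be expanded. The helper `expand_wp_spec` and its arguments are hypothetical and are not taken from this module; the
Austin values follow the example given under `wp`, and the Chiang Mai values are purely illustrative:
	local function expand_wp_spec(wp, full, elliptical, container)
		if wp == true then
			wp = "%l" -- `true` is documented as equivalent to "%l"
		end
		local values = {l = full, e = elliptical, c = container}
		-- Replace %l, %e and %c; the extra parentheses drop gsub's second return value.
		return (wp:gsub("%%([lec])", values))
	end
	-- expand_wp_spec("%l, %c", "Austin", "Austin", "Texas") --> "Austin, Texas"
	-- expand_wp_spec("%e province", "Chiang Mai Province", "Chiang Mai", "Thailand") --> "Chiang Mai province"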
====Location divisions====
The `divs` field of a location describes the recognized political division types of that location. Specifying a given
division type causes any place that is defined as being of that division type and has the location as a holonym to be
categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United
States has `"รัฐ"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under
[[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for
"generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a
`generic_before_cities` field if the location is a city); this includes things like cities, towns, villages,
neighbo(u)rhoods and rivers. A given element in the `divs` list is usually a string naming a plural placetype; the
placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular
plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field
(if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which
gets most things right but mishandles `passes` and `fortresses`, singularizing them to `passe` and
`fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object
can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with
the placetype. An example of this is the `divs` list for Canada:
{
["แคนาดา"] = {divs = {
{type = "รัฐ", cat_as = "รัฐและดินแดน"},
{type = "ดินแดน", cat_as = "รัฐและดินแดน"},
"เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities",
"rural municipalities", "parishes",
"Indian reserves",
"census divisions",
{type = "townships", prep = "ใน"},
}, ...},
}
Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a
single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and
territories. Similar things are done for other countries that have more than one type of first-level administrative
division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the
table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and
territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for
use in categories and won't be recognized as the placetype field in a {{tl|place}} description). In addition, townships
are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be
[[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat
related to whether a given placetype is an official administrative or statistical division of the location in question
and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be
used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities
and towns.)
Another more complex example is the divisions given for Quebec:
{
["Quebec, Canada"] = {divs = {
"เทศมณฑล",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
{type = "ภูมิภาค", container_parent_type = false},
{type = "townships", prep = "ใน"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}},
}, ...},
}
Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the
entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as
its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which
exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one
subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the
`container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be
[[:Category:Regional municipalities of Canada]]. Likewise, `regions` as administrative divisions (as opposed to mere
geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent
using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and
`village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize
`parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties,
just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "เทศมณฑล"}`
means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of
Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level
parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly,
`township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not''
[[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]].
====Container spec canonicalization====
A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'',
each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a
higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are
contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The
`placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of
initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and
removes the spec from `.container`. It works as follows:
# If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place.
For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies
`default_container = "บราซิล"`.
# A single string or canonicalized container object is allowed and made into a one-element list.
# If a list element is a string that did ''not'' come from `default_container`, and there is a group-level
`canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get
a canonicalized container object.
# Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to
`"ประเทศ"`.
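The four steps above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the module's own code: `canonicalize_containers`, the two sample groups and the sample canonicalizer (including its `"state"` placetype) are hypothetical names and values introduced only for this example.

```lua
-- Hypothetical standalone sketch of the four canonicalization steps.
local function canonicalize_containers(group, spec)
	local container = spec.container
	local from_default = false
	-- Step 1: fall back to the group-level default container.
	if container == nil then
		container = group.default_container
		from_default = true
	end
	if container == nil then
		return nil
	end
	-- Step 2: wrap a single string or container object in a one-element list.
	if type(container) == "string" or container.key then
		container = {container}
	end
	local containers = {}
	for _, cont in ipairs(container) do
		if type(cont) == "string" then
			if group.canonicalize_key_container and not from_default then
				-- Step 3: explicitly given strings go through the group's canonicalizer.
				cont = group.canonicalize_key_container(cont)
			else
				-- Step 4: any remaining string is assumed to be a country.
				cont = {key = cont, placetype = "ประเทศ"}
			end
		end
		containers[#containers + 1] = cont
	end
	return containers
end

-- Hypothetical groups, for illustration only.
local states_group = {default_container = "บราซิล"}
local counties_group = {
	canonicalize_key_container = function(container)
		return {key = container .. ", USA", placetype = "state"}
	end,
}

local from_default = canonicalize_containers(states_group, {})
local from_function = canonicalize_containers(counties_group, {container = "Arizona"})
```

In this sketch, `from_default` holds a single country-level container taken from the group default, while `from_function` holds whatever object the hypothetical canonicalizer returned.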
====Alias keys====
Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec
structure from canonical keys. This structure does not, in general, have defaults at the group level and is not
initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location
spec:
* `alias_of`: The canonical key of which this key is an alias. Required.
* `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the`
but does not pay attention to the value of `the` for the corresponding canonical key.
* `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be
converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise,
the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename
of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display
canonicalizing.
* `placetype`: The placetype of the alias. Rarely needs to be specified as it defaults to the canonical key's placetype,
and if that is unspecified, to the group-level default placetype.
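As a rough illustration of how an alias spec differs from a canonical one, the following sketch defines a tiny data table and resolves aliases for categorization and for display. The keys, the table contents and the helper names `resolve_for_category` and `resolve_for_display` are all hypothetical; only the property names `alias_of`, `the` and `display` come from the list above.

```lua
-- Hypothetical group data table; keys and values are illustrative only.
local data = {
	-- Canonical key with an ordinary (abbreviated) location spec.
	["State of Mexico, Mexico"] = {the = true},
	-- Category-only alias: affects categorization, not display.
	["Edomex, Mexico"] = {alias_of = "State of Mexico, Mexico"},
	-- Display alias: holonyms are also displayed using the canonical placename.
	["Mexico State, Mexico"] = {alias_of = "State of Mexico, Mexico", display = true},
}

-- Follow an alias (one level only) to the canonical key.
local function resolve_for_category(key)
	local spec = data[key]
	if spec and spec.alias_of then
		return spec.alias_of
	end
	return key
end

-- Return the key whose placename should be used for display.
local function resolve_for_display(key)
	local spec = data[key]
	if spec and spec.alias_of then
		if spec.display == true then
			return spec.alias_of
		elseif spec.display then
			return spec.display
		end
	end
	return key
end

local category_key = resolve_for_category("Edomex, Mexico")
local display_key_plain = resolve_for_display("Edomex, Mexico")
local display_key_alias = resolve_for_display("Mexico State, Mexico")
```

Here the category-only alias resolves to the canonical key for categorization but keeps its own key for display, whereas the display alias resolves to the canonical key in both cases.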
====Location group metadata tables====
As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The
metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but
preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only
keys, which are mostly functions. The following are the possible group-only keys:
* `data`: This points to the group data table for the group, as described above.
* `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias)
into the full and elliptical placenames. The difference between full and elliptical placenames is described in the
documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g.
`Phuket Province, Thailand` or `County Mayo, ไอร์แลนด์`), in which case the full placename includes the placetype and
the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or
`Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the
elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is
`Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename
distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there
is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as
`State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs.
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key,
and returns two arguments, the full and elliptical placenames, respectively. If left undefined, the default is to
chop off anything starting with a comma and return the result as both full and elliptical placename, and if
specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be
defined, it is best to use the helper function `make_key_to_placename`, if possible (or
`make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than
rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default
implementation and such) rather than directly calling the function in the `key_to_placename` field.
* `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be
either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this
(generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or
`make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to
`key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly
invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged
as the key. Otherwise, the default algorithm works as follows:
*# If the group-level `default_placetype == "นคร"`, use the placename unchanged as the key.
*# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma +
space and use the result as the key.
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and
`placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field
to the placename after a comma + space and use the result as the key.
*# Otherwise, use the placename unchanged as the key.
* `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string,
to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to
construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own.
* `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived
from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the
location. See [[#Location divisions]] for more details.
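To make the default `placename_to_key` algorithm described in the list above more concrete, here is a simplified standalone sketch. It is a stand-in written for illustration, not the module's own function; `default_placename_to_key` and the three sample groups are hypothetical.

```lua
-- Simplified stand-in for the default `placename_to_key` behavior.
local function default_placename_to_key(group, placename)
	local defcon = group.default_container
	if group.default_placetype == "นคร" or not defcon then
		-- Cities, and groups without a default container, use the placename as-is.
		return placename
	elseif type(defcon) == "string" then
		-- A string container is appended after a comma and a space.
		return placename .. ", " .. defcon
	elseif defcon.placetype == "ประเทศ" or defcon.placetype == "constituent country" then
		-- A container object contributes its key if it is a (constituent) country.
		return placename .. ", " .. defcon.key
	end
	return placename
end

-- Hypothetical groups, for illustration only.
local counties = {default_container = {key = "England", placetype = "constituent country"}}
local states = {default_container = "บราซิล"}
local cities = {default_placetype = "นคร", default_container = "บราซิล"}

local county_key = default_placename_to_key(counties, "Hampshire")
local state_key = default_placename_to_key(states, "Acre")
local city_key = default_placename_to_key(cities, "Acre")
```

With these assumed groups, `county_key` is `Hampshire, England`, `state_key` gets the string container appended, and `city_key` is left unchanged because the group's default placetype is a city.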
]==]
-----------------------------------------------------------------------------------
-- Helper functions --
-----------------------------------------------------------------------------------
--[==[
Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to
format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the
error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like
this).
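The following standalone sketch shows why `%s` can be used for every argument regardless of its type: each argument is converted to a string before formatting. Because `mw.dumpObject` is only available inside Scribunto, `dump` below is a hypothetical stand-in whose exact output may differ from the real function, and `format_message` is likewise a name invented for this example.

```lua
-- Hypothetical stand-in for `mw.dumpObject`: quote strings, stringify everything else.
local function dump(value)
	if type(value) == "string" then
		return ("%q"):format(value)
	end
	return tostring(value)
end

-- Build the message the same way the description above outlines, without throwing.
local function format_message(fmt, ...)
	local n = select("#", ...)
	local args = {...}
	for i = 1, n do
		-- Even nil arguments become strings, so the list has no holes.
		args[i] = dump(args[i])
	end
	local unpack_fn = table.unpack or unpack -- Lua 5.2+ vs. Lua 5.1
	return fmt:format(unpack_fn(args, 1, n))
end

local message = format_message("Placename %s returned a non-string key: %s", "Hampshire", nil)
```

A caller that wants to throw would pass the resulting string to `error`.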
]==]
function export.process_error(fmt, ...)
local args = {...}
for i = 1, select("#", ...) do
args[i] = dump(args[i])
end
return error(string.format(fmt, unpack(args)))
end
--[==[
Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user
error triggered by bad input or a system error due to something like running out of memory or hitting a time limit).
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the
format string as if `fmt:format(...)` were called.
]==]
function export.internal_error(fmt, ...)
export.process_error("Internal error: " .. fmt, ...)
end
local internal_error = export.internal_error
-- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If
-- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item`
-- equals `list_or_element`.
local function list_or_element_contains(list_or_element, item)
if type(list_or_element) == "table" then
return m_table.contains(list_or_element, item) and true or false
end
return list_or_element == item
end
--[==[
Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full
`"County Durham"` vs. elliptical `"Durham"`). If the group does not define `key_to_placename`, both full and elliptical
placenames are computed by chopping off anything starting with a comma.
]==]
function export.key_to_placename(group, key)
if group.key_to_placename == false then
return key, key
end
if group.key_to_placename then
local full_placename, elliptical_placename = group.key_to_placename(key)
if type(full_placename) ~= "string" then
internal_error("Key %s returned a non-string full placename: %s", key, full_placename)
end
if type(elliptical_placename) ~= "string" then
internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename)
end
return full_placename, elliptical_placename
end
-- Default behavior: chop off everything from the first comma onward.
key = key:gsub(",.*", "")
return key, key
end
--[==[
Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames) and return the result. If `placename_to_key` exists with the value `false`,
return the placename unchanged. If the group does not define `placename_to_key`, the placename is returned unchanged
when the group's `default_placetype` is `"นคร"` or the group has no `default_container`. Otherwise, a string
`default_container`, or the key of a `default_container` object whose placetype is either `country` or
`constituent country`, is appended to the placename after a comma and a space. In all other cases the placename is
returned unchanged.
]==]
function export.placename_to_key(group, placename)
if group.placename_to_key == false then
return placename
elseif group.placename_to_key then
local key = group.placename_to_key(placename)
if type(key) ~= "string" then
internal_error("Placename %s returned a non-string key: %s", placename, key)
end
return key
elseif group.default_placetype == "นคร" then
return placename
else
local defcon = group.default_container
if not defcon then
return placename
elseif type(defcon) == "string" then
return placename .. ", " .. defcon
elseif type(defcon) == "table" and (defcon.placetype == "ประเทศ" or
defcon.placetype == "constituent country") then
return placename .. ", " .. defcon.key
else
return placename
end
end
end
--[==[
Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't
specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and
`placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original
non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more
than one. Containers should be carefully distinguished from category parents. Generally the container is the first
category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents,
which indicate some sort of relation between the category parent and the location but not necessarily one of
containment.)
This function is idempotent in that nothing happens if called more than once on the same spec.
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables.
]==]
function export.initialize_spec(group, key, spec)
if spec.initialized then
return
end
-- Canonicalize `container` (falling back to the group-level default) into the `containers` list.
local container = spec.container
local containers
local container_from_default
if not container then
container = group.default_container
container_from_default = true
end
if container then
if type(container) == "string" or container.key then
container = {container}
end
containers = {}
for _, cont in ipairs(container) do
if type(cont) == "string" then
if group.canonicalize_key_container and not container_from_default then
cont = group.canonicalize_key_container(cont)
else
cont = {key = cont, placetype = "ประเทศ"}
end
end
insert(containers, cont)
end
end
spec.containers = containers
spec.container = nil
local function value_with_default(val, default_val)
if val == nil then
return default_val
else
return val
end
end
local function set_or_default(prop)
spec[prop] = value_with_default(spec[prop], group["default_" .. prop])
end
set_or_default("placetype")
if not spec.placetype then
internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec)
end
set_or_default("divs")
spec.addl_divs = group.addl_divs
for _, prop in ipairs {
"keydesc",
"fulldesc",
"addl_parents",
"overriding_bare_label_parents",
"bare_category_parent_type",
"wp",
"wpcat",
"commonscat",
"british_spelling",
"the",
"no_container_cat",
"no_container_parent",
"no_generic_place_cat",
"no_check_holonym_mismatch",
"no_auto_augment_container",
"no_include_container_in_desc",
"is_city",
"is_former_place",
} do
set_or_default(prop)
end
-- `default_placetype == "นคร"` is correct; if `default_placetype` has something else like `prefecture-level city`
-- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as
-- is_city.
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "นคร")
spec.initialized = true
end
--[=[
Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group
with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values:
the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object,
which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default
property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the
property in question).
`alias_resolution` controls how aliases are resolved. Normally (when it is unspecified or {"all"}), both display and
category aliases are followed, and the returned key will reflect the canonical location key. However, if
`alias_resolution` is {"none"}, no alias following
happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical
location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not
copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal
case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by
looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"}
except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key,
and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_key_in_group(group, placetypes, key, alias_resolution)
if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and
alias_resolution ~= "all" then
internal_error("Bad value for 'alias_resolution': %s", alias_resolution)
end
local spec = group.data[key]
if not spec then
return nil
end
local function check_correct_placetype(placetype)
if type(placetype) == "table" then
for _, pt in ipairs(placetype) do
if list_or_element_contains(placetypes, pt) then
return true
end
end
return false
else
return list_or_element_contains(placetypes, placetype)
end
end
if spec.alias_of then
local resolved_key = spec.alias_of
local resolved_spec = group.data[resolved_key]
if not resolved_spec then
internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key)
elseif resolved_spec.alias_of then
internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed",
key, resolved_key)
end
if alias_resolution == "none" or alias_resolution == "display" then
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " ..
"`default_placetype`", key, spec, resolved_spec)
end
if not check_correct_placetype(placetype) then
return nil
end
if alias_resolution == "display" then
if spec.display == true then
key = resolved_key
elseif spec.display then
key = spec.display
end
end
return key, spec
end
key = resolved_key
spec = resolved_spec
end
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
end
if not check_correct_placetype(placetype) then
return nil
end
export.initialize_spec(group, key, spec)
return key, spec
end
--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists
in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one
of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the
corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys.
`alias_resolution` is as in `find_matching_key_in_group()`.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
local key = export.placename_to_key(group, placename)
return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end
--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec.
If no such key exists, return {nil}. This throws an internal error if two locations with the same key are found.
]==]
function export.find_canonical_key(key)
local found_locations = {}
for _, group in ipairs(export.locations) do
local spec = group.data[key]
if not spec then
-- do nothing
elseif spec.alias_of then
mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of))
else
insert(found_locations, {group, spec})
end
end
if not found_locations[1] then
return nil
elseif found_locations[2] then
internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations)
else
local group, spec = unpack(found_locations[1])
export.initialize_spec(group, key, spec)
return group, spec
end
end
--[==[
Iterator that returns all locations matching a given description, where the description consists of either a placename
or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator
returns three values at each iteration: the location group, canonical key by which the location is known and the spec
object describing the location. `data` contains the following possible fields:
* `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string
specifying a placetype, which must match one of the location's placetypes. This must be specified.
* `placename`: The placename of the location. Either this or `key` must be specified.
* `key`: The key of the location. Either this or `placename` must be specified.
* `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`.
The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if
`alias_resolution` is given and the specified key or placename is an alias; see the documentation for
`find_matching_key_in_group`).
]==]
function export.iterate_matching_location(data)
local i = 0
local n = #export.locations
return function()
while true do
i = i + 1
if i > n then
break
end
local group = export.locations[i]
local key, spec
if data.placename then
key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
data.alias_resolution)
else
if not data.key then
internal_error("'.placename' or '.key' must be defined: %s", data)
end
key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
end
if key then
return group, key, spec
end
end
end
end
--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark` (in New Jersey, not Delaware)).
]==]
function export.get_matching_location(data)
local all_found = {}
for group, key, spec in export.iterate_matching_location(data) do
insert(all_found, {group, key, spec})
end
if not all_found[1] then
internal_error("Couldn't find matching location for data %s", data)
elseif all_found[2] then
internal_error("Found multiple matching locations for data %s: %s", data, all_found)
else
return unpack(all_found[1])
end
end
--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An
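internal error is thrown if a container loop is suspected (after more than 10 iterations). The return value is an
iterator function; each call returns the list of not-yet-seen containers at the next level up, or {nil} when no
containers remain. Each element of such a list is a location object with `group`, `key` and `spec` fields.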
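The level-by-level traversal with duplicate suppression can be sketched on its own, using the Russia example from the description. This is a simplified illustration, not the module's implementation: `containers_of` and `iterate_container_levels` are hypothetical, the table maps keys directly to container keys instead of going through location specs, and the iteration limit is omitted.

```lua
-- Hypothetical container table: each key maps to the keys of its immediate containers.
local containers_of = {
	["Russia"] = {"Europe", "Asia"},
	["Europe"] = {"Eurasia"},
	["Asia"] = {"Eurasia"},
	["Eurasia"] = {"Earth"},
}

-- Return an iterator yielding one list of container keys per level, never repeating a key.
local function iterate_container_levels(start_key)
	local seen = {[start_key] = true}
	local current = {start_key}
	return function()
		local next_level = {}
		for _, key in ipairs(current) do
			for _, container in ipairs(containers_of[key] or {}) do
				if not seen[container] then
					seen[container] = true
					next_level[#next_level + 1] = container
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		current = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_container_levels("Russia") do
	levels[#levels + 1] = table.concat(level, ", ")
end
```

In this sketch `Eurasia` is reached through both `Europe` and `Asia` but appears only once, at the second level.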
]==]
function export.iterate_containers(group, key, spec)
local keys_seen = {}
keys_seen[key] = true
local iterations = 0
local last_iteration_containers = {{group = group, key = key, spec = spec}}
return function()
iterations = iterations + 1
if iterations > 10 then
internal_error("Probable loop in containers when processing key %s", key)
end
local next_iteration_containers = {}
for _, location in ipairs(last_iteration_containers) do
local containers = location.spec.containers
if containers then
for _, container in ipairs(containers) do
local container_group, container_key, container_spec = export.get_matching_location {
placetypes = container.placetype,
key = container.key,
}
if not keys_seen[container_key] then
insert(next_iteration_containers, {
group = container_group, key = container_key, spec = container_spec
})
keys_seen[container_key] = true
end
end
end
end
if not next_iteration_containers[1] then
return nil
end
last_iteration_containers = next_iteration_containers
return next_iteration_containers
end
end
--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
]==]
function export.construct_linked_placename(spec, placename, display_form)
local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename,
display_form) or ("[[%s]]"):format(placename)
if spec.the then
linked_placename = "the " .. linked_placename
end
return linked_placename
end
--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical
placenames.) `container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of
the key, which will be used to remove that container. If multiple patterns are specified, each one is tried until one
matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full
placename. If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match
and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain
countries (such as South Korean and North Korean counties, which include the word "เทศมณฑล" in the key). The resulting
chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped
and the full and elliptical placenames are the same.
Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
if type(container_patterns) == "string" then
container_patterns = {container_patterns}
end
if type(divtype_patterns) == "string" then
divtype_patterns = {divtype_patterns}
end
return function(key)
local full_placename = key
if container_patterns then
for _, container_pattern in ipairs(container_patterns) do
local nsubs
full_placename, nsubs = full_placename:gsub(container_pattern, "")
if nsubs > 0 then
break
end
end
end
local elliptical_placename = full_placename
if divtype_patterns then
for _, divtype_pattern in ipairs(divtype_patterns) do
local nsubs
elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
if nsubs > 0 then
break
end
end
end
return full_placename, elliptical_placename
end
end
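--[=[
Example (a minimal sketch, not part of the module): the two chopping steps performed by the generated function can be
reproduced with plain `gsub` calls. The key below is the illustrative one from the documentation above; the variable
names are hypothetical.

```lua
local key = "Gangwon County, South Korea"
-- Step 1: remove the container at the end of the key to get the full placename.
-- Only the first return value of gsub (the resulting string) is kept.
local full_placename = key:gsub(", South Korea$", "")
-- Step 2: remove the political division to get the elliptical placename.
local elliptical_placename = full_placename:gsub(" County$", "")
print(full_placename)       -- Gangwon County
print(elliptical_placename) -- Gangwon
```

Because the patterns are anchored with `$`, a key that lacks the container or division is returned unchanged.
]=]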
--[=[
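This is typically used to define `placename_to_key`. It generates a function that attaches strings to a given placename
to get the key (see the definition of `placename_to_key` above in the documentation under "Location group tables").
Despite the parameter names, this localized version prepends rather than appends (see the lines marked `--th` below).
Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have special
meaning in Lua patterns) that is attached first, at the beginning of the placename; if already present at the
beginning, it is not attached again. `container_suffix` is then prepended if given, without checking whether it is
already present. The examples below show the original appending form; with the current code the same strings end up in
front of the placename. Typical usage is like this: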
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`)
or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
return function(placename)
local key = placename
if divtype_suffix then
if not key:find("^" .. divtype_suffix) then --th: changed to prepend instead of append
key = divtype_suffix .. key --th
end
end
if container_suffix then
key = container_suffix .. key --th
end
return key
end
end
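--[=[
Example (a minimal sketch, not part of the module): the prepend-if-absent check used for the division type above,
shown in isolation. The helper `add_divtype` and the placename are hypothetical; the division word "เทศมณฑล" is the one
mentioned in the documentation of `make_key_to_placename`.

```lua
local divtype = "เทศมณฑล"

-- Hypothetical helper mirroring the check in the generated function.
local function add_divtype(placename)
	-- "^" anchors the match at the start; the division word contains no
	-- characters that are special in Lua patterns.
	if not placename:find("^" .. divtype) then
		return divtype .. placename
	end
	return placename
end

local bare = add_divtype("คังว็อน")
local already_prefixed = add_divtype("เทศมณฑลคังว็อน")
print(bare == already_prefixed)
```

Both calls yield the same key, so callers may pass the placename with or without the division word.
]=]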
--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do
the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil}
or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left
as-is. Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "จังหวัด"}`.
]=]
local function make_canonicalize_key_container(suffix, placetype)
return function(container)
if type(container) == "string" then
return {key = container .. (suffix or ""), placetype = placetype}
else
return container
end
end
end
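--[=[
Example (a minimal sketch, not part of the module): the canonicalization rule above, written as a plain function so it
can be run on its own. The helper `canonicalize` is hypothetical; the values are the ones from the documentation above.

```lua
-- Hypothetical non-curried version of the generated function.
local function canonicalize(container, suffix, placetype)
	if type(container) == "string" then
		-- A bare string gets the suffix and the default placetype.
		return {key = container .. (suffix or ""), placetype = placetype}
	end
	-- A table is assumed to be canonical already and is passed through.
	return container
end

local from_string = canonicalize("Ontario", ", Canada", "จังหวัด")
local explicit = {key = "Yukon, Canada", placetype = "ดินแดน"}
local from_table = canonicalize(explicit, ", Canada", "จังหวัด")
print(from_string.key, from_string.placetype)
print(from_table == explicit)
```

Passing a table is the way to give a container a placetype other than the default one.
]=]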
-----------------------------------------------------------------------------------
-- Top-level tables --
-----------------------------------------------------------------------------------
export.continents = {
["โลก"] = {the = true, placetype = "ดาวเคราะห์", addl_parents = {"ธรรมชาติ"},
fulldesc = "=the planet [[Earth]] and the features found on it"},
["แอฟริกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
["อเมริกา"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"},
keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined",
wp = "Americas"},
["อเมริกาส์"] = {alias_of = "อเมริกา", the = true},
["อเมริกาเหนือ"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
["แคริบเบียน"] = {the = true, placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
["อเมริกากลาง"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
["อเมริกาใต้"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
["แอนตาร์กติกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"},
fulldesc = "=the territory of [[Antarctica]]"},
["ยูเรเชีย"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"},
keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
["เอเชีย"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
["ยุโรป"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
["โอเชียเนีย"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
["เมลานีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
["ไมโครนีเชีย (ภูมิภาค)"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}}, -- name clash: region vs. federation
["พอลินีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
}
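--[=[
Example (a minimal sketch, not part of the module): each entry above names at most one container, so the containers of
a place form a chain that ends at "โลก". The table below copies four entries from `export.continents` in reduced form;
the helper `container_chain` is hypothetical and is not how the module itself iterates over containers.

```lua
local continents = {
	["โลก"] = {placetype = "ดาวเคราะห์"},
	["อเมริกา"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"}},
	["อเมริกาเหนือ"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
	["อเมริกากลาง"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
}

-- Collect the keys of all containers, innermost first.
local function container_chain(key)
	local chain = {}
	local spec = continents[key]
	while spec and spec.container do
		chain[#chain + 1] = spec.container.key
		spec = continents[spec.container.key]
	end
	return chain
end

local chain = container_chain("อเมริกากลาง")
print(table.concat(chain, " > "))
```
]=]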
export.continents_group = {
default_overriding_bare_label_parents = {}, -- container parents should be used
default_divs = {{type = "ประเทศ", prep = "ใน"}},
-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
default_no_include_container_in_desc = true,
default_no_container_cat = true,
default_no_container_parent = true,
default_no_auto_augment_container = true,
default_no_generic_place_cat = true,
-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
-- this level. We also run into problems with supercontinents, which have "ทวีป" as the fallback and cause
-- mismatches.
default_no_check_holonym_mismatch = true,
data = export.continents,
}
-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = {
["อัฟกานิสถาน"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["แอลเบเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล", "communes",
{type = "administrative units", cat_as = "communes"},
}, british_spelling = true},
["แอลจีเรีย"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes", "อำเภอ", "เทศบาล"}},
["อันดอร์รา"] = {container = "ยุโรป", divs = {"parishes"}, british_spelling = true},
["แองโกลา"] = {container = "แอฟริกา", divs = {"จังหวัด", "เทศบาล"}},
["แอนทีกาและบาร์บิวดา"] = {container = "แคริบเบียน", divs = {"จังหวัด"}, british_spelling = true},
["อาร์เจนตินา"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}},
["อาร์มีเนีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ", "เทศบาล"},
british_spelling = true},
["สาธารณรัฐอาร์มีเนีย"] = {alias_of = "อาร์มีเนีย", the = true}, -- differs in "the"
-- Both a country and continent
["ออสเตรเลีย"] = {container = "โอเชียเนีย", divs = {
{type = "รัฐ", cat_as = "states and territories"},
{type = "ดินแดน", cat_as = "states and territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"},
"local government areas", "dependent territories",
}, british_spelling = true},
["ออสเตรีย"] = {container = "ยุโรป", divs = {"รัฐ", "อำเภอ", "เทศบาล"}, british_spelling = true},
["อาเซอร์ไบจาน"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ", "เทศบาล"}, british_spelling = true},
["บาฮามาส"] = {the = true, container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true, wp = "The %l"},
["บาห์เรน"] = {container = "เอเชีย", divs = {"governorates"}},
["บังกลาเทศ"] = {container = "เอเชีย", divs = {"divisions", "อำเภอ", "เทศบาล"}, british_spelling = true},
["บาร์เบโดส"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["เบลารุส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["เบลเยียม"] = {container = "ยุโรป", divs = {"ภูมิภาค", "จังหวัด", "เทศบาล"}, british_spelling = true},
["เบลีซ"] = {container = "อเมริกากลาง", divs = {"อำเภอ"}, british_spelling = true},
["เบนิน"] = {container = "แอฟริกา", divs = {"departments", "communes"}},
["ภูฏาน"] = {container = "เอเชีย", divs = {"อำเภอ", "gewogs"}},
["โบลิเวีย"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}},
["บอสเนียและเฮอร์เซโกวีนา"] = {container = "ยุโรป", divs = {"entities", "cantons", "เทศบาล"}, british_spelling = true},
--["Bosnia and Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
["บอสเนีย-เฮอร์เซโกวีนา"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
--["Bosnia-Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
["บอสเนีย"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
["บอตสวานา"] = {container = "แอฟริกา", divs = {"อำเภอ", "ตำบล"}, british_spelling = true},
["บราซิล"] = {container = "อเมริกาใต้", divs = {
"รัฐ", "เทศบาล", "macroregions",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["บรูไน"] = {container = "เอเชีย", divs = {"อำเภอ", "mukims"}, british_spelling = true},
["บัลแกเรีย"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศบาล"}, british_spelling = true},
["บูร์กินาฟาโซ"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments", "จังหวัด"}},
["บุรุนดี"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes"}},
["กัมพูชา"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["แคเมอรูน"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["แคนาดา"] = {container = "อเมริกาเหนือ", divs = {
{type = "รัฐ", cat_as = "รัฐและดินแดน"}, -- following Thai Wikipedia (thwiki)
{type = "ดินแดน", cat_as = "รัฐและดินแดน"},
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of รัฐและดินแดน"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of รัฐและดินแดน"},
"เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities",
"rural municipalities", "parishes",
-- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless
-- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is
-- still at [[w:Indian reserves]]).
"Indian reserves",
"census divisions",
{type = "townships", prep = "ใน"},
},
british_spelling = true},
["กาบูเวร์ดี"] = {container = "แอฟริกา", divs = {"เทศบาล", "parishes"}},
["เคปเวิร์ด"] = {alias_of = "กาบูเวร์ดี", display = true},
["สาธารณรัฐแอฟริกากลาง"] = {the = true, container = "แอฟริกา", divs = {"prefectures", "subprefectures"}},
["CAR"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true},
["C.A.R"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true},
["ชาด"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["ชิลี"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "communes"}},
["จีน"] = {container = "เอเชีย", divs = {
{type = "มณฑล", cat_as = "provinces and autonomous regions"}, -- following Thai Wikipedia (thwiki)
{type = "autonomous regions", cat_as = "provinces and autonomous regions"},
{type = "FORMER provinces", cat_as = "former provinces"},
"special administrative regions",
"จังหวัด", -- following Thai Wikipedia (thwiki)
{type = "FORMER prefectures", cat_as = "former prefectures"},
"prefecture-level cities",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
{type = "FORMER counties", cat_as = "former counties and county-level cities"},
{type = "FORMER county-level cities", cat_as = "former counties and county-level cities"},
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities.
"อำเภอ",
{type = "FORMER districts", cat_as = "former districts"},
"ตำบล",
"townships",
"เทศบาล",
{type = "direct-administered municipalities", cat_as = "เทศบาล"},
}},
["สาธารณรัฐประชาชนจีน"] = {alias_of = "จีน", the = true}, -- differs in "the"
["โคลอมเบีย"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}},
["คอโมโรส"] = {the = true, container = "แอฟริกา", divs = {"autonomous islands"}},
["คอสตาริกา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "cantons"}},
["โครเอเชีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true},
["คิวบา"] = {container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"}},
["ไซปรัส"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, british_spelling = true},
["สาธารณรัฐเช็ก"] = {the = true, container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ", "เทศบาล"}, british_spelling = true},
["เช็กเกีย"] = {alias_of = "สาธารณรัฐเช็ก"}, -- differs in "the"
["สาธารณรัฐประชาธิปไตยคองโก"] = {the = true, container = "แอฟริกา", divs = {"จังหวัด", "ดินแดน"}},
["คองโก"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true},
["DRC"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true},
["D.R.C"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true},
["เดนมาร์ก"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "dependent territories"},
british_spelling = true,
-- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country)
},
["จิบูตี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}},
["ดอมินีกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["สาธารณรัฐโดมินิกัน"] = {the = true, container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"},
keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"},
["ติมอร์-เลสเต"] = {container = "เอเชีย", divs = {"เทศบาล"}, wp = "ติมอร์-เลสเต"},
["ติมอร์ตะวันออก"] = {alias_of = "ติมอร์-เลสเต", display = true},
["เอกวาดอร์"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "cantons"}},
["อียิปต์"] = {container = "แอฟริกา", divs = {"governorates", "ภูมิภาค"}, british_spelling = true},
["เอลซัลวาดอร์"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}},
["อิเควทอเรียลกินี"] = {container = "แอฟริกา", divs = {"จังหวัด"}},
["เอริเทรีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "subregions"}},
["เอสโตเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true},
["เอสวาตินี"] = {container = "แอฟริกา", british_spelling = true},
["สวาซีแลนด์"] = {alias_of = "เอสวาตินี", display = true},
["เอธิโอเปีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "zones"}},
["สหพันธรัฐไมโครนีเชีย"] = {the = true, container = "ไมโครนีเชีย", divs = {"รัฐ"}},
["ไมโครนีเชีย"] = {alias_of = "สหพันธรัฐไมโครนีเชีย"}, -- name clash: region vs. federation
["ฟีจี"] = {container = "เมลานีเชีย", divs = {"divisions", "จังหวัด"}, british_spelling = true},
["ฟินแลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true},
["ฝรั่งเศส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "cantons", "collectivities",
"communes",
{type = "เทศบาล", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
"dependent territories", "ดินแดน", "จังหวัด",
}, british_spelling = true},
["กาบอง"] = {container = "แอฟริกา", divs = {"จังหวัด", "departments"}},
["แกมเบีย"] = {the = true, container = "แอฟริกา", divs = {"divisions", "อำเภอ"}, british_spelling = true, wp = "The %l"},
["จอร์เจีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"ภูมิภาค", "อำเภอ"},
keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"},
["เยอรมนี"] = {container = "ยุโรป", divs = {
"รัฐ",
-- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but
-- there aren't really enough of them to categorize per state.
"ภูมิภาค",
"เทศบาล", "อำเภอ"}, british_spelling = true},
["กานา"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["กรีซ"] = {container = "ยุโรป", divs = {"ภูมิภาค", "regional units", "เทศบาล",
{type = "peripheries", cat_as = {"ภูมิภาค"}},
}, british_spelling = true},
["กรีเนดา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["กัวเตมาลา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "เทศบาล"}},
["กินี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures"}},
["กินี-บิสเซา"] = {container = "แอฟริกา", divs = {"ภูมิภาค"}},
["กายอานา"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค"}, british_spelling = true},
["เฮติ"] = {container = "แคริบเบียน", divs = {"departments", "arrondissements"}},
["ฮอนดูรัส"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}},
["ฮังการี"] = {container = "ยุโรป", divs = {"เทศมณฑล", "อำเภอ"}, british_spelling = true},
["ไอซ์แลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "เทศมณฑล"}, british_spelling = true},
["อินเดีย"] = {container = "เอเชีย", divs = {
{type = "รัฐ", cat_as = "states and union territories"},
{type = "union territories", cat_as = "states and union territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"},
{type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"},
"divisions", "อำเภอ", "เทศบาล",
}, british_spelling = true},
["อินโดนีเซีย"] = {container = "เอเชีย", divs = {"regencies", "จังหวัด",
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"},
}},
["อิหร่าน"] = {container = "เอเชีย", divs = {"จังหวัด", "เทศมณฑล"}},
["อิรัก"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["ไอร์แลนด์"] = {container = "ยุโรป", addl_parents = {"British Isles"},
divs = {"เทศมณฑล", "อำเภอ", "จังหวัด"}, british_spelling = true, wp = "Republic of %l"},
["สาธารณรัฐไอร์แลนด์"] = {alias_of = "ไอร์แลนด์", the = true}, -- differs in "the"
["อิสราเอล"] = {container = "เอเชีย", divs = {"อำเภอ"}},
["อิตาลี"] = {container = "ยุโรป", divs = {
"ภูมิภาค", "จังหวัด", "metropolitan cities", "เทศบาล",
{type = "autonomous regions", cat_as = "ภูมิภาค"},
}, british_spelling = true},
["โกตดิวัวร์"] = {container = "แอฟริกา", divs = {"อำเภอ", "ภูมิภาค"}},
-- We should really be using Ivory Coast (common name) but there are political ramifications to the use of
-- Côte d'Ivoire so don't make it a display alias.
["ไอวอรีโคสต์"] = {alias_of = "โกตดิวัวร์"},
["จาเมกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["ญี่ปุ่น"] = {container = "เอเชีย", divs = {"จังหวัด", "กิ่งจังหวัด", "เทศบาล"}},
["จอร์แดน"] = {container = "เอเชีย", divs = {"governorates"}},
["คาซัคสถาน"] = {container = {"เอเชีย", "ยุโรป"}, divs = {"ภูมิภาค", "อำเภอ"}},
["เคนยา"] = {container = "แอฟริกา", divs = {"เทศมณฑล"}, british_spelling = true},
["Kiribati"] = {container = "ไมโครนีเชีย", british_spelling = true},
["Kosovo"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล"}, british_spelling = true},
["Kuwait"] = {container = "เอเชีย", divs = {"governorates", "areas"}},
["Kyrgyzstan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}},
["Laos"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["Latvia"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true},
["Lebanon"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["Lesotho"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true},
["Liberia"] = {container = "แอฟริกา", divs = {"เทศมณฑล", "อำเภอ"}},
["Libya"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศบาล"}},
["Liechtenstein"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true},
["Lithuania"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true},
["Luxembourg"] = {container = "ยุโรป", divs = {"cantons", "อำเภอ"}, british_spelling = true},
["Madagascar"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}},
["Malawi"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["Malaysia"] = {container = "เอเชีย", divs = {"รัฐ", "federal territories", "อำเภอ"}, british_spelling = true},
["Maldives"] = {the = true, container = "เอเชีย", divs = {"จังหวัด", "administrative atolls"}, british_spelling = true},
["Mali"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "cercles"}},
["Malta"] = {container = "ยุโรป", divs = {"ภูมิภาค", "local councils"}, british_spelling = true},
["Marshall Islands"] = {the = true, container = "ไมโครนีเชีย", divs = {"เทศบาล"}},
["Mauritania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["Mauritius"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true},
["Mexico"] = {container = "อเมริกาเหนือ", addl_parents = {"อเมริกากลาง"}, divs = {
"รัฐ", "เทศบาล",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["Moldova"] = {container = "ยุโรป", divs = {
{type = "อำเภอ", cat_as = "districts and autonomous territorial units"},
{type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"},
"communes", "เทศบาล",
}, british_spelling = true},
["Monaco"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป",
-- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we
-- want its parent to be "countries in Europe".
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"},
is_city = true, british_spelling = true},
["Mongolia"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["Montenegro"] = {container = "ยุโรป", divs = {"เทศบาล"}},
["Morocco"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures", "จังหวัด"}},
["Mozambique"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}},
["Myanmar"] = {container = "เอเชีย",
divs = {"ภูมิภาค", "รัฐ", "union territories",
{type = "self-administered zones", cat_as = "self-administered areas"},
{type = "self-administered divisions", cat_as = "self-administered areas"},
"อำเภอ"}},
["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations
["Namibia"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "constituencies"}, british_spelling = true},
["Nauru"] = {container = "ไมโครนีเชีย", divs = {"อำเภอ"}, british_spelling = true},
["Nepal"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["เนเธอร์แลนด์"] = {the = true, placetype = {"ประเทศ", "constituent country"}, container = "ยุโรป",
divs = {"จังหวัด", "เทศบาล",
{type = "FORMER municipalities", cat_as = "former municipalities"},
"dependent territories", "constituent countries"}, british_spelling = true,
-- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]]
-- (country)
},
["New Zealand"] = {container = "พอลินีเชีย", divs = {
"ภูมิภาค", "dependent territories", "territorial authorities",
{type = "อำเภอ", cat_as = "territorial authorities"},
},
british_spelling = true},
["Nicaragua"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}},
["Niger"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["Nigeria"] = {container = "แอฟริกา", divs = {
"รัฐ",
-- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize
-- everything under 'states and territories' but that seems a bit pointless.
{type = "federal territories", cat_as = "รัฐ"},
"local government areas",
}, british_spelling = true},
["North Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล"}},
["North Macedonia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true},
["Macedonia"] = {alias_of = "North Macedonia", display = true},
["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Norway"] = {container = "ยุโรป",
divs = {"เทศมณฑล", "เทศบาล", "dependent territories", "อำเภอ", "unincorporated areas"},
british_spelling = true},
["Oman"] = {container = "เอเชีย", divs = {"governorates", "จังหวัด"}},
["Pakistan"] = {container = "เอเชีย", divs = {
{type = "จังหวัด", cat_as = "provinces and territories"},
{type = "administrative territories", cat_as = "provinces and territories"},
{type = "federal territories", cat_as = "provinces and territories"},
{type = "ดินแดน", cat_as = "provinces and territories"},
"divisions", "อำเภอ",
}, british_spelling = true},
["Palau"] = {container = "ไมโครนีเชีย", divs = {"รัฐ"}},
["Palestine"] = {container = "เอเชีย", divs = {"governorates"}},
["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the"
["Panama"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "อำเภอ"}},
["Papua New Guinea"] = {container = "เมลานีเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Paraguay"] = {container = "อเมริกาใต้", divs = {"departments", "อำเภอ"}},
["Peru"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ"}},
["Philippines"] = {the = true, container = "เอเชีย", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ", "เทศบาล", "barangays"}},
["Poland"] = {divs = {"voivodeships", "เทศมณฑล",
{type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}},
}, container = "ยุโรป", british_spelling = true},
["Portugal"] = {container = "ยุโรป", divs = {
{type = "autonomous regions", cat_as = "districts and autonomous regions"},
{type = "อำเภอ", cat_as = "districts and autonomous regions"},
"จังหวัด", "เทศบาล"}, british_spelling = true},
["Qatar"] = {container = "เอเชีย", divs = {"เทศบาล", "zones"}},
["Republic of the Congo"] = {the = true, container = "แอฟริกา", divs = {"departments", "อำเภอ"}},
["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true},
["Romania"] = {container = "ยุโรป", divs = {
"ภูมิภาค", "เทศมณฑล", "communes",
{type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"},
}, british_spelling = true},
["Russia"] = {container = {"ยุโรป", "เอเชีย"}, divs = {
"federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities",
"อำเภอ", "federal districts"},
british_spelling = true},
["Rwanda"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}},
["Saint Kitts and Nevis"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["Saint Kitts"] = {alias_of = "Saint Kitts and Nevis", display = true},
["Saint Lucia"] = {container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true},
["Saint Vincent and the Grenadines"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["Saint Vincent"] = {alias_of = "Saint Vincent and the Grenadines", display = true},
["SVG"] = {alias_of = "Saint Vincent and the Grenadines", display = true},
["S.V.G"] = {alias_of = "Saint Vincent and the Grenadines", display = true},
["Samoa"] = {container = "พอลินีเชีย", divs = {"อำเภอ"}, british_spelling = true},
["San Marino"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true},
["São Tomé and Príncipe"] = {container = "แอฟริกา", divs = {"อำเภอ"}},
["São Tome and Principe"] = {alias_of = "São Tomé and Príncipe", display = true},
["São Tomé"] = {alias_of = "São Tomé and Príncipe", display = true},
["São Tome"] = {alias_of = "São Tomé and Príncipe", display = true},
["Saudi Arabia"] = {container = "เอเชีย", divs = {"จังหวัด", "governorates"}},
["Senegal"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["Serbia"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล", "autonomous provinces"}},
["Seychelles"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true},
["Sierra Leone"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Singapore"] = {container = "เอเชีย", divs = {"อำเภอ", "ภูมิภาค"}, british_spelling = true},
["Slovakia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["Slovenia"] = {container = "ยุโรป", divs = {"statistical regions", "เทศบาล"}, british_spelling = true},
-- Note: While the official name does not include "the" at the beginning,
-- it sounds strange in English to leave it out and it's commonly included.
["Solomon Islands"] = {the = true, container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true},
["โซมาเลีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}},
["South Africa"] = {container = "แอฟริกา", divs = {
"จังหวัด",
"อำเภอ",
{type = "district municipalities", cat_as = "อำเภอ"},
{type = "metropolitan municipalities", cat_as = "อำเภอ"},
"เทศบาล",
}, british_spelling = true},
["South Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล", "อำเภอ"}},
["South Sudan"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "รัฐ", "เทศมณฑล"}, british_spelling = true},
["Spain"] = {container = "ยุโรป", divs = {"autonomous communities", "จังหวัด", "เทศบาล",
"comarcas", "autonomous cities"},
british_spelling = true},
["Sri Lanka"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Sudan"] = {container = "แอฟริกา", divs = {"รัฐ", "อำเภอ"}, british_spelling = true},
["Suriname"] = {container = "อเมริกาใต้", divs = {"อำเภอ"}},
["Sweden"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศมณฑล", "เทศบาล"}, british_spelling = true},
["Switzerland"] = {container = "ยุโรป", divs = {"cantons", "เทศบาล", "อำเภอ"}, british_spelling = true},
["Syria"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["ไต้หวัน"] = {container = "เอเชีย", divs = {"เทศมณฑล", "อำเภอ", "townships", "special municipalities"}},
["สาธารณรัฐจีน"] = {alias_of = "ไต้หวัน", the = true}, -- differs in "the", different political connotations
["Tajikistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}},
["Tanzania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["ไทย"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "ตำบล"}},
["Togo"] = {container = "แอฟริกา", divs = {"จังหวัด", "prefectures"}},
["Tonga"] = {container = "พอลินีเชีย", divs = {"divisions"}, british_spelling = true},
["Trinidad and Tobago"] = {container = "แคริบเบียน", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true},
["Tunisia"] = {container = "แอฟริกา", divs = {"governorates", "delegations"}},
["Turkey"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ"}},
-- Foreign names generally get display-canonicalized.
["Türkiye"] = {alias_of = "Turkey", display = true},
["Turkmenistan"] = {container = "เอเชีย", divs = {
-- The 5 regions are often also called provinces
"ภูมิภาค", {type = "จังหวัด", cat_as = "ภูมิภาค"}, "อำเภอ"},
},
["Tuvalu"] = {container = "พอลินีเชีย", divs = {"atolls"}, british_spelling = true},
["Uganda"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศมณฑล"}, british_spelling = true},
["Ukraine"] = {container = "ยุโรป", divs = {
{type = "oblasts", cat_as = "oblasts and autonomous republics"},
{type = "autonomous republics", cat_as = "oblasts and autonomous republics"},
"raions", "hromadas",
}, british_spelling = true},
["United Arab Emirates"] = {the = true, container = "เอเชีย", divs = {"emirates"}},
-- Abbreviations get display-canonicalized.
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true},
["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true},
["สหราชอาณาจักร"] = {the = true, container = "ยุโรป", addl_parents = {"British Isles"},
divs = {"constituent countries", "เทศมณฑล", "อำเภอ", "boroughs", "ดินแดน", "dependent territories",
"traditional counties"},
keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true},
-- Abbreviations get display-canonicalized.
["UK"] = {alias_of = "สหราชอาณาจักร", display = true, the = true},
["U.K."] = {alias_of = "สหราชอาณาจักร", display = true, the = true},
["สหรัฐอเมริกา"] = {the = true, container = "อเมริกาเหนือ",
divs = {"เทศมณฑล", "county seats", "รัฐ", "ดินแดน", "dependent territories",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
{type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"},
{type = "NICKNAME_FOR states", cat_as = "nicknames for states"},
{type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"},
{type = "boroughs", prep = "ใน"}, -- exist in Pennsylvania and New Jersey
"เทศบาล", -- these exist politically at least in Colorado and Connecticut
{type = "census-designated places", prep = "ใน"},
{type = "unincorporated communities", prep = "ใน"},
-- Don't change the following to something more politically correct until/unless the US government makes a
-- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at
-- [[w:Indian reservations]]).
"Indian reservations",
}},
-- Abbreviations and long forms (when possible) get display-canonicalized.
["US"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["U.S."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["USA"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["U.S.A."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["สหรัฐ"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["Uruguay"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}},
["Uzbekistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}},
["Vanuatu"] = {container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true},
["Vatican City"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป",
-- The first placetype should be 'city-state' so that it shows up in the description,
-- but the parent category should still be "countries in Europe".
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"},
addl_parents = {"Rome"}, is_city = true, british_spelling = true},
["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the"
["Venezuela"] = {container = "อเมริกาใต้", divs = {"รัฐ", "เทศบาล"}},
["เวียดนาม"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "เทศบาล"}},
["Western Sahara"] = {placetype = {"ดินแดน", "ประเทศ"}, container = "แอฟริกา",
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"},
},
-- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara
["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true},
["SADR"] = {alias_of = "Sahrawi Arab Democratic Republic", display = true, the = true},
["Yemen"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["Zambia"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Zimbabwe"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
}
local function canonicalize_continent_container(key)
if type(key) ~= "string" then
return key
end
if export.continents[key] then
return {key = key, placetype = export.continents[key].placetype}
end
internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end
export.countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"+++", "ประเทศ"},
default_placetype = "ประเทศ",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.countries,
}
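-- Illustrative sketch only (not used by this module): one way the `alias_of` entries in `export.countries` could be
-- followed to their canonical key, including chains such as "SADR" -> "Sahrawi Arab Democratic Republic" ->
-- "Western Sahara". The helper name `resolve_alias` and the miniature `sample` table are hypothetical; the real
-- lookup code lives elsewhere and may differ, e.g. in how it treats `display = true`.
--[==[
local sample = {
	["Western Sahara"] = {container = "แอฟริกา"},
	["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true},
	["SADR"] = {alias_of = "Sahrawi Arab Democratic Republic", display = true, the = true},
}

local function resolve_alias(data, key)
	local seen = {}
	-- Follow `alias_of` links until reaching an entry that is not an alias.
	while data[key] and data[key].alias_of do
		if seen[key] then
			error("alias cycle at " .. key)
		end
		seen[key] = true
		key = data[key].alias_of
	end
	return key
end

assert(resolve_alias(sample, "SADR") == "Western Sahara")
assert(resolve_alias(sample, "Western Sahara") == "Western Sahara")
]==]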
-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
-- British Overseas Territory
["Akrotiri and Dhekelia"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"ไซปรัส", "ยุโรป", "เอเชีย"},
british_spelling = true,
},
-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
-- [[w:List of sovereign states and dependent territories by continent]].
-- unincorporated territory of the United States
["American Samoa"] = {
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"พอลินีเชีย"},
},
-- British Overseas Territory
["Anguilla"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["Abkhazia"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"Georgia", "ยุโรป", "เอเชีย"},
divs = {"อำเภอ"},
keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- Australian external territory
["Ashmore and Cartier Islands"] = {
the = true,
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"เอเชีย"},
},
-- constituent country of the Netherlands
["Aruba"] = {
placetype = {"constituent country", "ประเทศ"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- British Overseas Territory
["Bermuda"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"อเมริกาเหนือ"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Bonaire"] = {
placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
is_city = true,
british_spelling = true,
},
-- British Overseas Territory
["British Indian Ocean Territory"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"เอเชีย"},
british_spelling = true,
},
-- British Overseas Territory
["British Virgin Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- Norwegian dependent territory
["Bouvet Island"] = {
placetype = {"dependent territory", "ดินแดน"},
container = "Norway",
addl_parents = {"แอฟริกา"},
british_spelling = true,
},
-- British Overseas Territory
["Cayman Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- Australian external territory
["Christmas Island"] = {
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"เอเชีย"},
british_spelling = true,
},
-- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the
-- French Southern and Antarctic Lands.
["Clipperton Island"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "ฝรั่งเศส",
addl_parents = {"อเมริกาเหนือ"},
},
-- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands
["Cocos Islands"] = {
the = true,
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"เอเชีย"},
wp = "Cocos (Keeling) Islands",
british_spelling = true,
},
["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
-- self-governing but in free association with New Zealand
["Cook Islands"] = {
the = true,
placetype = {"ประเทศ"},
container = "New Zealand",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- constituent country of the Netherlands
["Curaçao"] = {
placetype = {"constituent country", "ประเทศ"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- special territory of Chile
["Easter Island"] = {
placetype = {"special territory", "ดินแดน"},
container = "ชิลี",
addl_parents = {"พอลินีเชีย"},
},
-- British Overseas Territory
["Falkland Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"อเมริกาใต้"},
british_spelling = true,
},
-- autonomous territory of Denmark
["Faroe Islands"] = {
the = true,
placetype = {"autonomous territory", "ดินแดน"},
container = "เดนมาร์ก",
addl_parents = {"ยุโรป"},
british_spelling = true,
},
-- overseas department and region of France
["French Guiana"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"อเมริกาใต้"},
british_spelling = true,
},
-- overseas collectivity of France
["French Polynesia"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- French overseas territory
["French Southern and Antarctic Lands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "ฝรั่งเศส",
addl_parents = {"แอฟริกา"},
},
-- British Overseas Territory
["Gibraltar"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"ยุโรป"},
is_city = true,
british_spelling = true,
},
-- autonomous territory of Denmark
["Greenland"] = {
placetype = {"autonomous territory", "ดินแดน"},
container = "เดนมาร์ก",
addl_parents = {"อเมริกาเหนือ"},
divs = {"เทศบาล"},
british_spelling = true,
},
-- overseas department and region of France
["Guadeloupe"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
addl_parents = {"แคริบเบียน"},
divs = {"communes"},
british_spelling = true,
},
-- unincorporated territory of the United States
["Guam"] = {
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"ไมโครนีเชีย"},
},
-- self-governing British Crown dependency; technically called the Bailiwick of Guernsey
["Guernsey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"British Isles", "ยุโรป"},
british_spelling = true,
wp = "Bailiwick of %l",
},
["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true},
-- Australian external territory
["Heard Island and McDonald Islands"] = {
the = true,
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"แอฟริกา"},
},
-- special administrative region of China
["Hong Kong"] = {
placetype = {"special administrative region", "นคร"},
container = "จีน",
is_city = true,
british_spelling = true,
},
-- self-governing British Crown dependency
["Isle of Man"] = {
the = true,
placetype = {"crown dependency", "dependency", "dependent territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"British Isles", "ยุโรป"},
british_spelling = true,
},
-- Norwegian unincorporated area
["Jan Mayen"] = {
placetype = {"unincorporated area", "dependent territory", "ดินแดน", "เกาะ"},
container = "Norway",
addl_parents = {"ยุโรป"},
british_spelling = true,
},
-- self-governing British Crown dependency; technically called the Bailiwick of Jersey
["Jersey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"British Isles", "ยุโรป"},
british_spelling = true,
},
["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true},
-- special administrative region of China
["Macau"] = {
placetype = {"special administrative region", "นคร"},
container = "จีน",
is_city = true,
british_spelling = true,
},
-- overseas department and region of France
["Martinique"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- overseas department and region of France
["Mayotte"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"แอฟริกา"},
british_spelling = true,
},
-- British Overseas Territory
["Montserrat"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- special collectivity of France
["New Caledonia"] = {
placetype = {"special collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"เมลานีเชีย"},
british_spelling = true,
},
-- dependent territory of New Zealand
["New Zealand Subantarctic Islands"] = {
the = true,
placetype = {"dependent territory", "ดินแดน"},
container = "New Zealand",
addl_parents = {"แอนตาร์กติกา"},
british_spelling = true,
},
-- self-governing but in free association with New Zealand
["Niue"] = {
placetype = {"ประเทศ"},
container = "New Zealand",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- Australian external territory
["Norfolk Island"] = {
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Cyprus
["Northern Cyprus"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"ไซปรัส", "Turkey", "ยุโรป", "เอเชีย"},
divs = {"อำเภอ"},
keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]",
british_spelling = true,
},
-- commonwealth, unincorporated territory of the United States
["Northern Mariana Islands"] = {
the = true,
placetype = {"commonwealth", "unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"ไมโครนีเชีย"},
},
-- British Overseas Territory
["Pitcairn Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- commonwealth of the United States
["Puerto Rico"] = {
placetype = {"commonwealth", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"แคริบเบียน"},
divs = {"เทศบาล"},
},
-- overseas department and region of France
["Réunion"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"แอฟริกา"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Saba"] = {
placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
is_city = true,
british_spelling = true,
},
-- overseas collectivity of France
["Saint Barthélemy"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- British Overseas Territory
["Saint Helena, Ascension and Tristan da Cunha"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
divs = {{type = "constituent parts", container_parent_type = false}},
addl_parents = {"มหาสมุทรแอตแลนติก", "แอฟริกา"},
british_spelling = true,
},
-- constituent parts of the combined overseas territory
["Ascension Island"] = {
placetype = {"constituent part", "ดินแดน", "เกาะ"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"มหาสมุทรแอตแลนติก"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Saint Helena"] = {
placetype = {"constituent part", "ดินแดน", "เกาะ"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"มหาสมุทรแอตแลนติก"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Tristan da Cunha"] = {
placetype = {"constituent part", "ดินแดน", "archipelago"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"มหาสมุทรแอตแลนติก"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
-- overseas collectivity of France
["Saint Martin"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- overseas collectivity of France
["Saint Pierre and Miquelon"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"อเมริกาเหนือ"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Sint Eustatius"] = {
placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
is_city = true,
british_spelling = true,
},
-- constituent country of the Netherlands
["Sint Maarten"] = {
placetype = {"constituent country", "ประเทศ"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Somalia
["Somaliland"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"โซมาเลีย", "แอฟริกา"},
keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]",
british_spelling = true,
},
-- British Overseas Territory
-- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for
-- "Saint Helena, Ascension and Tristan da Cunha".
["South Georgia"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"มหาสมุทรแอตแลนติก"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["South Ossetia"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"Georgia", "ยุโรป", "เอเชีย"},
keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- British Overseas Territory
["South Sandwich Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"มหาสมุทรแอตแลนติก"},
wp = true,
wpcat = "South Georgia and the South Sandwich Islands",
british_spelling = true,
},
-- Norwegian unincorporated area
["Svalbard"] = {
placetype = {"unincorporated area", "dependent territory", "ดินแดน", "archipelago"},
container = "Norway",
addl_parents = {"ยุโรป"},
british_spelling = true,
},
-- dependent territory of New Zealand
["Tokelau"] = {
placetype = {"dependent territory", "ดินแดน"},
container = "New Zealand",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Moldova
["Transnistria"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"Moldova", "ยุโรป"},
keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]",
british_spelling = true,
},
-- British Overseas Territory
["Turks and Caicos Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- unincorporated territory of the United States
["United States Minor Outlying Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"เกาะ", "ไมโครนีเชีย", "พอลินีเชีย", "แคริบเบียน"},
},
-- FIXME: We should add entries for the other minor outlying islands.
-- Baker Island (Oceania)
-- Howland Island (Oceania)
-- Jarvis Island (Oceania)
-- Johnston Atoll (Oceania)
-- Kingman Reef (Oceania)
-- Midway Atoll (Oceania)
-- Navassa Island (Caribbean)
-- Palmyra Atoll (Oceania)
-- Wake Island (Oceania; already has an entry below)
["Wake Island"] = {
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"ไมโครนีเชีย"},
},
-- unincorporated territory of the United States
["United States Virgin Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"แคริบเบียน"},
},
["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
-- overseas collectivity of France
["Wallis and Futuna"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
}
export.country_like_entities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Saint Helena, Ascension and Tristan da Cunha".
key_to_placename = false,
placename_to_key = false,
canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"),
default_overriding_bare_label_parents = {"country-like entities"},
default_no_container_cat = true,
default_no_container_parent = true,
-- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas
-- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village
-- in Europe.
default_no_auto_augment_container = true,
data = export.country_like_entities,
}
-- Former countries and such; we don't create "Cities in ..." categories because they don't exist anymore
export.former_countries = {
-- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan
-- (also known as Nagorno-Karabakh)
-- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out.
["Artsakh"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"อาเซอร์ไบจาน", "ยุโรป", "เอเชีย"},
keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]",
british_spelling = true,
},
["Nagorno-Karabakh"] = {alias_of = "Artsakh"},
["Czechoslovakia"] = {container = "ยุโรป", british_spelling = true},
["East Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true},
["เวียดนามเหนือ"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}},
["เปอร์เซีย"] = {placetype = {"จักรวรรดิ", "ประเทศ"}, container = "เอเชีย", divs = {"จังหวัด"}},
["Byzantine Empire"] = {
the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"},
addl_parents = {"Ancient Europe", "Ancient Near East"},
divs = {
"จังหวัด", "themes",
}},
["Roman Empire"] = {
the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"}, addl_parents = {"Rome"},
divs = {
"จังหวัด",
{type = "FORMER provinces", cat_as = "จังหวัด"},
}},
["เวียดนามใต้"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}},
["Soviet Union"] = {
the = true, container = {"ยุโรป", "เอเชีย"}, divs = {"republics", "autonomous republics"},
british_spelling = true},
["West Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true},
["Yugoslavia"] = {container = "ยุโรป", divs = {"อำเภอ"},
keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true},
}
export.former_countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"former countries and country-like entities"},
default_is_former_place = true,
default_placetype = "ประเทศ",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.former_countries,
}
-----------------------------------------------------------------------------------
-- Subpolity tables --
-----------------------------------------------------------------------------------
export.australia_states_and_territories = {
["Australian Capital Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"},
["Jervis Bay Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"},
["New South Wales, ออสเตรเลีย"] = {},
["Northern Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"},
["Queensland, ออสเตรเลีย"] = {},
["South Australia, ออสเตรเลีย"] = {},
["Tasmania, ออสเตรเลีย"] = {},
["Victoria, ออสเตรเลีย"] = {},
["Western Australia, ออสเตรเลีย"] = {},
}
-- states and territories of Australia
export.australia_group = {
default_container = "ออสเตรเลีย",
default_placetype = "รัฐ",
default_divs = "local government areas",
data = export.australia_states_and_territories,
}
export.austria_states = {
["Vienna, ออสเตรีย"] = {},
["Lower Austria, ออสเตรีย"] = {},
["Upper Austria, ออสเตรีย"] = {},
["Styria, ออสเตรีย"] = {},
["Tyrol, ออสเตรีย"] = {wp = "Tyrol (รัฐ)"},
["Carinthia, ออสเตรีย"] = {},
["Salzburg, ออสเตรีย"] = {wp = "Salzburg (รัฐ)"},
["Vorarlberg, ออสเตรีย"] = {},
["Burgenland, ออสเตรีย"] = {},
}
-- states of Austria
export.austria_group = {
default_container = "ออสเตรีย",
default_placetype = "รัฐ",
default_divs = "เทศบาล",
data = export.austria_states,
}
export.bangladesh_divisions = {
["Barisal Division, บังกลาเทศ"] = {},
["Chittagong Division, บังกลาเทศ"] = {},
["Dhaka Division, บังกลาเทศ"] = {},
["Khulna Division, บังกลาเทศ"] = {},
["Mymensingh Division, บังกลาเทศ"] = {},
["Rajshahi Division, บังกลาเทศ"] = {},
["Rangpur Division, บังกลาเทศ"] = {},
["Sylhet Division, บังกลาเทศ"] = {},
}
-- divisions of Bangladesh
export.bangladesh_group = {
key_to_placename = make_key_to_placename(", บังกลาเทศ$", " Division$"),
placename_to_key = make_placename_to_key(", บังกลาเทศ", " Division"),
default_container = "บังกลาเทศ",
default_placetype = "division",
default_divs = "อำเภอ",
data = export.bangladesh_divisions,
}
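-- Illustrative sketch only (not used by this module): what the two patterns passed to `make_key_to_placename` in
-- `export.bangladesh_group` are assumed to do, namely strip the container suffix and then the division-type suffix
-- from a key to yield the bare placename. The function `strip_suffixes` is hypothetical and uses plain
-- `string.gsub`; the real helper may behave differently.
--[==[
local function strip_suffixes(key, container_pattern, divtype_pattern)
	-- `gsub` returns the new string plus a match count; only the string is kept.
	local placename = key:gsub(container_pattern, "")
	placename = placename:gsub(divtype_pattern, "")
	return placename
end

assert(strip_suffixes("Dhaka Division, บังกลาเทศ", ", บังกลาเทศ$", " Division$") == "Dhaka")
]==]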
export.brazil_states = {
["Acre, บราซิล"] = {wp = "%l (รัฐ)"},
["Alagoas, บราซิล"] = {},
["Amapá, บราซิล"] = {},
["Amazonas, บราซิล"] = {wp = "%l (Brazilian state)"},
["Bahia, บราซิล"] = {},
["Ceará, บราซิล"] = {},
["Distrito Federal, บราซิล"] = {wp = "Federal District (Brazil)"},
["Espírito Santo, บราซิล"] = {},
["Goiás, บราซิล"] = {},
["Maranhão, บราซิล"] = {},
["Mato Grosso, บราซิล"] = {},
["Mato Grosso do Sul, บราซิล"] = {},
["Minas Gerais, บราซิล"] = {},
["Pará, บราซิล"] = {},
["Paraíba, บราซิล"] = {},
["Paraná, บราซิล"] = {wp = "%l (รัฐ)"},
["Pernambuco, บราซิล"] = {},
["Piauí, บราซิล"] = {},
["Rio de Janeiro, บราซิล"] = {wp = "%l (รัฐ)"},
["Rio Grande do Norte, บราซิล"] = {},
["Rio Grande do Sul, บราซิล"] = {},
["Rondônia, บราซิล"] = {},
["Roraima, บราซิล"] = {},
["Santa Catarina, บราซิล"] = {wp = "%l (รัฐ)"},
["São Paulo, บราซิล"] = {wp = "%l (รัฐ)"},
["Sergipe, บราซิล"] = {},
["Tocantins, บราซิล"] = {},
}
-- states of Brazil
export.brazil_group = {
default_container = "บราซิล",
default_placetype = "รัฐ",
default_divs = "เทศบาล",
data = export.brazil_states,
}
export.canada_provinces_and_territories = {
["Alberta, แคนาดา"] = {divs = {
{type = "municipal districts", container_parent_type = "rural municipalities"},
}},
["British Columbia, แคนาดา"] = {divs = {
{type = "regional districts", container_parent_type = false},
"regional municipalities",
}},
["Manitoba, แคนาดา"] = {divs = {"rural municipalities"}},
["New Brunswick, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
["Newfoundland and Labrador, แคนาดา"] = {},
["Northwest Territories, แคนาดา"] = {the = true, placetype = "ดินแดน"},
["Nova Scotia, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities"}},
["Nunavut, แคนาดา"] = {placetype = "ดินแดน"},
["Ontario, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities", {type = "townships", prep = "ใน"}}},
["Prince Edward Island, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", "rural municipalities"}},
["Saskatchewan, แคนาดา"] = {divs = {"rural municipalities"}},
["Quebec, แคนาดา"] = {divs = {
"เทศมณฑล",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
-- administrative regions have an official (but non-governmental) function but there don't appear to be any
-- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping
{type = "ภูมิภาค", container_parent_type = false},
{type = "townships", prep = "ใน"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}},
}},
["Yukon, แคนาดา"] = {placetype = "ดินแดน"},
["Yukon Territory, แคนาดา"] = {alias_of = "Yukon, แคนาดา", the = true},
}
-- provinces and territories of Canada
export.canada_group = {
default_container = "แคนาดา",
default_placetype = "รัฐ", -- following thwiki usage
data = export.canada_provinces_and_territories,
}
export.china_provinces_and_autonomous_regions = {
-- direct-administered municipalities are not here but below under prefecture-level cities
["Anhui, จีน"] = {},
["Fujian, จีน"] = {},
["Fuchien, จีน"] = {alias_of = "Fujian, จีน", display = true},
["Gansu, จีน"] = {},
["Guangdong, จีน"] = {},
["Guangxi, จีน"] = {placetype = "autonomous region"},
["Guizhou, จีน"] = {},
["Hainan, จีน"] = {},
["Hebei, จีน"] = {},
["Heilongjiang, จีน"] = {},
["Henan, จีน"] = {},
["Hubei, จีน"] = {},
["Hunan, จีน"] = {},
["Inner Mongolia, จีน"] = {placetype = "autonomous region"},
["Jiangsu, จีน"] = {},
["Jiangxi, จีน"] = {},
["Jilin, จีน"] = {},
["Liaoning, จีน"] = {},
["Ningxia, จีน"] = {placetype = "autonomous region"},
["Qinghai, จีน"] = {},
["Shaanxi, จีน"] = {},
["Shandong, จีน"] = {},
["Shanxi, จีน"] = {},
["Sichuan, จีน"] = {},
["Tibet, จีน"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"},
["Xinjiang, จีน"] = {placetype = "autonomous region"},
["Yunnan, จีน"] = {},
["Zhejiang, จีน"] = {},
}
-- provinces and autonomous regions of China
export.china_group = {
default_container = "จีน",
default_placetype = "มณฑล",
default_divs = {
"จังหวัด", "prefecture-level cities",
"อำเภอ", "ตำบล", "townships",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_provinces_and_autonomous_regions,
}
export.china_prefecture_level_cities = {
-- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an
-- administrative unit smaller than a province but bigger than a county, which is administratively controlled by
-- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior
-- to the mid-1980s, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the
-- western portion of China) have not yet been converted. Generally a given province is entirely tiled by
-- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se.
-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, again which bears
-- the same name as the county-level city).
--
-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
-- province level.
--
-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
-- metro area separated by suburban/exurban or rural land.
--
-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
-- a lot of obscure cities.
--
-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
-- was >= 1 million. These are sorted below by the urban agglomeration value (which is generally of the "adm-urb" =
-- "administrative area (urban population)" type). citypopulation.de sometimes groups nearby cities into a single
-- agglomeration, most notably in the case of the Pearl River Delta, which is grouped under Guangzhou with an
-- agglomeration population of 72,700,000 and includes a large number of nearby large cities (although for some
-- reason not Hong Kong, maybe due to the administrative issues involved). In addition, citypopulation.de includes
-- divisions under a prefecture-level city if they are city-like and have an agglomeration population of at least
-- 1 million; this includes several county-level cities, one county and one district (Wanzhou, a "district" of
-- Chongqing despite being 142 miles away). None of the county-level cities or counties have districts under them,
-- only subdistricts, towns and townships.
["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Zhongshan"] = {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shanghai"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de
["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: Not to be confused with Cangzhou in Hebei
["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants
["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Beijing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de
["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de
["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de
["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de
["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration
["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de
["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration
["Chongqing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de
["Wuhan"] = {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de
["Tianjin"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de
["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de
-- Changsha County -- 1.024 urban per citypopulation.de
["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration
["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de
["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de
["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de
["Fushun"] = {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration
["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de
["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de
["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de
["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de
["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
-- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria
["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de
-- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core).
["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration
["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de
-- includes Láiwú city
["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de
-- includes Xīnjí city
["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de
["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de
["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de
["Nanning"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de
["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de
["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de
["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de
["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de
["Ürümqi"] = {container = {key = "Xinjiang, จีน", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de
["Urumqi"] = {alias_of = "Ürümqi", display = true},
["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de
["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de
["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de
["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de
["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de
["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de
["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de
["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de
["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures
["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de
["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de
["Hohhot"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de
["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de
["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de
["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de
["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de
["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de
["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de
["Taizhou"] = {alias_of = "Taizhou, Zhejiang"},
["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de
["Yinchuan"] = {container = {key = "Ningxia, จีน", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de
["Liuzhou"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de
["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de
["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de
["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de
["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de
-- includes Dìngzhōu city and Xióngān Xīnqū
["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de
["Baotou"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de
["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de
["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de
["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de
["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de
["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de
["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de
["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de
["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de
["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de
["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de
["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de
["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de
["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de
["Guilin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de
["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de
["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de
["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de
["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de
["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de
["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de
["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de
["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de
["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de
["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de
["Jilin"] = {alias_of = "Jilin City"},
["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de
["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de
["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de
["Yulin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de
["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de
["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de
-- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash
["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de
["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de
["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de
["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de
["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de
["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de
["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de
["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de
["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de
["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de
["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de
["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de
["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de
["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de
["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de
["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de
["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de
["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de
["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de
["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de
["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de
["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de
["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de
["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de
["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de
["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de
-- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper.
["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"ตำบล", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de
["Ulanhad"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de
["Chifeng"] = {alias_of = "Ulanhad"},
["Ulankhad"] = {alias_of = "Ulanhad", display = true},
["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de
["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de
["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de
["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de
-- Shuyang is a "เทศมณฑล" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core).
-- The county itself is 37 miles by 34 miles.
["Shuyang"] = {placetype = "เทศมณฑล", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de
-- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core).
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de
["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de
["Beihai"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de
["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de
["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de
["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de
["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de
["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de
["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de
["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de
["Guigang"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de
-- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core).
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de
-- NOTE: Not to be confused with Changzhou in Jiangsu
["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de
["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de
["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de
["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de
["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de
-- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core).
["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de
-- 3 extra that got added in earlier incarnations and aren't found in the "major agglomerations of the world" page https://citypopulation.de/en/world/agglomerations/ reference date 2025-01-01
["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de
["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de
["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de
}
export.china_prefecture_level_cities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Zhejiang" or "Suzhou, Anhui".
key_to_placename = false,
placename_to_key = false, -- don't add ", จีน" to make the key
default_container = "จีน",
canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "นคร"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"อำเภอ", "ตำบล", "townships",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities,
}
-- Needed to avoid key clashes, since there are two cities called Taizhou and two called Suzhou.
export.china_prefecture_level_cities_2 = {
-- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang.
["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census
["Taizhou"] = {alias_of = "Taizhou, Jiangsu"},
-- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu.
["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census
-- hopefully this will work because we also have Suzhou as a key by itself for the larger, better-known Suzhou in Jiangsu
["Suzhou"] = {alias_of = "Suzhou, Anhui"},
}
export.china_prefecture_level_cities_group_2 = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Jiangsu" or "Suzhou, Anhui".
placename_to_key = false, -- don't add ", จีน" to make the key
default_container = "จีน",
canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "นคร"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"อำเภอ", "ตำบล", "townships",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities_2,
}
export.finland_regions = {
["Lapland, ฟินแลนด์"] = {wp = "%l (%c)"},
["North Ostrobothnia, ฟินแลนด์"] = {},
["Northern Ostrobothnia, ฟินแลนด์"] = {alias_of = "North Ostrobothnia, ฟินแลนด์", display = true},
["Kainuu, ฟินแลนด์"] = {},
["North Karelia, ฟินแลนด์"] = {},
["Northern Savonia, ฟินแลนด์"] = {},
["North Savo, ฟินแลนด์"] = {alias_of = "Northern Savonia, ฟินแลนด์", display = true},
["Southern Savonia, ฟินแลนด์"] = {},
["South Savo, ฟินแลนด์"] = {alias_of = "Southern Savonia, ฟินแลนด์", display = true},
["South Karelia, ฟินแลนด์"] = {},
["Central Finland, ฟินแลนด์"] = {},
["South Ostrobothnia, ฟินแลนด์"] = {},
["Southern Ostrobothnia, ฟินแลนด์"] = {alias_of = "South Ostrobothnia, ฟินแลนด์", display = true},
["Ostrobothnia, ฟินแลนด์"] = {wp = "%l (ภูมิภาค)"},
["Central Ostrobothnia, ฟินแลนด์"] = {},
["Pirkanmaa, ฟินแลนด์"] = {},
["Satakunta, ฟินแลนด์"] = {},
["Päijänne Tavastia, ฟินแลนด์"] = {},
["Päijät-Häme, ฟินแลนด์"] = {alias_of = "Päijänne Tavastia, ฟินแลนด์", display = true},
["Tavastia Proper, ฟินแลนด์"] = {},
["Kanta-Häme, ฟินแลนด์"] = {alias_of = "Tavastia Proper, ฟินแลนด์", display = true},
["Kymenlaakso, ฟินแลนด์"] = {},
["Uusimaa, ฟินแลนด์"] = {},
["Southwest Finland, ฟินแลนด์"] = {},
["Åland Islands, ฟินแลนด์"] = {the = true, wp = "Åland"},
["Åland, ฟินแลนด์"] = {alias_of = "Åland Islands, ฟินแลนด์"}, -- differs in "the"
}
-- regions of Finland
export.finland_group = {
default_container = "ฟินแลนด์",
default_placetype = "ภูมิภาค",
default_divs = "เทศบาล",
data = export.finland_regions,
}
export.france_administrative_regions = {
["Auvergne-Rhône-Alpes, ฝรั่งเศส"] = {},
["Bourgogne-Franche-Comté, ฝรั่งเศส"] = {},
["Brittany, ฝรั่งเศส"] = {wp = "%l (administrative region)"},
["Centre-Val de Loire, ฝรั่งเศส"] = {},
["Corsica, ฝรั่งเศส"] = {},
-- overseas departments are handled in `export.country_like_entities`
-- ["French Guiana"] = {},
["Grand Est, ฝรั่งเศส"] = {},
-- ["Guadeloupe"] = {},
["Hauts-de-France, ฝรั่งเศส"] = {},
["Île-de-France, ฝรั่งเศส"] = {},
-- ["Martinique"] = {},
-- ["Mayotte"] = {},
["Normandy, ฝรั่งเศส"] = {wp = "%l (administrative region)"},
["Nouvelle-Aquitaine, ฝรั่งเศส"] = {},
["Occitania, ฝรั่งเศส"] = {wp = "%l (administrative region)"},
["Occitanie, ฝรั่งเศส"] = {alias_of = "Occitania, ฝรั่งเศส", display = true},
["Pays de la Loire, ฝรั่งเศส"] = {},
["Provence-Alpes-Côte d'Azur, ฝรั่งเศส"] = {},
-- ["Réunion"] = {},
}
-- administrative regions of France
export.france_group = {
default_container = "ฝรั่งเศส",
-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
-- to 'region').
default_placetype = "ภูมิภาค",
default_divs = {
"communes",
{type = "เทศบาล", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
},
data = export.france_administrative_regions,
}
export.france_departments = {
["Ain, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 01
["Aisne, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 02
["Allier, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 03
["Alpes-de-Haute-Provence, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04
["Hautes-Alpes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05
["Alpes-Maritimes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06
["Ardèche, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 07
["Ardennes, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 08
["Ariège, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 09
["Aube, ฝรั่งเศส"] = {container = "Grand Est"}, -- 10
["Aude, ฝรั่งเศส"] = {container = "Occitania"}, -- 11
["Aveyron, ฝรั่งเศส"] = {container = "Occitania"}, -- 12
["Bouches-du-Rhône, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13
["Calvados, ฝรั่งเศส"] = {container = "Normandy", wp = "%l (department)"}, -- 14
["Cantal, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 15
["Charente, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 16
["Charente-Maritime, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 17
["Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18
["Corrèze, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 19
["Corse-du-Sud, ฝรั่งเศส"] = {container = "Corsica"}, -- 2A
["Haute-Corse, ฝรั่งเศส"] = {container = "Corsica"}, -- 2B
["Côte-d'Or, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 21
["Côte d'Or, ฝรั่งเศส"] = {alias_of = "Côte-d'Or, ฝรั่งเศส", display = true},
["Côtes-d'Armor, ฝรั่งเศส"] = {container = "Brittany"}, -- 22
["Côtes d'Armor, ฝรั่งเศส"] = {alias_of = "Côtes-d'Armor, ฝรั่งเศส", display = true},
["Creuse, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 23
["Dordogne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 24
["Doubs, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 25
["Drôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 26
["Eure, ฝรั่งเศส"] = {container = "Normandy"}, -- 27
["Eure-et-Loir, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 28
["Finistère, ฝรั่งเศส"] = {container = "Brittany"}, -- 29
["Gard, ฝรั่งเศส"] = {container = "Occitania"}, -- 30
["Haute-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 31
["Gers, ฝรั่งเศส"] = {container = "Occitania"}, -- 32
["Gironde, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 33
["Hérault, ฝรั่งเศส"] = {container = "Occitania"}, -- 34
["Ille-et-Vilaine, ฝรั่งเศส"] = {container = "Brittany"}, -- 35
["Indre, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 36
["Indre-et-Loire, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 37
["Isère, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 38
["Jura, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39
["Landes, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40
["Loir-et-Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 41
["Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42
["Haute-Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 43
["Loire-Atlantique, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 44
["Loiret, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 45
["Lot, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 46
["Lot-et-Garonne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 47
["Lozère, ฝรั่งเศส"] = {container = "Occitania"}, -- 48
["Maine-et-Loire, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 49
["Manche, ฝรั่งเศส"] = {container = "Normandy"}, -- 50
["Marne, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 51
["Haute-Marne, ฝรั่งเศส"] = {container = "Grand Est"}, -- 52
["Mayenne, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 53
["Meurthe-et-Moselle, ฝรั่งเศส"] = {container = "Grand Est"}, -- 54
["Meuse, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 55
["Morbihan, ฝรั่งเศส"] = {container = "Brittany"}, -- 56
["Moselle, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 57
["Nièvre, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 58
["Nord, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59
["Oise, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 60
["Orne, ฝรั่งเศส"] = {container = "Normandy"}, -- 61
["Pas-de-Calais, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 62
["Puy-de-Dôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 63
["Pyrénées-Atlantiques, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 64
["Hautes-Pyrénées, ฝรั่งเศส"] = {container = "Occitania"}, -- 65
["Pyrénées-Orientales, ฝรั่งเศส"] = {container = "Occitania"}, -- 66
["Bas-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 67
["Haut-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 68
["Rhône, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D
["Metropolis of Lyon, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M
["Lyon Metropolis, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"},
["Lyon, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"},
["Haute-Saône, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 70
["Saône-et-Loire, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 71
["Sarthe, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 72
["Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 73
["Haute-Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 74
["Paris, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 75
["Seine-Maritime, ฝรั่งเศส"] = {container = "Normandy"}, -- 76
["Seine-et-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 77
["Yvelines, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 78
["Deux-Sèvres, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 79
["Somme, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80
["Tarn, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 81
["Tarn-et-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 82
["Var, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83
["Vaucluse, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84
["Vendée, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 85
["Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86
["Haute-Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 87
["Vosges, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 88
["Yonne, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 89
["Territoire de Belfort, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 90
["Essonne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 91
["Hauts-de-Seine, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 92
["Seine-Saint-Denis, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 93
["Val-de-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 94
["Val-d'Oise, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 95
--["Guadeloupe"] = {container = "Guadeloupe"}, -- 971
--["Martinique"] = {container = "Martinique"}, -- 972
--["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973
--["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974
--["Mayotte"] = {container = "Mayotte"}, -- 976
}
export.france_departments_group = {
placename_to_key = make_placename_to_key(", ฝรั่งเศส"),
canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"),
default_placetype = "department",
default_divs = {
"communes",
{type = "เทศบาล", cat_as = "communes"},
},
data = export.france_departments,
}
export.germany_states = {
["Baden-Württemberg, เยอรมนี"] = {},
["Bavaria, เยอรมนี"] = {},
-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
-- the default_divs setting. Better not to include them at all since they're included as cities down below.
-- ["Berlin"] = {divs = {}},
["Brandenburg, เยอรมนี"] = {},
-- ["Bremen"] = {divs = {}},
-- ["Hamburg"] = {divs = {}},
["Hesse, เยอรมนี"] = {},
["Lower Saxony, เยอรมนี"] = {},
["Mecklenburg-Vorpommern, เยอรมนี"] = {},
["Mecklenburg-Western Pomerania, เยอรมนี"] = {alias_of = "Mecklenburg-Vorpommern, เยอรมนี", display = true},
["North Rhine-Westphalia, เยอรมนี"] = {},
["Rhineland-Palatinate, เยอรมนี"] = {},
["Saarland, เยอรมนี"] = {},
["Saxony, เยอรมนี"] = {},
["Saxony-Anhalt, เยอรมนี"] = {},
["Schleswig-Holstein, เยอรมนี"] = {},
["Thuringia, เยอรมนี"] = {},
}
-- states of Germany
export.germany_group = {
default_container = "เยอรมนี",
default_placetype = "รัฐ",
default_divs = {"อำเภอ", "เทศบาล"},
data = export.germany_states,
}
export.greece_regions = {
["Attica, กรีซ"] = {wp = "%l (ภูมิภาค)"},
["Central Greece, กรีซ"] = {wp = "%l (administrative region)"},
["Central Macedonia, กรีซ"] = {},
["Crete, กรีซ"] = {},
["Eastern Macedonia and Thrace, กรีซ"] = {},
["Epirus, กรีซ"] = {wp = "%l (ภูมิภาค)"},
["Ionian Islands, กรีซ"] = {the = true, wp = "%l (ภูมิภาค)"},
["North Aegean, กรีซ"] = {the = true},
-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
-- has "the" in it.
["Peloponnese, กรีซ"] = {wp = "%l (ภูมิภาค)"},
["South Aegean, กรีซ"] = {the = true},
["Thessaly, กรีซ"] = {},
["Western Greece, กรีซ"] = {},
["Western Macedonia, กรีซ"] = {},
["Mount Athos, กรีซ"] = {placetype = {"autonomous region", "ภูมิภาค"}, wp = "Monastic community of Mount Athos"},
}
-- regions of Greece
export.greece_group = {
default_container = "กรีซ",
default_placetype = "ภูมิภาค",
data = export.greece_regions,
}
local india_polity_with_divisions = {"divisions", "อำเภอ"}
local india_polity_without_divisions = {"อำเภอ"}
-- States and union territories of India. Only some of them are divided into divisions.
export.india_states_and_union_territories = {
["Andaman and Nicobar Islands, อินเดีย"] =
{the = true, placetype = "union territory", divs = india_polity_without_divisions},
["Andhra Pradesh, อินเดีย"] = {divs = india_polity_without_divisions},
["Arunachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Assam, อินเดีย"] = {divs = india_polity_with_divisions},
["Bihar, อินเดีย"] = {divs = india_polity_with_divisions},
["Chandigarh, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Chhattisgarh, อินเดีย"] = {divs = india_polity_with_divisions},
["Dadra and Nagar Haveli and Daman and Diu, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Delhi, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Goa, อินเดีย"] = {divs = india_polity_without_divisions},
["Gujarat, อินเดีย"] = {divs = india_polity_without_divisions},
["Haryana, อินเดีย"] = {divs = india_polity_with_divisions},
["Himachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Jammu and Kashmir, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions,
wp = "%l (union territory)"},
["Jharkhand, อินเดีย"] = {divs = india_polity_with_divisions},
["Karnataka, อินเดีย"] = {divs = india_polity_with_divisions},
["Kerala, อินเดีย"] = {divs = india_polity_without_divisions},
["Ladakh, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Lakshadweep, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Madhya Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Maharashtra, อินเดีย"] = {divs = india_polity_with_divisions},
["Manipur, อินเดีย"] = {divs = india_polity_without_divisions},
["Meghalaya, อินเดีย"] = {divs = india_polity_with_divisions},
["Mizoram, อินเดีย"] = {divs = india_polity_without_divisions},
["Nagaland, อินเดีย"] = {divs = india_polity_with_divisions},
["Odisha, อินเดีย"] = {divs = india_polity_with_divisions},
["Puducherry, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions,
wp = "%l (union territory)"},
["Pondicherry, อินเดีย"] = {alias_of = "Puducherry, อินเดีย", display = true},
["Punjab, อินเดีย"] = {divs = india_polity_with_divisions, wp = "%l, %c"},
["Rajasthan, อินเดีย"] = {divs = india_polity_with_divisions},
["Sikkim, อินเดีย"] = {divs = india_polity_without_divisions},
["Tamil Nadu, อินเดีย"] = {divs = india_polity_without_divisions},
["Telangana, อินเดีย"] = {divs = india_polity_without_divisions},
["Tripura, อินเดีย"] = {divs = india_polity_without_divisions},
["Uttar Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Uttarakhand, อินเดีย"] = {divs = india_polity_with_divisions},
["West Bengal, อินเดีย"] = {divs = india_polity_with_divisions},
}
-- states and union territories of India
export.india_group = {
default_container = "อินเดีย",
default_placetype = "รัฐ",
data = export.india_states_and_union_territories,
}
export.indonesia_provinces = {
["Aceh, อินโดนีเซีย"] = {},
["Bali, อินโดนีเซีย"] = {},
["Bangka Belitung Islands, อินโดนีเซีย"] = {the = true},
["Banten, อินโดนีเซีย"] = {},
["Bengkulu, อินโดนีเซีย"] = {},
["Central Java, อินโดนีเซีย"] = {},
["Central Kalimantan, อินโดนีเซีย"] = {},
["Central Papua, อินโดนีเซีย"] = {},
["Central Sulawesi, อินโดนีเซีย"] = {},
["East Java, อินโดนีเซีย"] = {},
["East Kalimantan, อินโดนีเซีย"] = {},
["East Nusa Tenggara, อินโดนีเซีย"] = {},
["Gorontalo, อินโดนีเซีย"] = {},
["Highland Papua, อินโดนีเซีย"] = {wp = "%l"},
["Special Capital Region of Jakarta, อินโดนีเซีย"] = {the = true, wp = "Jakarta"},
["Jakarta, อินโดนีเซีย"] = {alias_of = "Special Capital Region of Jakarta, อินโดนีเซีย"},
["Jambi, อินโดนีเซีย"] = {},
["Lampung, อินโดนีเซีย"] = {},
["Maluku, อินโดนีเซีย"] = {},
["North Kalimantan, อินโดนีเซีย"] = {},
["North Maluku, อินโดนีเซีย"] = {},
["North Sulawesi, อินโดนีเซีย"] = {},
["North Papua, อินโดนีเซีย"] = {},
["North Sumatra, อินโดนีเซีย"] = {},
["Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
["Riau, อินโดนีเซีย"] = {},
["Riau Islands, อินโดนีเซีย"] = {the = true},
["Southeast Sulawesi, อินโดนีเซีย"] = {},
["South Kalimantan, อินโดนีเซีย"] = {},
["South Papua, อินโดนีเซีย"] = {},
["South Sulawesi, อินโดนีเซีย"] = {},
["South Sumatra, อินโดนีเซีย"] = {},
["Southwest Papua, อินโดนีเซีย"] = {},
["West Java, อินโดนีเซีย"] = {},
["West Kalimantan, อินโดนีเซีย"] = {},
["West Nusa Tenggara, อินโดนีเซีย"] = {},
["West Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
["West Sulawesi, อินโดนีเซีย"] = {},
["West Sumatra, อินโดนีเซีย"] = {},
["Special Region of Yogyakarta, อินโดนีเซีย"] = {the = true},
["Yogyakarta, อินโดนีเซีย"] = {alias_of = "Special Region of Yogyakarta, อินโดนีเซีย"},
}
-- provinces of Indonesia
export.indonesia_group = {
default_container = "อินโดนีเซีย",
default_placetype = "จังหวัด",
-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
-- spellings.
data = export.indonesia_provinces,
}
export.iran_provinces = {
["Alborz, อิหร่าน"] = {}, -- abbreviation AL, capital [[w:Karaj]]
["Ardabil, อิหร่าน"] = {}, -- abbreviation AR, capital [[w:Ardabil]]
["Bushehr, อิหร่าน"] = {}, -- abbreviation BU, capital [[w:Bushehr]]
["Chaharmahal and Bakhtiari, อิหร่าน"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]]
["East Azerbaijan, อิหร่าน"] = {}, -- abbreviation EA, capital [[w:Tabriz]]
["Fars, อิหร่าน"] = {}, -- abbreviation FA, capital [[w:Shiraz]]
["Pars, อิหร่าน"] = {alias_of = "Fars, อิหร่าน", display = true},
["Gilan, อิหร่าน"] = {}, -- abbreviation GN, capital [[w:Rasht]]
["Golestan, อิหร่าน"] = {}, -- abbreviation GO, capital [[w:Gorgan]]
["Hamadan, อิหร่าน"] = {}, -- abbreviation HA, capital [[w:Hamadan]]
["Hormozgan, อิหร่าน"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]]
["Ilam, อิหร่าน"] = {}, -- abbreviation IL, capital [[w:Ilam, Iran|Ilam]]
["Isfahan, อิหร่าน"] = {}, -- abbreviation IS, capital [[w:Isfahan]]
["Kerman, อิหร่าน"] = {}, -- abbreviation KN, capital [[w:Kerman]]
["Kermanshah, อิหร่าน"] = {}, -- abbreviation KE, capital [[w:Kermanshah]]
["Khuzestan, อิหร่าน"] = {}, -- abbreviation KH, capital [[w:Ahvaz]]
["Kohgiluyeh and Boyer-Ahmad, อิหร่าน"] = {}, -- abbreviation KB, capital [[w:Yasuj]]
["Kurdistan, อิหร่าน"] = {}, -- abbreviation KU, capital [[w:Sanandaj]]
["Lorestan, อิหร่าน"] = {}, -- abbreviation LO, capital [[w:Khorramabad]]
["Markazi, อิหร่าน"] = {}, -- abbreviation MA, capital [[w:Arak, Iran|Arak]]
["Mazandaran, อิหร่าน"] = {}, -- abbreviation MN, capital [[w:Sari, Iran|Sari]]
["North Khorasan, อิหร่าน"] = {}, -- abbreviation NK, capital [[w:Bojnord]]
["Qazvin, อิหร่าน"] = {}, -- abbreviation QA, capital [[w:Qazvin]]
["Qom, อิหร่าน"] = {}, -- abbreviation QM, capital [[w:Qom]]
["Razavi Khorasan, อิหร่าน"] = {}, -- abbreviation RK, capital [[w:Mashhad]]
["Semnan, อิหร่าน"] = {}, -- abbreviation SE, capital [[w:Semnan, Iran|Semnan]]
["Sistan and Baluchestan, อิหร่าน"] = {}, -- abbreviation SB, capital [[w:Zahedan]]
["South Khorasan, อิหร่าน"] = {}, -- abbreviation SK, capital [[w:Birjand]]
["Tehran, อิหร่าน"] = {}, -- abbreviation TE, capital [[w:Tehran]]
["West Azerbaijan, อิหร่าน"] = {}, -- abbreviation WA, capital [[w:Urmia]]
["Yazd, อิหร่าน"] = {}, -- abbreviation YA, capital [[w:Yazd]]
["Zanjan, อิหร่าน"] = {}, -- abbreviation ZA, capital [[w:Zanjan, Iran|Zanjan]]
}
-- provinces of Iran
export.iran_group = {
key_to_placename = make_key_to_placename(", อิหร่าน$"),
placename_to_key = make_placename_to_key(", อิหร่าน"),
default_container = "อิหร่าน",
default_placetype = "จังหวัด",
-- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them
-- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]],
-- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].)
-- default_divs = "เทศมณฑล",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "จังหวัด%e",
data = export.iran_provinces,
}
export.ireland_counties = {
["County Carlow, ไอร์แลนด์"] = {},
["County Cavan, ไอร์แลนด์"] = {},
["County Clare, ไอร์แลนด์"] = {},
["County Cork, ไอร์แลนด์"] = {},
["County Donegal, ไอร์แลนด์"] = {},
["County Dublin, ไอร์แลนด์"] = {},
["County Galway, ไอร์แลนด์"] = {},
["County Kerry, ไอร์แลนด์"] = {},
["County Kildare, ไอร์แลนด์"] = {},
["County Kilkenny, ไอร์แลนด์"] = {},
["County Laois, ไอร์แลนด์"] = {},
["County Leitrim, ไอร์แลนด์"] = {},
["County Limerick, ไอร์แลนด์"] = {},
["County Longford, ไอร์แลนด์"] = {},
["County Louth, ไอร์แลนด์"] = {},
["County Mayo, ไอร์แลนด์"] = {},
["County Meath, ไอร์แลนด์"] = {},
["County Monaghan, ไอร์แลนด์"] = {},
["County Offaly, ไอร์แลนด์"] = {},
["County Roscommon, ไอร์แลนด์"] = {},
["County Sligo, ไอร์แลนด์"] = {},
["County Tipperary, ไอร์แลนด์"] = {},
["County Waterford, ไอร์แลนด์"] = {},
["County Westmeath, ไอร์แลนด์"] = {},
["County Wexford, ไอร์แลนด์"] = {},
["County Wicklow, ไอร์แลนด์"] = {},
}
local function make_irish_type_key_to_placename(container_pattern)
	return function(key)
		key = key:gsub(container_pattern, "")
		local elliptical_key = key:gsub("^County ", "")
		return key, elliptical_key
	end
end
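-- Illustrative usage (a sketch read off the function above, not additional module logic): given the container
-- pattern ", ไอร์แลนด์$", the returned function strips the container and yields both the full and the elliptical
-- placename:
--   make_irish_type_key_to_placename(", ไอร์แลนด์$")("County Cork, ไอร์แลนด์")
--   --> "County Cork", "Cork"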
local function make_irish_type_placename_to_key(container_suffix)
	return function(placename)
		if not placename:find("^County ") and not placename:find("^City ") then
			placename = "County " .. placename
		end
		return placename .. container_suffix
	end
end
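-- Illustrative usage (a sketch read off the function above): a bare county name gets the "County " prefix, while
-- a name that already begins with "County " or "City " is kept as given before the container suffix is appended:
--   make_irish_type_placename_to_key(", ไอร์แลนด์")("Cork")        --> "County Cork, ไอร์แลนด์"
--   make_irish_type_placename_to_key(", ไอร์แลนด์")("County Cork") --> "County Cork, ไอร์แลนด์"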
-- counties of Ireland
export.ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", ไอร์แลนด์$"),
placename_to_key = make_irish_type_placename_to_key(", ไอร์แลนด์"),
default_container = "ไอร์แลนด์",
default_placetype = "เทศมณฑล",
data = export.ireland_counties,
}
export.italy_administrative_regions = {
["Abruzzo, Italy"] = {},
["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Apulia, Italy"] = {},
["Basilicata, Italy"] = {},
["Calabria, Italy"] = {},
["Campania, Italy"] = {},
["Emilia-Romagna, Italy"] = {},
["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Lazio, Italy"] = {},
["Liguria, Italy"] = {},
["Lombardy, Italy"] = {},
["Marche, Italy"] = {},
["Molise, Italy"] = {},
["Piedmont, Italy"] = {},
["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Tuscany, Italy"] = {},
["Umbria, Italy"] = {},
["Veneto, Italy"] = {},
}
-- administrative regions of Italy
export.italy_group = {
default_container = "อิตาลี",
default_placetype = "ภูมิภาค",
data = export.italy_administrative_regions,
}
-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = {
["ไอจิ, ญี่ปุ่น"] = {},
["อากิตะ, ญี่ปุ่น"] = {},
["อาโอโมริ, ญี่ปุ่น"] = {},
["จิบะ, ญี่ปุ่น"] = {},
["เอฮิเมะ, ญี่ปุ่น"] = {},
["ฟูกูอิ, ญี่ปุ่น"] = {},
["ฟูกูโอกะ, ญี่ปุ่น"] = {},
["ฟูกูชิมะ, ญี่ปุ่น"] = {},
["กิฟุ, ญี่ปุ่น"] = {},
["กุมมะ, ญี่ปุ่น"] = {},
["ฮิโรชิมะ, ญี่ปุ่น"] = {},
["ฮกไกโด, ญี่ปุ่น"] = {divs = "กิ่งจังหวัด", wp = "ฮกไกโด"},
["เฮียวโงะ, ญี่ปุ่น"] = {},
--["Hyogo, ญี่ปุ่น"] = {alias_of = "เฮียวโงะ, ญี่ปุ่น", display = true},
["อิบารากิ, ญี่ปุ่น"] = {},
["อิชิกาวะ, ญี่ปุ่น"] = {},
["อิวาเตะ, ญี่ปุ่น"] = {},
["คางาวะ, ญี่ปุ่น"] = {},
["คาโงชิมะ, ญี่ปุ่น"] = {},
["คานางาวะ, ญี่ปุ่น"] = {},
["โคจิ, ญี่ปุ่น"] = {},
--["Kochi, ญี่ปุ่น"] = {alias_of = "โคจิ, ญี่ปุ่น", display = true},
["คูมาโมโตะ, ญี่ปุ่น"] = {},
["เกียวโต, ญี่ปุ่น"] = {},
["มิเอะ, ญี่ปุ่น"] = {},
["มิยางิ, ญี่ปุ่น"] = {},
["มิยาซากิ, ญี่ปุ่น"] = {},
["นางาโนะ, ญี่ปุ่น"] = {},
["นางาซากิ, ญี่ปุ่น"] = {},
["นาระ, ญี่ปุ่น"] = {},
["นีงาตะ, ญี่ปุ่น"] = {},
["โออิตะ, ญี่ปุ่น"] = {},
--["Oita, ญี่ปุ่น"] = {alias_of = "โออิตะ, ญี่ปุ่น", display = true},
["โอกายามะ, ญี่ปุ่น"] = {},
["โอกินาวะ, ญี่ปุ่น"] = {},
["โอซากะ, ญี่ปุ่น"] = {},
["ซางะ, ญี่ปุ่น"] = {},
["ไซตามะ, ญี่ปุ่น"] = {},
["ชิงะ, ญี่ปุ่น"] = {},
["ชิมาเนะ, ญี่ปุ่น"] = {},
["ชิซูโอกะ, ญี่ปุ่น"] = {},
["โทจิงิ, ญี่ปุ่น"] = {},
["โทกูชิมะ, ญี่ปุ่น"] = {},
["ทตโตริ, ญี่ปุ่น"] = {},
["โทยามะ, ญี่ปุ่น"] = {},
["วากายามะ, ญี่ปุ่น"] = {},
["ยามางาตะ, ญี่ปุ่น"] = {},
["ยามางูจิ, ญี่ปุ่น"] = {},
["ยามานาชิ, ญี่ปุ่น"] = {},
}
-- prefectures of Japan
export.japan_group = {
key_to_placename = make_key_to_placename(", ญี่ปุ่น$"),
placename_to_key = make_placename_to_key(", ญี่ปุ่น"),
default_container = "ญี่ปุ่น",
default_placetype = "จังหวัด",
default_wp = "จังหวัด%e",
data = export.japan_prefectures,
}
export.laos_provinces = {
["Attapeu Province, Laos"] = {},
["Bokeo Province, Laos"] = {},
["Bolikhamxai Province, Laos"] = {},
["Champasak Province, Laos"] = {},
["Houaphanh Province, Laos"] = {},
["Khammouane Province, Laos"] = {},
["Luang Namtha Province, Laos"] = {},
["Luang Prabang Province, Laos"] = {},
["Oudomxay Province, Laos"] = {},
["Phongsaly Province, Laos"] = {},
["Salavan Province, Laos"] = {},
["Savannakhet Province, Laos"] = {},
["Vientiane Province, Laos"] = {},
["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
["Sainyabuli Province, Laos"] = {},
["Sekong Province, Laos"] = {},
["Xaisomboun Province, Laos"] = {},
["Xiangkhouang Province, Laos"] = {},
}
local function laos_placename_to_key(placename)
	if placename == "Vientiane Prefecture" then
		return placename .. ", Laos"
	end
	if placename:find(" Province$") then
		return placename .. ", Laos"
	end
	return placename .. " Province, Laos"
end
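-- Illustrative behavior (a sketch read off the function above): the prefecture and names already ending in
-- " Province" only receive the country suffix; any other name is assumed to be a bare province name:
--   laos_placename_to_key("Champasak")            --> "Champasak Province, Laos"
--   laos_placename_to_key("Champasak Province")   --> "Champasak Province, Laos"
--   laos_placename_to_key("Vientiane Prefecture") --> "Vientiane Prefecture, Laos"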
-- provinces of Laos
export.laos_group = {
key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
placename_to_key = laos_placename_to_key,
default_container = "Laos",
default_placetype = "จังหวัด",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.laos_provinces,
}
export.lebanon_governorates = {
["Akkar Governorate, Lebanon"] = {},
["Baalbek-Hermel Governorate, Lebanon"] = {},
["Beirut Governorate, Lebanon"] = {},
["Beqaa Governorate, Lebanon"] = {},
["Keserwan-Jbeil Governorate, Lebanon"] = {},
["Mount Lebanon Governorate, Lebanon"] = {},
["Nabatieh Governorate, Lebanon"] = {},
-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
-- `gov/South Governorate` with `c/Lebanon`.
["North Governorate, Lebanon"] = {no_auto_augment_container = true},
["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}
-- governorates of Lebanon
export.lebanon_group = {
key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
default_container = "Lebanon",
default_placetype = "governorate",
data = export.lebanon_governorates,
}
export.malaysia_states = {
["Johor, Malaysia"] = {},
["Kedah, Malaysia"] = {},
["Kelantan, Malaysia"] = {},
["Malacca, Malaysia"] = {},
["Negeri Sembilan, Malaysia"] = {},
["Pahang, Malaysia"] = {},
["Penang, Malaysia"] = {},
["Perak, Malaysia"] = {},
["Perlis, Malaysia"] = {},
["Sabah, Malaysia"] = {},
["Sarawak, Malaysia"] = {},
["Selangor, Malaysia"] = {},
["Terengganu, Malaysia"] = {},
}
-- states of Malaysia
export.malaysia_group = {
default_container = "Malaysia",
default_placetype = "รัฐ",
default_wp = "%l, %c",
data = export.malaysia_states,
}
export.malta_regions = {
-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
-- `r/Northern Region` with `c/Malta`. In particular:
-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
-- El Salvador;
-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
-- Serbia and Uganda;
-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
["Eastern Region, Malta"] = {no_auto_augment_container = true},
["Gozo Region, Malta"] = {wp = "%l"},
["Northern Region, Malta"] = {no_auto_augment_container = true},
["Port Region, Malta"] = {},
["Southern Region, Malta"] = {no_auto_augment_container = true},
["Western Region, Malta"] = {no_auto_augment_container = true},
}
-- regions of Malta
export.malta_group = {
key_to_placename = make_key_to_placename(", Malta$", " Region"),
placename_to_key = make_placename_to_key(", Malta", " Region"),
default_container = "Malta",
default_placetype = "ภูมิภาค",
default_wp = "%l, %c",
default_the = true,
data = export.malta_regions,
}
export.mexico_states = {
["Aguascalientes, Mexico"] = {},
["Baja California, Mexico"] = {},
-- not display-canonicalizing because the "Norte" could be for emphasis
["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"},
["Baja California Sur, Mexico"] = {},
["Campeche, Mexico"] = {},
["Chiapas, Mexico"] = {},
["Chihuahua, Mexico"] = {wp = "%l (รัฐ)"},
["Coahuila, Mexico"] = {},
["Colima, Mexico"] = {},
["Durango, Mexico"] = {},
["Guanajuato, Mexico"] = {},
["Guerrero, Mexico"] = {},
["Hidalgo, Mexico"] = {wp = "%l (รัฐ)"},
["Jalisco, Mexico"] = {},
["State of Mexico, Mexico"] = {the = true},
["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
-- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city
["Michoacán, Mexico"] = {},
["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true},
["Morelos, Mexico"] = {},
["Nayarit, Mexico"] = {},
["Nuevo León, Mexico"] = {},
["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true},
["Oaxaca, Mexico"] = {},
["Puebla, Mexico"] = {},
["Querétaro, Mexico"] = {},
["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true},
["Quintana Roo, Mexico"] = {},
["San Luis Potosí, Mexico"] = {},
["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true},
["Sinaloa, Mexico"] = {},
["Sonora, Mexico"] = {},
["Tabasco, Mexico"] = {},
["Tamaulipas, Mexico"] = {},
["Tlaxcala, Mexico"] = {},
["Veracruz, Mexico"] = {},
["Yucatán, Mexico"] = {},
["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true},
["Zacatecas, Mexico"] = {},
}
-- Mexican states
export.mexico_group = {
default_container = "Mexico",
default_placetype = "รัฐ",
data = export.mexico_states,
}
export.moldova_districts_and_autonomous_territorial_units = {
["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]]
["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]]
["Briceni District, Moldova"] = {}, -- capital [[Briceni]]
["Cahul District, Moldova"] = {}, -- capital [[Cahul]]
["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]]
["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]]
["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]]
["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]]
["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]]
["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]]
["Drochia District, Moldova"] = {}, -- capital [[Drochia]]
["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]]
["Edineț District, Moldova"] = {}, -- capital [[Edineț]]
["Fălești District, Moldova"] = {}, -- capital [[Fălești]]
["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]]
["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]]
["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]]
["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]]
["Leova District, Moldova"] = {}, -- capital [[Leova]]
["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]]
["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]]
["Orhei District, Moldova"] = {}, -- capital [[Orhei]]
["Rezina District, Moldova"] = {}, -- capital [[Rezina]]
["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]]
["Sîngerei District, Moldova"] = {}, -- capital [[Sîngerei]]
["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
["Chișinău, Moldova"] = {placetype = "เทศบาล"},
["Bălți, Moldova"] = {placetype = "เทศบาล"},
["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Comrat]]
-- the remainder are under the de-facto control of the unrecognized state of Transnistria
["Bender, Moldova"] = {placetype = "เทศบาล"},
["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Tiraspol]]
["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}
local function moldova_placename_to_key(placename)
	local elliptical_key = placename .. ", Moldova"
	if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
		return elliptical_key
	end
	if placename:find(" District$") then
		return placename .. ", Moldova"
	end
	return placename .. " District, Moldova"
end
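-- Illustrative behavior (a sketch read off the function and table above): a name whose "<name>, Moldova" key
-- exists in the table is returned directly; otherwise " District" is appended unless already present:
--   moldova_placename_to_key("Chișinău")       --> "Chișinău, Moldova"
--   moldova_placename_to_key("Cahul")          --> "Cahul District, Moldova"
--   moldova_placename_to_key("Cahul District") --> "Cahul District, Moldova"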
-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
key_to_placename = make_key_to_placename(", Moldova$", " District"),
placename_to_key = moldova_placename_to_key,
default_container = "Moldova",
default_placetype = {"district", "raion"},
default_divs = "communes",
data = export.moldova_districts_and_autonomous_territorial_units,
}
export.morocco_regions = {
["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
["Oriental, Morocco"] = {wp = "%l (%c)"},
["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
["Fez-Meknes, Morocco"] = {},
["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
["Casablanca-Settat, Morocco"] = {},
["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
["Souss-Massa, Morocco"] = {},
["Guelmim-Oued Noun, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]"
},
["Laayoune-Sakia El Hamra, Morocco"] = {
wp = "Laâyoune-Sakia El Hamra",
keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
},
["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
["Dakhla-Oued Ed-Dahab, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]",
},
}
-- regions of Morocco
export.morocco_group = {
default_container = "Morocco",
default_placetype = "ภูมิภาค",
data = export.morocco_regions,
}
export.egypt_governorates = {
["Cairo Governorate, Egypt"] = {},
["Giza Governorate, Egypt"] = {},
["Sharqia Governorate, Egypt"] = {},
["Dakahlia Governorate, Egypt"] = {},
["Beheira Governorate, Egypt"] = {},
["Minya Governorate, Egypt"] = {},
["Qalyubia Governorate, Egypt"] = {},
["Sohag Governorate, Egypt"] = {},
["Alexandria Governorate, Egypt"] = {},
["Gharbia Governorate, Egypt"] = {},
["Asyut Governorate, Egypt"] = {},
["Monufia Governorate, Egypt"] = {},
["Faiyum Governorate, Egypt"] = {},
["Kafr El Sheikh Governorate, Egypt"] = {},
["Qena Governorate, Egypt"] = {},
["Beni Suef Governorate, Egypt"] = {},
["Damietta Governorate, Egypt"] = {},
["Aswan Governorate, Egypt"] = {},
["Ismailia Governorate, Egypt"] = {},
["Luxor Governorate, Egypt"] = {},
["Suez Governorate, Egypt"] = {},
["Port Said Governorate, Egypt"] = {},
["Matrouh Governorate, Egypt"] = {},
["North Sinai Governorate, Egypt"] = {},
["Red Sea Governorate, Egypt"] = {},
["New Valley Governorate, Egypt"] = {},
["South Sinai Governorate, Egypt"] = {},
}
-- governorates of Egypt
export.egypt_group = {
key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"),
placename_to_key = make_placename_to_key(", Egypt", " Governorate"),
default_container = "อียิปต์",
default_placetype = "governorate",
data = export.egypt_governorates,
}
export.netherlands_provinces = {
["Drenthe, Netherlands"] = {},
["Flevoland, Netherlands"] = {},
["Friesland, Netherlands"] = {},
["Gelderland, Netherlands"] = {},
["Groningen, Netherlands"] = {wp = "%l (จังหวัด)"},
["Limburg, Netherlands"] = {wp = "%l (%c)"},
["North Brabant, Netherlands"] = {},
-- Foreign forms get display-canonicalized.
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true},
["North Holland, Netherlands"] = {},
["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true},
["Overijssel, Netherlands"] = {},
["South Holland, Netherlands"] = {},
["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true},
["Utrecht, Netherlands"] = {wp = "%l (จังหวัด)"},
["Zeeland, Netherlands"] = {},
}
-- provinces of the Netherlands
export.netherlands_group = {
default_container = "เนเธอร์แลนด์",
default_placetype = "จังหวัด",
default_divs = "เทศบาล",
data = export.netherlands_provinces,
}
export.new_zealand_regions = {
-- North Island regions
["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]]
["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]]
["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]]
["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]]
["Gisborne, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]]
["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]]
["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]]
["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]]
["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]]
-- South Island regions
["Tasman, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]]
["Nelson, New Zealand"] = {placetype = {"ภูมิภาค", "นคร"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]]
["Marlborough, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]]
["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]]
["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]]
["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]]
["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]]
}
-- regions of New Zealand
export.new_zealand_group = {
default_container = "New Zealand",
default_placetype = "ภูมิภาค",
data = export.new_zealand_regions,
}
export.nigeria_states = {
["Abia State, Nigeria"] = {},
["Adamawa State, Nigeria"] = {},
["Akwa Ibom State, Nigeria"] = {},
["Anambra State, Nigeria"] = {},
["Bauchi State, Nigeria"] = {},
["Bayelsa State, Nigeria"] = {},
["Benue State, Nigeria"] = {},
["Borno State, Nigeria"] = {},
["Cross River State, Nigeria"] = {},
["Delta State, Nigeria"] = {},
["Ebonyi State, Nigeria"] = {},
["Edo State, Nigeria"] = {},
["Ekiti State, Nigeria"] = {},
["Enugu State, Nigeria"] = {},
["Federal Capital Territory, Nigeria"] = {
-- not a state but allow it to be referenced as one in holonyms
placetype = {"federal territory", "ดินแดน", "รัฐ"}, the = true, wp = "%l (%c)",
},
["Gombe State, Nigeria"] = {},
["Imo State, Nigeria"] = {},
["Jigawa State, Nigeria"] = {},
["Kaduna State, Nigeria"] = {},
["Kano State, Nigeria"] = {},
["Katsina State, Nigeria"] = {},
["Kebbi State, Nigeria"] = {},
["Kogi State, Nigeria"] = {},
["Kwara State, Nigeria"] = {},
["Lagos State, Nigeria"] = {},
["Nasarawa State, Nigeria"] = {},
["Niger State, Nigeria"] = {},
["Ogun State, Nigeria"] = {},
["Ondo State, Nigeria"] = {},
["Osun State, Nigeria"] = {},
["Oyo State, Nigeria"] = {},
["Plateau State, Nigeria"] = {},
["Rivers State, Nigeria"] = {},
["Sokoto State, Nigeria"] = {},
["Taraba State, Nigeria"] = {},
["Yobe State, Nigeria"] = {},
["Zamfara State, Nigeria"] = {},
}
-- states of Nigeria
export.nigeria_group = {
key_to_placename = make_key_to_placename(", Nigeria$", " State$"),
placename_to_key = make_placename_to_key(", Nigeria", " State"),
default_container = "Nigeria",
default_placetype = "รัฐ",
data = export.nigeria_states,
}
export.north_korea_provinces = {
["Chagang Province, North Korea"] = {},
["North Hamgyong Province, North Korea"] = {},
["South Hamgyong Province, North Korea"] = {},
["North Hwanghae Province, North Korea"] = {},
["South Hwanghae Province, North Korea"] = {},
["Kangwon Province, North Korea"] = {wp = "%l (%c)"},
["North Pyongan Province, North Korea"] = {},
["South Pyongan Province, North Korea"] = {},
["Ryanggang Province, North Korea"] = {},
}
-- provinces of North Korea
export.north_korea_group = {
key_to_placename = make_key_to_placename(", North Korea$", " Province$"),
placename_to_key = make_placename_to_key(", North Korea", " Province"),
default_container = "North Korea",
default_placetype = "จังหวัด",
data = export.north_korea_provinces,
}
export.norwegian_counties = {
["Oslo, Norway"] = {},
["Rogaland, Norway"] = {},
["Møre og Romsdal, Norway"] = {},
["Nordland, Norway"] = {},
["Østfold, Norway"] = {},
["Akershus, Norway"] = {},
["Buskerud, Norway"] = {},
-- the following two were merged into Innlandet
-- ["Hedmark, Norway"] = {},
-- ["Oppland, Norway"] = {},
["Innlandet, Norway"] = {},
["Vestfold, Norway"] = {},
["Telemark, Norway"] = {},
-- the following two were merged into Agder
-- ["Aust-Agder, Norway"] = {},
-- ["Vest-Agder, Norway"] = {},
["Agder, Norway"] = {},
-- the following two were merged into Vestland
-- ["Hordaland, Norway"] = {},
-- ["Sogn og Fjordane, Norway"] = {},
["Vestland, Norway"] = {},
["Trøndelag, Norway"] = {},
["Troms, Norway"] = {},
["Finnmark, Norway"] = {},
}
-- counties of Norway
export.norway_group = {
default_container = "Norway",
default_placetype = "เทศมณฑล",
data = export.norwegian_counties,
}
export.pakistan_provinces_and_territories = {
["Azad Kashmir, Pakistan"] = {
placetype = {"administrative territory", "autonomous territory", "ดินแดน"},
},
["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true},
["Balochistan, Pakistan"] = {wp = "%l, %c"},
["Gilgit-Baltistan, Pakistan"] = {
placetype = {"administrative territory", "ดินแดน"},
},
["Islamabad Capital Territory, Pakistan"] = {
the = true,
divs = {}, -- no divisions
placetype = {"federal territory", "administrative territory", "ดินแดน"},
},
-- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes
["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"},
["Khyber Pakhtunkhwa, Pakistan"] = {},
["Punjab, Pakistan"] = {wp = "%l, %c"},
["Sindh, Pakistan"] = {},
}
-- provinces and territories of Pakistan
export.pakistan_group = {
default_container = "Pakistan",
default_placetype = "จังหวัด",
default_divs = "divisions",
data = export.pakistan_provinces_and_territories,
}
export.philippines_provinces = {
["Abra, Philippines"] = {wp = "%l (จังหวัด)"},
["Agusan del Norte, Philippines"] = {},
["Agusan del Sur, Philippines"] = {},
["Aklan, Philippines"] = {},
["Albay, Philippines"] = {},
["Antique, Philippines"] = {wp = "%l (จังหวัด)"},
["Apayao, Philippines"] = {},
["Aurora, Philippines"] = {wp = "%l (จังหวัด)"},
["Basilan, Philippines"] = {},
["Bataan, Philippines"] = {},
["Batanes, Philippines"] = {},
["Batangas, Philippines"] = {},
["Benguet, Philippines"] = {},
["Biliran, Philippines"] = {},
["Bohol, Philippines"] = {},
["Bukidnon, Philippines"] = {},
["Bulacan, Philippines"] = {},
["Cagayan, Philippines"] = {},
["Camarines Norte, Philippines"] = {},
["Camarines Sur, Philippines"] = {},
["Camiguin, Philippines"] = {},
["Capiz, Philippines"] = {},
["Catanduanes, Philippines"] = {},
["Cavite, Philippines"] = {},
["Cebu, Philippines"] = {},
["Cotabato, Philippines"] = {},
["Davao de Oro, Philippines"] = {},
["Davao del Norte, Philippines"] = {},
["Davao del Sur, Philippines"] = {},
["Davao Occidental, Philippines"] = {},
["Davao Oriental, Philippines"] = {},
["Dinagat Islands, Philippines"] = {the = true},
["Eastern Samar, Philippines"] = {},
["Guimaras, Philippines"] = {},
["Ifugao, Philippines"] = {},
["Ilocos Norte, Philippines"] = {},
["Ilocos Sur, Philippines"] = {},
["Iloilo, Philippines"] = {},
["Isabela, Philippines"] = {wp = "%l (จังหวัด)"},
["Kalinga, Philippines"] = {wp = "%l (จังหวัด)"},
["La Union, Philippines"] = {},
["Laguna, Philippines"] = {wp = "%l (จังหวัด)"},
["Lanao del Norte, Philippines"] = {},
["Lanao del Sur, Philippines"] = {},
["Leyte, Philippines"] = {wp = "%l (จังหวัด)"},
["Maguindanao del Norte, Philippines"] = {},
["Maguindanao del Sur, Philippines"] = {},
["Marinduque, Philippines"] = {},
["Masbate, Philippines"] = {},
["Misamis Occidental, Philippines"] = {},
["Misamis Oriental, Philippines"] = {},
["Mountain Province, Philippines"] = {},
["Negros Occidental, Philippines"] = {},
["Negros Oriental, Philippines"] = {},
["Northern Samar, Philippines"] = {},
["Nueva Ecija, Philippines"] = {},
["Nueva Vizcaya, Philippines"] = {},
["Occidental Mindoro, Philippines"] = {},
["Oriental Mindoro, Philippines"] = {},
["Palawan, Philippines"] = {},
["Pampanga, Philippines"] = {},
["Pangasinan, Philippines"] = {},
["Quezon, Philippines"] = {},
["Quirino, Philippines"] = {},
["Rizal, Philippines"] = {wp = "%l (จังหวัด)"},
["Romblon, Philippines"] = {},
["Samar, Philippines"] = {wp = "%l (จังหวัด)"},
["Sarangani, Philippines"] = {},
["Siquijor, Philippines"] = {},
["Sorsogon, Philippines"] = {},
["South Cotabato, Philippines"] = {},
["Southern Leyte, Philippines"] = {},
["Sultan Kudarat, Philippines"] = {},
["Sulu, Philippines"] = {},
["Surigao del Norte, Philippines"] = {},
["Surigao del Sur, Philippines"] = {},
["Tarlac, Philippines"] = {},
["Tawi-Tawi, Philippines"] = {},
["Zambales, Philippines"] = {},
["Zamboanga del Norte, Philippines"] = {},
["Zamboanga del Sur, Philippines"] = {},
["Zamboanga Sibugay, Philippines"] = {},
-- not a province but treated as one; allow it to be referred to as a province in holonyms
["Metro Manila, Philippines"] = {placetype = {"ภูมิภาค", "จังหวัด"}},
}
-- provinces of the Philippines
export.philippines_group = {
default_container = "Philippines",
default_placetype = "จังหวัด",
default_divs = {"เทศบาล", "barangays"},
data = export.philippines_provinces,
}
export.poland_voivodeships = {
["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław
["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal)
["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin
["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal)
["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź
["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true},
["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków
["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw
["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole
["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów
["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok
["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk
["Silesian Voivodeship, Poland"] = {}, -- abbr SL, code 24, capital Katowice
["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn
["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań
["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin
}
-- voivodeships of Poland
export.poland_group = {
key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
default_container = "Poland",
default_placetype = "voivodeship",
default_divs = {
-- "เทศมณฑล", -- not enough of them currently
{type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}},
},
data = export.poland_voivodeships,
}
export.portugal_districts_and_autonomous_regions = {
["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "ภูมิภาค"}},
["Aveiro District, Portugal"] = {},
["Beja District, Portugal"] = {},
["Braga District, Portugal"] = {},
["Bragança District, Portugal"] = {},
["Castelo Branco District, Portugal"] = {},
["Coimbra District, Portugal"] = {},
["Évora District, Portugal"] = {},
["Faro District, Portugal"] = {},
["Guarda District, Portugal"] = {},
["Leiria District, Portugal"] = {},
["Lisbon District, Portugal"] = {},
["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true},
["Madeira, Portugal"] = {placetype = {"autonomous region", "ภูมิภาค"}},
["Portalegre District, Portugal"] = {},
["Porto District, Portugal"] = {},
["Santarém District, Portugal"] = {},
["Setúbal District, Portugal"] = {},
["Viana do Castelo District, Portugal"] = {},
["Vila Real District, Portugal"] = {},
["Viseu District, Portugal"] = {},
}
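-- Convert a user-specified Portuguese placename to a key in `export.portugal_districts_and_autonomous_regions`.
-- The autonomous regions ("Azores", "Madeira") take no " District" suffix; any other placename gets " District"
-- appended unless it already ends in it, e.g. "Porto" and "Porto District" both map to "Porto District, Portugal".
local function portugal_placename_to_key(placename)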
if placename == "Azores" or placename == "Madeira" then
return placename .. ", Portugal"
end
if placename:find(" District$") then
return placename .. ", Portugal"
end
return placename .. " District, Portugal"
end
-- districts and autonomous regions of Portugal
export.portugal_group = {
key_to_placename = make_key_to_placename(", Portugal$", " District$"),
placename_to_key = portugal_placename_to_key,
default_container = "Portugal",
default_placetype = "district",
default_divs = "เทศบาล",
data = export.portugal_districts_and_autonomous_regions,
}
export.romania_counties = {
["Alba County, Romania"] = {},
["Arad County, Romania"] = {},
["Argeș County, Romania"] = {},
["Bacău County, Romania"] = {},
["Bihor County, Romania"] = {},
["Bistrița-Năsăud County, Romania"] = {},
["Botoșani County, Romania"] = {},
["Brașov County, Romania"] = {},
["Brăila County, Romania"] = {},
-- Bucharest: not in a county
["Buzău County, Romania"] = {},
["Caraș-Severin County, Romania"] = {},
["Cluj County, Romania"] = {},
["Constanța County, Romania"] = {},
["Covasna County, Romania"] = {},
["Călărași County, Romania"] = {},
["Dolj County, Romania"] = {},
["Dâmbovița County, Romania"] = {},
["Galați County, Romania"] = {},
["Giurgiu County, Romania"] = {},
["Gorj County, Romania"] = {},
["Harghita County, Romania"] = {},
["Hunedoara County, Romania"] = {},
["Ialomița County, Romania"] = {},
["Iași County, Romania"] = {},
["Ilfov County, Romania"] = {},
["Maramureș County, Romania"] = {},
["Mehedinți County, Romania"] = {},
["Mureș County, Romania"] = {},
["Neamț County, Romania"] = {},
["Olt County, Romania"] = {},
["Prahova County, Romania"] = {},
["Satu Mare County, Romania"] = {},
["Sibiu County, Romania"] = {},
["Suceava County, Romania"] = {},
["Sălaj County, Romania"] = {},
["Teleorman County, Romania"] = {},
["Timiș County, Romania"] = {},
["Tulcea County, Romania"] = {},
["Vaslui County, Romania"] = {},
["Vrancea County, Romania"] = {},
["Vâlcea County, Romania"] = {},
}
-- counties of Romania
export.romania_group = {
key_to_placename = make_key_to_placename(", Romania$", " County$"),
placename_to_key = make_placename_to_key(", Romania", " County"),
default_container = "Romania",
default_placetype = "เทศมณฑล",
default_divs = "communes",
data = export.romania_counties,
}
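-- Construct the spec for a Russian federal subject of type `spectype` (e.g. "krai", "oblast", "republic").
-- `use_the` is coerced to a boolean and says whether the placename takes "the"; `wp`, if given, is the Wikipedia
-- article spec. The bare category parents are "federal subjects" and the plural of `spectype`, formed by adding "s".
local function make_russia_federal_subject_spec(spectype, use_the, wp)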
return {
placetype = spectype,
the = not not use_the,
bare_category_parent_type = {"federal subjects", spectype .. "s"},
wp = wp,
}
end
local russia_autonomous_okrug_no_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}}
local russia_autonomous_okrug_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"},
the = true}
local russia_krai = make_russia_federal_subject_spec("krai")
local russia_oblast = make_russia_federal_subject_spec("oblast")
local russia_republic_the = make_russia_federal_subject_spec("republic", "use the")
local russia_republic_no_the = make_russia_federal_subject_spec("republic")
export.russia_federal_subjects = {
-- autonomous oblasts
["Jewish Autonomous Oblast, Russia"] =
{the = true, placetype = {"autonomous oblast", "oblast"},
bare_category_parent_type = {"federal subjects", "autonomous oblasts"}},
-- autonomous okrugs
["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"},
["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous Okrug, Russia"},
["Yamalo-Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"},
-- krais
["Altai Krai, Russia"] = russia_krai,
["Kamchatka Krai, Russia"] = russia_krai,
["Khabarovsk Krai, Russia"] = russia_krai,
["Krasnodar Krai, Russia"] = russia_krai,
["Krasnoyarsk Krai, Russia"] = russia_krai,
["Perm Krai, Russia"] = russia_krai,
["Primorsky Krai, Russia"] = russia_krai,
["Stavropol Krai, Russia"] = russia_krai,
["Zabaykalsky Krai, Russia"] = russia_krai,
-- oblasts
["Amur Oblast, Russia"] = russia_oblast,
["Arkhangelsk Oblast, Russia"] = russia_oblast,
["Astrakhan Oblast, Russia"] = russia_oblast,
["Belgorod Oblast, Russia"] = russia_oblast,
["Bryansk Oblast, Russia"] = russia_oblast,
["Chelyabinsk Oblast, Russia"] = russia_oblast,
["Irkutsk Oblast, Russia"] = russia_oblast,
["Ivanovo Oblast, Russia"] = russia_oblast,
["Kaliningrad Oblast, Russia"] = russia_oblast,
["Kaluga Oblast, Russia"] = russia_oblast,
["Kemerovo Oblast, Russia"] = russia_oblast,
["Kirov Oblast, Russia"] = russia_oblast,
["Kostroma Oblast, Russia"] = russia_oblast,
["Kurgan Oblast, Russia"] = russia_oblast,
["Kursk Oblast, Russia"] = russia_oblast,
["Leningrad Oblast, Russia"] = russia_oblast,
["Lipetsk Oblast, Russia"] = russia_oblast,
["Magadan Oblast, Russia"] = russia_oblast,
["Moscow Oblast, Russia"] = russia_oblast,
["Murmansk Oblast, Russia"] = russia_oblast,
["Nizhny Novgorod Oblast, Russia"] = russia_oblast,
["Novgorod Oblast, Russia"] = russia_oblast,
["Novosibirsk Oblast, Russia"] = russia_oblast,
["Omsk Oblast, Russia"] = russia_oblast,
["Orenburg Oblast, Russia"] = russia_oblast,
["Oryol Oblast, Russia"] = russia_oblast,
["Penza Oblast, Russia"] = russia_oblast,
["Pskov Oblast, Russia"] = russia_oblast,
["Rostov Oblast, Russia"] = russia_oblast,
["Ryazan Oblast, Russia"] = russia_oblast,
["Sakhalin Oblast, Russia"] = russia_oblast,
["Samara Oblast, Russia"] = russia_oblast,
["Saratov Oblast, Russia"] = russia_oblast,
["Smolensk Oblast, Russia"] = russia_oblast,
["Sverdlovsk Oblast, Russia"] = russia_oblast,
["Tambov Oblast, Russia"] = russia_oblast,
["Tomsk Oblast, Russia"] = russia_oblast,
["Tula Oblast, Russia"] = russia_oblast,
["Tver Oblast, Russia"] = russia_oblast,
["Tyumen Oblast, Russia"] = russia_oblast,
["Ulyanovsk Oblast, Russia"] = russia_oblast,
["Vladimir Oblast, Russia"] = russia_oblast,
["Volgograd Oblast, Russia"] = russia_oblast,
["Vologda Oblast, Russia"] = russia_oblast,
["Voronezh Oblast, Russia"] = russia_oblast,
["Yaroslavl Oblast, Russia"] = russia_oblast,
-- republics
--
-- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where
-- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by
-- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence
-- of "the".
["Adygea, Russia"] = russia_republic_no_the,
["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true},
["Bashkortostan, Russia"] = russia_republic_no_the,
["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true},
["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"},
["Buryatia, Russia"] = russia_republic_no_the,
["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true},
["Dagestan, Russia"] = russia_republic_no_the,
["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true},
["Ingushetia, Russia"] = russia_republic_no_the,
["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true},
["Kalmykia, Russia"] = russia_republic_no_the,
["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true},
["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"),
["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true},
["Khakassia, Russia"] = russia_republic_no_the,
["Republic of Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true},
["Mordovia, Russia"] = russia_republic_no_the,
["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true},
["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash
["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true},
["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Tatarstan, Russia"] = russia_republic_no_the,
["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true},
["Altai Republic, Russia"] = russia_republic_the,
["Chechnya, Russia"] = russia_republic_no_the,
["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true},
["Chuvashia, Russia"] = russia_republic_no_the,
["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true},
["Kabardino-Balkaria, Russia"] = russia_republic_no_the,
["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true},
["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true},
["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia",
display = "Kabardino-Balkarian Republic, Russia", the = true},
["Karachay-Cherkessia, Russia"] = russia_republic_no_the,
["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia", the = true},
["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"),
["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true},
["Mari El, Russia"] = russia_republic_no_the,
["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true},
["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"),
["Sakha Republic, Russia"] = {alias_of = "Sakha, Russia", the = true},
["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia",
the = true},
["Tuva, Russia"] = russia_republic_no_the,
["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
["Udmurtia, Russia"] = russia_republic_no_the,
["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
-- Not included due to being unrecognized and only partly controlled:
-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)"),
-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
-- There are also federal cities (not included because they're cities):
-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}
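-- Convert a key in `export.russia_federal_subjects` to two values: the full placename and the elliptical placename.
-- The container (", Russia") is stripped first. For krais and oblasts, the elliptical placename drops the trailing
-- "Krai" or "Oblast", e.g. "Samara Oblast, Russia" yields "Samara Oblast" and "Samara". For everything else,
-- including the Jewish Autonomous Oblast, both values are the full placename.
local function russia_key_to_placename(key)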
key = key:gsub(",.*", "")
local full_placename = key
if key == "Jewish Autonomous Oblast" then
return full_placename, full_placename
end
local elliptical_placename
for _, suffix in ipairs({"Krai", "Oblast"}) do
elliptical_placename = key:match("^(.*) " .. suffix .. "$")
if elliptical_placename then
return full_placename, elliptical_placename
end
end
return full_placename, full_placename
end
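-- Convert a user-specified Russian placename to a key in `export.russia_federal_subjects`. An exact match on
-- "<placename>, Russia" wins; otherwise try adding "Krai" and then "Oblast". If nothing matches, return
-- "<placename>, Russia" unchanged.
local function russia_placename_to_key(placename)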
local key = placename .. ", Russia"
if export.russia_federal_subjects[key] then
return key
end
-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
for _, suffix in ipairs({"Krai", "Oblast"}) do
local suffixed_key = placename .. " " .. suffix .. ", Russia"
if export.russia_federal_subjects[suffixed_key] then
return suffixed_key
end
end
return placename .. ", Russia"
end
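-- Construct the key description for a Russian federal subject: the linked placename followed by
-- ", a [[federal subject]] ([[<placetype>]]) of [[Russia]]". If `spec.placetype` is a list, its first element is used.
-- `group` is not used here.
local function construct_russia_federal_subject_keydesc(group, key, spec)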
local placename = key:gsub(",.*", "")
local linked_placename = export.construct_linked_placename(spec, placename)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if placetype == "oblast" then
-- Hack: Oblasts generally don't have entries under "Foo Oblast"
-- but just under "Foo", so fix the linked key appropriately;
-- doesn't apply to the Jewish Autonomous Oblast
linked_placename = linked_placename:gsub(" Oblast%]%]", "]] Oblast")
end
return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end
-- federal subjects of Russia
export.russia_group = {
key_to_placename = russia_key_to_placename,
placename_to_key = russia_placename_to_key,
default_container = "Russia",
default_keydesc = construct_russia_federal_subject_keydesc,
default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
data = export.russia_federal_subjects,
}
export.saudi_arabia_provinces = {
["Riyadh Province, Saudi Arabia"] = {},
["Mecca Province, Saudi Arabia"] = {},
-- Name is too generic to assume it's in Saudi Arabia if not specified.
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"},
["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"},
["Aseer Province, Saudi Arabia"] = {wp = "Asir"},
["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true},
["Jazan Province, Saudi Arabia"] = {},
["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"},
["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true},
["Tabuk Province, Saudi Arabia"] = {},
["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"},
["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"},
["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true},
["Najran Province, Saudi Arabia"] = {},
["Northern Borders Province, Saudi Arabia"] = {},
["Al-Bahah Province, Saudi Arabia"] = {},
}
-- provinces of Saudi Arabia
export.saudi_arabia_group = {
key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"),
placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"),
default_container = "Saudi Arabia",
default_placetype = "จังหวัด",
data = export.saudi_arabia_provinces,
}
export.south_africa_provinces = {
["Eastern Cape, South Africa"] = {the = true},
["Free State, South Africa"] = {the = true, wp = "%l (จังหวัด)"},
["Gauteng, South Africa"] = {},
["KwaZulu-Natal, South Africa"] = {},
["Limpopo, South Africa"] = {},
["Mpumalanga, South Africa"] = {},
-- per Wikipedia and other sources, `North West` doesn't normally have `the` before it
["North West, South Africa"] = {wp = "%l (South African province)"},
["Northern Cape, South Africa"] = {the = true},
["Western Cape, South Africa"] = {the = true},
}
-- provinces of South Africa
export.south_africa_group = {
default_container = "South Africa",
default_placetype = "จังหวัด",
default_divs = "เทศบาล",
data = export.south_africa_provinces,
}
export.south_korea_provinces = {
["North Chungcheong Province, South Korea"] = {},
["South Chungcheong Province, South Korea"] = {},
["Gangwon Province, South Korea"] = {wp = "%l, %c"},
["Gyeonggi Province, South Korea"] = {},
["North Gyeongsang Province, South Korea"] = {},
["South Gyeongsang Province, South Korea"] = {},
["North Jeolla Province, South Korea"] = {},
["South Jeolla Province, South Korea"] = {},
["Jeju Province, South Korea"] = {},
}
-- provinces of South Korea
export.south_korea_group = {
key_to_placename = make_key_to_placename(", South Korea$", " Province$"),
placename_to_key = make_placename_to_key(", South Korea", " Province"),
default_container = "South Korea",
default_placetype = "จังหวัด",
data = export.south_korea_provinces,
}
export.spain_autonomous_communities = {
["Andalusia, Spain"] = {},
["Aragon, Spain"] = {},
["Asturias, Spain"] = {},
["Balearic Islands, Spain"] = {the = true},
["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"},
["Canary Islands, Spain"] = {the = true},
["Cantabria, Spain"] = {},
["Castile and León, Spain"] = {},
["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash
["Catalonia, Spain"] = {},
["Community of Madrid, Spain"] = {the = true},
["Extremadura, Spain"] = {},
["Galicia, Spain"] = {wp = "%l (Spain)"},
["La Rioja, Spain"] = {},
["Murcia, Spain"] = {wp = "Region of %l"},
["Navarre, Spain"] = {},
["Valencia, Spain"] = {wp = "Valencian Community"},
["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true},
}
-- autonomous communities of Spain
export.spain_group = {
default_container = "Spain",
default_placetype = "autonomous community",
default_divs = {"เทศบาล", "comarcas"},
data = export.spain_autonomous_communities,
}
export.taiwan_counties = {
["จางฮว่า, ไต้หวัน"] = {},
["เจียอี้, ไต้หวัน"] = {},
["ซินจู๋, ไต้หวัน"] = {},
["ฮวาเหลียน, ไต้หวัน"] = {},
["จินเหมิน, ไต้หวัน"] = {wp = "หมู่เกาะจินเหมิน"},
["เหลียนเจียง, ไต้หวัน"] = {wp = "หมู่เกาะหมาจู่"},
["เหมียวลี่, ไต้หวัน"] = {},
["หนานโถว, ไต้หวัน"] = {},
["เผิงหู, ไต้หวัน"] = {wp = "เผิงหู"},
["ผิงตง, ไต้หวัน"] = {},
["ไถตง, ไต้หวัน"] = {},
["อี๋หลาน, ไต้หวัน"] = {wp = "%l, %c"},
["ยฺหวินหลิน, ไต้หวัน"] = {},
}
-- counties of Taiwan
export.taiwan_group = {
key_to_placename = make_key_to_placename(", ไต้หวัน$"),
placename_to_key = make_placename_to_key(", ไต้หวัน"),
default_container = "ไต้หวัน",
default_placetype = "เทศมณฑล",
default_divs = {"อำเภอ", "townships"},
data = export.taiwan_counties,
}
export.thailand_provinces = { -- keys omit the "จังหวัด" (province) prefix
-- กรุงเทพมหานคร (Bangkok) is a special administrative area, not a province, so it is not listed here
["อำนาจเจริญ, ไทย"] = {},
["อ่างทอง, ไทย"] = {},
["บึงกาฬ, ไทย"] = {},
["บุรีรัมย์, ไทย"] = {},
["ฉะเชิงเทรา, ไทย"] = {},
["ชัยนาท, ไทย"] = {},
["ชัยภูมิ, ไทย"] = {},
["จันทบุรี, ไทย"] = {},
["เชียงใหม่, ไทย"] = {},
["เชียงราย, ไทย"] = {},
["ชลบุรี, ไทย"] = {},
["ชุมพร, ไทย"] = {},
["กาฬสินธุ์, ไทย"] = {},
["กำแพงเพชร, ไทย"] = {},
["กาญจนบุรี, ไทย"] = {},
["ขอนแก่น, ไทย"] = {},
["กระบี่, ไทย"] = {},
["ลำปาง, ไทย"] = {},
["ลำพูน, ไทย"] = {},
["เลย, ไทย"] = {},
["ลพบุรี, ไทย"] = {},
["แม่ฮ่องสอน, ไทย"] = {},
["มหาสารคาม, ไทย"] = {},
["มุกดาหาร, ไทย"] = {},
["นครนายก, ไทย"] = {},
["นครปฐม, ไทย"] = {},
["นครพนม, ไทย"] = {},
["นครราชสีมา, ไทย"] = {},
["นครสวรรค์, ไทย"] = {},
["นครศรีธรรมราช, ไทย"] = {},
["น่าน, ไทย"] = {},
["นราธิวาส, ไทย"] = {},
["หนองบัวลำภู, ไทย"] = {},
["หนองคาย, ไทย"] = {},
["นนทบุรี, ไทย"] = {},
["ปทุมธานี, ไทย"] = {},
["ปัตตานี, ไทย"] = {},
["พังงา, ไทย"] = {},
["พัทลุง, ไทย"] = {},
["พะเยา, ไทย"] = {},
["เพชรบูรณ์, ไทย"] = {},
["เพชรบุรี, ไทย"] = {},
["พิจิตร, ไทย"] = {},
["พิษณุโลก, ไทย"] = {},
["พระนครศรีอยุธยา, ไทย"] = {},
["แพร่, ไทย"] = {},
["ภูเก็ต, ไทย"] = {},
["ปราจีนบุรี, ไทย"] = {},
["ประจวบคีรีขันธ์, ไทย"] = {},
["ระนอง, ไทย"] = {},
["ราชบุรี, ไทย"] = {},
["ระยอง, ไทย"] = {},
["ร้อยเอ็ด, ไทย"] = {},
["สระแก้ว, ไทย"] = {},
["สกลนคร, ไทย"] = {},
["สมุทรปราการ, ไทย"] = {},
["สมุทรสาคร, ไทย"] = {},
["สมุทรสงคราม, ไทย"] = {},
["สระบุรี, ไทย"] = {},
["สตูล, ไทย"] = {},
["สิงห์บุรี, ไทย"] = {},
["ศรีสะเกษ, ไทย"] = {},
["สงขลา, ไทย"] = {},
["สุโขทัย, ไทย"] = {},
["สุพรรณบุรี, ไทย"] = {},
["สุราษฎร์ธานี, ไทย"] = {},
["สุรินทร์, ไทย"] = {},
["ตาก, ไทย"] = {},
["ตรัง, ไทย"] = {},
["ตราด, ไทย"] = {},
["อุบลราชธานี, ไทย"] = {},
["อุดรธานี, ไทย"] = {},
["อุทัยธานี, ไทย"] = {},
["อุตรดิตถ์, ไทย"] = {},
["ยะลา, ไทย"] = {},
["ยโสธร, ไทย"] = {},
}
-- provinces of Thailand
export.thailand_group = {
key_to_placename = make_key_to_placename(", ไทย$"), -- no "จังหวัด" (province) prefix to strip from keys
placename_to_key = make_placename_to_key(", ไทย"),
default_container = "ไทย",
default_placetype = "จังหวัด",
default_divs = "อำเภอ",
-- Wikipedia articles on these provinces are titled with the prefix 'จังหวัด' (province) before the bare name
default_wp = "จังหวัด%e",
data = export.thailand_provinces,
}
export.turkey_provinces = {
["Adana Province, Turkey"] = {}, -- code 01
["Adıyaman Province, Turkey"] = {}, -- code 02
["Afyonkarahisar Province, Turkey"] = {}, -- code 03
["Ağrı Province, Turkey"] = {}, -- code 04
["Amasya Province, Turkey"] = {}, -- code 05
["Ankara Province, Turkey"] = {}, -- code 06
["Antalya Province, Turkey"] = {}, -- code 07
["Artvin Province, Turkey"] = {}, -- code 08
["Aydın Province, Turkey"] = {}, -- code 09
["Balıkesir Province, Turkey"] = {}, -- code 10
["Bilecik Province, Turkey"] = {}, -- code 11
["Bingöl Province, Turkey"] = {}, -- code 12
["Bitlis Province, Turkey"] = {}, -- code 13
["Bolu Province, Turkey"] = {}, -- code 14
["Burdur Province, Turkey"] = {}, -- code 15
["Bursa Province, Turkey"] = {}, -- code 16
["Çanakkale Province, Turkey"] = {}, -- code 17
["Çankırı Province, Turkey"] = {}, -- code 18
["Çorum Province, Turkey"] = {}, -- code 19
["Denizli Province, Turkey"] = {}, -- code 20
["Diyarbakır Province, Turkey"] = {}, -- code 21
["Edirne Province, Turkey"] = {}, -- code 22
["Elazığ Province, Turkey"] = {}, -- code 23
["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true},
["Erzincan Province, Turkey"] = {}, -- code 24
["Erzurum Province, Turkey"] = {}, -- code 25
["Eskişehir Province, Turkey"] = {}, -- code 26
["Gaziantep Province, Turkey"] = {}, -- code 27
["Giresun Province, Turkey"] = {}, -- code 28
["Gümüşhane Province, Turkey"] = {}, -- code 29
["Hakkâri Province, Turkey"] = {}, -- code 30
["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true},
["Hatay Province, Turkey"] = {}, -- code 31
["Isparta Province, Turkey"] = {}, -- code 32
["Mersin Province, Turkey"] = {}, -- code 33
-- ["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself
["İzmir Province, Turkey"] = {}, -- code 35
["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true},
["Kars Province, Turkey"] = {}, -- code 36
["Kastamonu Province, Turkey"] = {}, -- code 37
["Kayseri Province, Turkey"] = {}, -- code 38
["Kırklareli Province, Turkey"] = {}, -- code 39
["Kırşehir Province, Turkey"] = {}, -- code 40
["Kocaeli Province, Turkey"] = {}, -- code 41
["Konya Province, Turkey"] = {}, -- code 42
["Kütahya Province, Turkey"] = {}, -- code 43
["Malatya Province, Turkey"] = {}, -- code 44
["Manisa Province, Turkey"] = {}, -- code 45
["Kahramanmaraş Province, Turkey"] = {}, -- code 46
["Mardin Province, Turkey"] = {}, -- code 47
["Muğla Province, Turkey"] = {}, -- code 48
["Muş Province, Turkey"] = {}, -- code 49
["Nevşehir Province, Turkey"] = {}, -- code 50
["Niğde Province, Turkey"] = {}, -- code 51
["Ordu Province, Turkey"] = {}, -- code 52
["Rize Province, Turkey"] = {}, -- code 53
["Sakarya Province, Turkey"] = {}, -- code 54
["Samsun Province, Turkey"] = {}, -- code 55
["Siirt Province, Turkey"] = {}, -- code 56
["Sinop Province, Turkey"] = {}, -- code 57
["Sivas Province, Turkey"] = {}, -- code 58
["Tekirdağ Province, Turkey"] = {}, -- code 59
["Tokat Province, Turkey"] = {}, -- code 60
["Trabzon Province, Turkey"] = {}, -- code 61
["Tunceli Province, Turkey"] = {}, -- code 62
["Şanlıurfa Province, Turkey"] = {}, -- code 63
["Uşak Province, Turkey"] = {}, -- code 64
["Van Province, Turkey"] = {}, -- code 65
["Yozgat Province, Turkey"] = {}, -- code 66
["Zonguldak Province, Turkey"] = {}, -- code 67
["Aksaray Province, Turkey"] = {}, -- code 68
["Bayburt Province, Turkey"] = {}, -- code 69
["Karaman Province, Turkey"] = {}, -- code 70
["Kırıkkale Province, Turkey"] = {}, -- code 71
["Batman Province, Turkey"] = {}, -- code 72
["Şırnak Province, Turkey"] = {}, -- code 73
["Bartın Province, Turkey"] = {}, -- code 74
["Ardahan Province, Turkey"] = {}, -- code 75
["Iğdır Province, Turkey"] = {}, -- code 76
["Yalova Province, Turkey"] = {}, -- code 77
["Karabük Province, Turkey"] = {}, -- code 78
["Kilis Province, Turkey"] = {}, -- code 79
["Osmaniye Province, Turkey"] = {}, -- code 80
["Düzce Province, Turkey"] = {}, -- code 81
}
-- provinces of Turkey
export.turkey_group = {
key_to_placename = make_key_to_placename(", Turkey$", " Province$"),
placename_to_key = make_placename_to_key(", Turkey", " Province"),
default_container = "Turkey",
default_placetype = "จังหวัด",
default_divs = "อำเภอ",
data = export.turkey_provinces,
}
export.ukraine_oblasts = {
["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA
["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB
["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE
-- apparently will be renamed to 'Dnipro Oblast'
["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE
["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH
["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT
["Kharkiv Oblast, Ukraine"] = {}, -- capital [[Kharkiv]], license plate prefix AX, KX
["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT''
["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX
-- apparently will be renamed to 'Kropyvnytskyi Oblast'
["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA
["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI
["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true},
["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB
["Lviv Oblast, Ukraine"] = {}, -- capital [[Lviv]], license plate prefix BC, HC
["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE
["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH
["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true},
["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI
["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK
["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM
["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO
["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB
["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC
["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO
["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP
["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true},
["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM
}
-- oblasts of Ukraine
export.ukraine_group = {
key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"),
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"),
default_container = "Ukraine",
default_placetype = "oblast",
default_divs = {"raions", "hromadas"},
data = export.ukraine_oblasts,
}
export.united_kingdom_constituent_countries = {
["England"] = {divs = {
"เทศมณฑล",
"อำเภอ",
{type = "local government districts", cat_as = "อำเภอ"},
{
type = "local government districts with borough status",
cat_as = {"อำเภอ", "boroughs"},
},
{type = "boroughs", cat_as = {"อำเภอ", "boroughs"}},
{type = "civil parishes", container_parent_type = false},
}},
["Northern Ireland"] = {
placetype = {"constituent country", "จังหวัด", "ประเทศ"},
divs = {"เทศมณฑล", "อำเภอ"},
},
["Scotland"] = {divs = {
{type = "council areas", container_parent_type = false},
"อำเภอ",
}},
["Wales"] = {divs = {
"เทศมณฑล",
{type = "county boroughs", container_parent_type = false},
{type = "communities", container_parent_type = false},
{type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}},
}},
}
-- constituent countries and provinces of the United Kingdom
export.united_kingdom_group = {
placename_to_key = false,
default_container = "สหราชอาณาจักร",
default_placetype = {"constituent country", "ประเทศ"},
addl_divs = {
"traditional counties",
{type = "historical counties", cat_as = "traditional counties"},
},
-- Don't create categories like 'Category:en:Towns in the United Kingdom'
-- or 'Category:en:Places in the United Kingdom'.
default_no_container_cat = true,
data = export.united_kingdom_constituent_countries,
}
export.england_counties = {
-- NOTE: Earlier versions kept various other "no longer" counties commented out; these appear to have been counties
-- that existed officially at some point between 1889 and 1974, and they have been removed. The only commented-out
-- entries kept are the three ceremonial counties that existed from 1974 (when ceremonial counties were created) to
-- 1996, and those still considered "historic counties" per [[w:Historic counties of England]].
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Bedfordshire, England"] = {},
["Berkshire, England"] = {},
-- ["Brighton and Hove, England"] = {}, -- city
-- ["Bristol, England"] = {}, -- city
["Buckinghamshire, England"] = {},
["Cambridgeshire, England"] = {},
["Cheshire, England"] = {},
-- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Cornwall, England"] = {},
-- ["Cumberland, England"] = {}, -- no longer (historic county)
["Cumbria, England"] = {},
["Derbyshire, England"] = {},
["Devon, England"] = {},
["Dorset, England"] = {},
["County Durham, England"] = {},
["East Sussex, England"] = {},
["Essex, England"] = {},
["Gloucestershire, England"] = {},
["Greater London, England"] = {},
["Greater Manchester, England"] = {},
["Hampshire, England"] = {},
["Herefordshire, England"] = {},
["Hertfordshire, England"] = {},
-- ["Humberside, England"] = {}, -- no longer (1974 to 1996)
-- ["Huntingdonshire, England"] = {}, -- no longer (historic county)
["Isle of Wight, England"] = {the = true},
["Kent, England"] = {},
["Lancashire, England"] = {},
["Leicestershire, England"] = {},
["Lincolnshire, England"] = {},
["Merseyside, England"] = {},
-- ["Middlesex, England"] = {}, -- no longer (historic county)
["Norfolk, England"] = {},
["Northamptonshire, England"] = {},
["Northumberland, England"] = {},
["North Yorkshire, England"] = {},
["Nottinghamshire, England"] = {},
["Oxfordshire, England"] = {},
["Rutland, England"] = {},
["Shropshire, England"] = {},
["Somerset, England"] = {},
["South Humberside, England"] = {},
["South Yorkshire, England"] = {},
["Staffordshire, England"] = {},
["Suffolk, England"] = {},
["Surrey, England"] = {},
-- ["Sussex, England"] = {}, -- no longer (historic county)
["Tyne and Wear, England"] = {},
["Warwickshire, England"] = {},
["West Midlands, England"] = {the = true, wp = "%l (county)"},
-- ["Westmorland, England"] = {}, -- no longer (historic county)
["West Sussex, England"] = {},
["West Yorkshire, England"] = {},
["Wiltshire, England"] = {},
["Worcestershire, England"] = {},
-- ["Yorkshire, England"] = {}, -- no longer (historic county)
["East Riding of Yorkshire, England"] = {the = true},
}
-- counties of England
export.england_group = {
default_container = {key = "England", placetype = "constituent country"},
default_placetype = "เทศมณฑล",
default_divs = {
"อำเภอ",
{type = "local government districts", cat_as = "อำเภอ"},
{
type = "local government districts with borough status",
cat_as = {"อำเภอ", "boroughs"},
},
{type = "boroughs", cat_as = {"อำเภอ", "boroughs"}},
"civil parishes",
},
data = export.england_counties,
}
export.northern_ireland_counties = {
["County Antrim, Northern Ireland"] = {},
["County Armagh, Northern Ireland"] = {},
["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"},
["County Down, Northern Ireland"] = {},
["County Fermanagh, Northern Ireland"] = {},
["County Londonderry, Northern Ireland"] = {},
["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"},
["County Tyrone, Northern Ireland"] = {},
}
-- counties of Northern Ireland
export.northern_ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"),
default_container = {key = "Northern Ireland", placetype = "constituent country"},
default_placetype = "เทศมณฑล",
data = export.northern_ireland_counties,
}
export.scotland_council_areas = {
["Aberdeenshire, Scotland"] = {},
["Angus, Scotland"] = {wp = "%l, %c"},
["Argyll and Bute, Scotland"] = {},
["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"},
["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"},
["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"},
["City of Dundee, Scotland"] = {the = true, wp = "Dundee"},
["Dundee"] = {alias_of = "City of Dundee, Scotland"},
["Dundee City"] = {alias_of = "City of Dundee, Scotland"},
["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"},
["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"},
["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"},
["Glasgow"] = {alias_of = "City of Glasgow, Scotland"},
["Clackmannanshire, Scotland"] = {},
["Dumfries and Galloway, Scotland"] = {},
["East Ayrshire, Scotland"] = {},
["East Dunbartonshire, Scotland"] = {},
["East Lothian, Scotland"] = {},
["East Renfrewshire, Scotland"] = {},
["Falkirk, Scotland"] = {wp = "%l council area"},
["Fife, Scotland"] = {},
["Highland, Scotland"] = {wp = "%l council area"},
["Inverclyde, Scotland"] = {},
["Midlothian, Scotland"] = {},
["Moray, Scotland"] = {},
["North Ayrshire, Scotland"] = {},
["North Lanarkshire, Scotland"] = {},
["Orkney Islands, Scotland"] = {the = true},
["Perth and Kinross, Scotland"] = {},
["Renfrewshire, Scotland"] = {},
["Scottish Borders, Scotland"] = {the = true},
["Shetland Islands, Scotland"] = {the = true},
["South Ayrshire, Scotland"] = {},
["South Lanarkshire, Scotland"] = {},
["Stirling, Scotland"] = {wp = "%l council area"},
["West Dunbartonshire, Scotland"] = {},
["West Lothian, Scotland"] = {},
["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"},
["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"},
}
-- council areas of Scotland
export.scotland_group = {
default_container = {key = "Scotland", placetype = "constituent country"},
default_placetype = "council area",
data = export.scotland_council_areas,
}
export.wales_principal_areas = {
["Blaenau Gwent, Wales"] = {},
["Bridgend, Wales"] = {wp = "%l County Borough"},
["Caerphilly, Wales"] = {wp = "%l County Borough"},
-- ["Cardiff, Wales"] = {placetype = "นคร"},
["Carmarthenshire, Wales"] = {placetype = "เทศมณฑล"},
["Ceredigion, Wales"] = {placetype = "เทศมณฑล"},
["Conwy, Wales"] = {wp = "%l County Borough"},
["Denbighshire, Wales"] = {placetype = "เทศมณฑล"},
["Flintshire, Wales"] = {placetype = "เทศมณฑล"},
["Gwynedd, Wales"] = {placetype = "เทศมณฑล"},
["Isle of Anglesey, Wales"] = {the = true, placetype = "เทศมณฑล"},
["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the"
["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"},
["Monmouthshire, Wales"] = {placetype = "เทศมณฑล"},
["Neath Port Talbot, Wales"] = {},
-- ["Newport, Wales"] = {placetype = "นคร", wp = "%l, %c"},
["Pembrokeshire, Wales"] = {placetype = "เทศมณฑล"},
["Powys, Wales"] = {placetype = "เทศมณฑล"},
["Rhondda Cynon Taf, Wales"] = {},
-- ["Swansea, Wales"] = {placetype = "นคร"},
["Torfaen, Wales"] = {},
["Vale of Glamorgan, Wales"] = {the = true},
["Wrexham, Wales"] = {wp = "%l County Borough"},
}
-- principal areas (cities, counties and county boroughs) of Wales
export.wales_group = {
default_container = {key = "Wales", placetype = "constituent country"},
default_placetype = "county borough",
data = export.wales_principal_areas,
}
export.united_states_states = {
["Alabama, USA"] = {},
["Alaska, USA"] = {divs = {
{type = "boroughs", container_parent_type = "เทศมณฑล"},
{type = "borough seats", container_parent_type = "county seats"},
}},
["Arizona, USA"] = {},
["Arkansas, USA"] = {},
["California, USA"] = {},
["Colorado, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}},
["Connecticut, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}},
["Delaware, USA"] = {},
["Florida, USA"] = {},
["Georgia, USA"] = {wp = "%l (U.S. state)"},
["Hawaii, USA"] = {addl_parents = {"พอลินีเชีย"}},
["Idaho, USA"] = {},
["Illinois, USA"] = {},
["Indiana, USA"] = {},
["Iowa, USA"] = {},
["Kansas, USA"] = {},
["Kentucky, USA"] = {},
["Louisiana, USA"] = {divs = {
{type = "parishes", container_parent_type = "เทศมณฑล"},
{type = "parish seats", container_parent_type = "county seats"},
}},
["Maine, USA"] = {},
["Maryland, USA"] = {},
["Massachusetts, USA"] = {},
["Michigan, USA"] = {},
["Minnesota, USA"] = {},
["Mississippi, USA"] = {},
["Missouri, USA"] = {},
["Montana, USA"] = {},
["Nebraska, USA"] = {},
["Nevada, USA"] = {},
["New Hampshire, USA"] = {},
["New Jersey, USA"] = {divs = {
"เทศมณฑล", "county seats",
{type = "boroughs", prep = "ใน"},
}},
["New Mexico, USA"] = {},
["New York, USA"] = {wp = "%l (รัฐ)"},
["North Carolina, USA"] = {},
["North Dakota, USA"] = {},
["Ohio, USA"] = {},
["Oklahoma, USA"] = {},
["Oregon, USA"] = {},
["Pennsylvania, USA"] = {divs = {
"เทศมณฑล", "county seats",
{type = "boroughs", prep = "ใน"},
}},
["Rhode Island, USA"] = {},
["South Carolina, USA"] = {},
["South Dakota, USA"] = {},
["Tennessee, USA"] = {},
["Texas, USA"] = {},
["Utah, USA"] = {},
["Vermont, USA"] = {},
["Virginia, USA"] = {},
["Washington, USA"] = {wp = "%l (รัฐ)"},
["West Virginia, USA"] = {},
["Wisconsin, USA"] = {},
["Wyoming, USA"] = {},
}
-- states of the United States
export.united_states_group = {
placename_to_key = make_placename_to_key(", USA"),
default_container = "สหรัฐอเมริกา",
default_placetype = "รัฐ",
default_divs = {"เทศมณฑล", "county seats"},
addl_divs = {
{type = "census-designated places", prep = "ใน"},
{type = "unincorporated communities", prep = "ใน"},
},
data = export.united_states_states,
}
export.vietnam_provinces = {
-- [[Northeast (Vietnam)|Northeast]] region
["Bắc Giang, เวียดนาม"] = {}, -- capital [[Bắc Giang]]
["Bắc Kạn, เวียดนาม"] = {}, -- capital [[Bắc Kạn]]
["Cao Bằng, เวียดนาม"] = {}, -- capital [[Cao Bằng]]
["Hà Giang, เวียดนาม"] = {}, -- capital [[Hà Giang]]
["Lạng Sơn, เวียดนาม"] = {}, -- capital [[Lạng Sơn]]
["Phú Thọ, เวียดนาม"] = {}, -- capital [[Việt Trì]]
["Quảng Ninh, เวียดนาม"] = {}, -- capital [[Hạ Long]]
["Thái Nguyên, เวียดนาม"] = {}, -- capital [[Thái Nguyên]]
["Tuyên Quang, เวียดนาม"] = {}, -- capital [[Tuyên Quang]]
-- [[Northwest (Vietnam)|Northwest]] region
["Lào Cai, เวียดนาม"] = {}, -- capital [[Lào Cai]]
["Yên Bái, เวียดนาม"] = {}, -- capital [[Yên Bái]]
["Điện Biên, เวียดนาม"] = {}, -- capital [[Điện Biên Phủ]]
["Hoà Bình, เวียดนาม"] = {}, -- capital [[Hoà Bình City|Hoà Bình]]
["Hòa Bình, เวียดนาม"] = {alias_of = "Hoà Bình, เวียดนาม", display = true},
["Lai Châu, เวียดนาม"] = {}, -- capital [[Lai Châu]]
["Sơn La, เวียดนาม"] = {}, -- capital [[Sơn La]]
-- [[Red River Delta]] region
["Bắc Ninh, เวียดนาม"] = {}, -- capital [[Bắc Ninh]]
["Hà Nam, เวียดนาม"] = {}, -- capital [[Phủ Lý]]
["Hải Dương, เวียดนาม"] = {}, -- capital [[Hải Dương]]
["Hưng Yên, เวียดนาม"] = {}, -- capital [[Hưng Yên]]
["Nam Định, เวียดนาม"] = {}, -- capital [[Nam Định]]
["Ninh Bình, เวียดนาม"] = {}, -- capital [[Ninh Bình|Hoa Lư]]
["Thái Bình, เวียดนาม"] = {}, -- capital [[Thái Bình]]
["Vĩnh Phúc, เวียดนาม"] = {}, -- capital [[Vĩnh Yên]]
-- ["Hanoi"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hoàn Kiếm district]]
-- ["Haiphong"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hồng Bàng district]]
-- [[North Central Coast]] region
["Hà Tĩnh, เวียดนาม"] = {}, -- capital [[Hà Tĩnh]]
["Nghệ An, เวียดนาม"] = {}, -- capital [[Vinh]]
["Quảng Bình, เวียดนาม"] = {}, -- capital [[Đồng Hới]]
["Quảng Trị, เวียดนาม"] = {}, -- capital [[Đông Hà]]
["Thanh Hoá, เวียดนาม"] = {}, -- capital [[Thanh Hoá]]
["Thanh Hóa, เวียดนาม"] = {alias_of = "Thanh Hoá, เวียดนาม", display = true},
-- ["Hue"] = {placetype = {"เทศบาล", "นคร"}, wp = "Huế"}, -- capital [[Thuận Hoá district]]
-- [[Central Highlands (Vietnam)|Central Highlands]] region
["Đắk Lắk, เวียดนาม"] = {}, -- capital [[Buôn Ma Thuột]]
["Đăk Nông, เวียดนาม"] = {}, -- capital [[Gia Nghĩa]]
["Gia Lai, เวียดนาม"] = {}, -- capital [[Pleiku]]
["Kon Tum, เวียดนาม"] = {}, -- capital [[Kon Tum]]
["Lâm Đồng, เวียดนาม"] = {}, -- capital [[Đà Lạt]]
-- [[South Central Coast]] region
["Bình Định, เวียดนาม"] = {}, -- capital [[Quy Nhon]]
["Bình Thuận, เวียดนาม"] = {}, -- capital [[Phan Thiết]]
["Khánh Hoà, เวียดนาม"] = {}, -- capital [[Nha Trang]]
["Khánh Hòa, เวียดนาม"] = {alias_of = "Khánh Hoà, เวียดนาม", display = true},
["Ninh Thuận, เวียดนาม"] = {}, -- capital [[Phan Rang–Tháp Chàm]]
["Phú Yên, เวียดนาม"] = {}, -- capital [[Tuy Hoà]]
["Quảng Nam, เวียดนาม"] = {}, -- capital [[Tam Kỳ]]
["Quảng Ngãi, เวียดนาม"] = {}, -- capital [[Quảng Ngãi]]
-- ["Da Nang"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hải Châu district]]
-- [[Southeast (Vietnam)|Southeast]] region
["Bà Rịa–Vũng Tàu, เวียดนาม"] = {}, -- capital [[Bà Rịa]]
["Bình Dương, เวียดนาม"] = {}, -- capital [[Thủ Dầu Một]]
["Bình Phước, เวียดนาม"] = {}, -- capital [[Đồng Xoài]]
["Đồng Nai, เวียดนาม"] = {}, -- capital [[Biên Hoà]]
["Tây Ninh, เวียดนาม"] = {}, -- capital [[Tây Ninh]]
-- ["Ho Chi Minh City"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']]
-- [[Mekong Delta]] region
["An Giang, เวียดนาม"] = {}, -- capital [[Long Xuyên]]
["Bạc Liêu, เวียดนาม"] = {}, -- capital [[Bạc Liêu]]
["Bến Tre, เวียดนาม"] = {}, -- capital [[Bến Tre]]
["Cà Mau, เวียดนาม"] = {}, -- capital [[Cà Mau]]
["Đồng Tháp, เวียดนาม"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]]
["Hậu Giang, เวียดนาม"] = {}, -- capital [[Vị Thanh]]
["Kiên Giang, เวียดนาม"] = {}, -- capital [[Rạch Giá]]
["Long An, เวียดนาม"] = {}, -- capital [[Tân An]]
["Sóc Trăng, เวียดนาม"] = {}, -- capital [[Sóc Trăng]]
["Tiền Giang, เวียดนาม"] = {}, -- capital [[Mỹ Tho]]
["Trà Vinh, เวียดนาม"] = {}, -- capital [[Trà Vinh]]
["Vĩnh Long, เวียดนาม"] = {}, -- capital [[Vĩnh Long]]
-- ["Can Tho"] = {placetype = {"เทศบาล", "นคร"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]]
}
-- provinces of Vietnam
export.vietnam_group = {
key_to_placename = make_key_to_placename(", เวียดนาม$"),
placename_to_key = make_placename_to_key(", เวียดนาม"),
default_container = "เวียดนาม",
default_placetype = "จังหวัด",
-- There may not be enough districts to subcategorize like this.
-- default_divs = "อำเภอ",
-- Wikipedia articles on these provinces are titled with the prefix 'จังหวัด' (province) before the bare name
default_wp = "จังหวัด%e",
data = export.vietnam_provinces,
}
-----------------------------------------------------------------------------------
-- City data --
-----------------------------------------------------------------------------------
export.australia_cities = {
["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration)
["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate])
["Canberra"] = {container = {key = "Australian Capital Territory, ออสเตรเลีย", placetype = "ดินแดน"}}, -- 510,641 (2024 estimate)
["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration)
["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate)
["Newcastle"] = {alias_of = "Newcastle, New South Wales"},
["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration)
["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration)
}
export.australia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", ออสเตรเลีย", "รัฐ"),
default_placetype = "นคร",
data = export.australia_cities,
}
export.brazil_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos)
["Sao Paulo"] = {alias_of = "São Paulo", display = true},
["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area)
["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000
["Recife"] = {container = "Pernambuco"}, -- 4,100,000
["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area)
["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000
["Brasilia"] = {alias_of = "Brasília", display = true},
["Fortaleza"] = {container = "Ceará"}, -- 3,825,000
["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000
["Curitiba"] = {container = "Paraná"}, -- 3,375,000
["Campinas"] = {container = "São Paulo"}, -- 3,250,000
["Goiânia"] = {container = "Goiás"}, -- 2,525,000
["Goiania"] = {alias_of = "Goiânia", display = true},
["Manaus"] = {container = "Amazonas"}, -- 2,275,000
["Belém"] = {container = "Pará"}, -- 2,200,000
["Belem"] = {alias_of = "Belém", display = true},
["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000
["Vitoria"] = {alias_of = "Vitória", display = true},
["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000
["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000
["Sao Luis"] = {alias_of = "São Luís", display = true},
["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000
["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000
["Florianopolis"] = {alias_of = "Florianópolis", display = true},
["Maceió"] = {container = "Alagoas"}, -- 1,220,000
["Maceio"] = {alias_of = "Maceió", display = true},
["João Pessoa"] = {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000
["Joao Pessoa"] = {alias_of = "João Pessoa", display = true},
["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000
["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true},
["Londrina"] = {container = "Paraná"}, -- 1,050,000
["Teresina"] = {container = "Piauí"}, -- 1,040,000
}
export.brazil_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", บราซิล", "รัฐ"),
default_placetype = "นคร",
data = export.brazil_cities,
}
export.canada_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton)
["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area)
["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area)
["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area)
["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area)
["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area)
["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census)
["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census)
["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census)
["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census)
}
export.canada_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด"),
default_placetype = "นคร",
data = export.canada_cities,
}
export.france_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration)
["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration)
["Lyons"] = {alias_of = "Lyon", display = true},
["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration)
["Marseilles"] = {alias_of = "Marseille", display = true},
["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration)
["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration)
["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration)
["Nice"] = {container = "Provence-Alpes-Côte d'Azur"},
["Nantes"] = {container = "Pays de la Loire"},
["Strasbourg"] = {container = "Grand Est"},
["Rennes"] = {container = "Brittany"},
}
export.france_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"),
default_placetype = "นคร",
data = export.france_cities,
}
export.germany_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area)
["Cologne"] = {container = "North Rhine-Westphalia"},
["Köln"] = {alias_of = "Cologne", display = true},
["Düsseldorf"] = {container = "North Rhine-Westphalia"},
["Dusseldorf"] = {alias_of = "Düsseldorf", display = true},
["Dortmund"] = {container = "North Rhine-Westphalia"},
["Essen"] = {container = "North Rhine-Westphalia"},
["Duisburg"] = {container = "North Rhine-Westphalia"},
["Berlin"] = {}, -- 4,700,000
["Frankfurt"] = {container = "Hesse"}, -- 3,225,000
["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer
["Hamburg"] = {}, -- 2,900,000
["Munich"] = {container = "Bavaria"}, -- 2,300,000
["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000
["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000
["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000
["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000
["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000
["Leipzig"] = {container = "Saxony"}, -- 1,080,000
["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000
["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias
["Bremen"] = {},
}
export.germany_cities_group = {
default_container = "เยอรมนี",
canonicalize_key_container = make_canonicalize_key_container(", เยอรมนี", "รัฐ"),
default_placetype = "นคร",
data = export.germany_cities,
}
export.india_cities = {
-- This lists the 65 metro areas per Demographia's 2023 estimates, as found in
-- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was
-- conducted in 2011, and the results are not accurate any more.
["Delhi"] = {container = {key = "Delhi, อินเดีย", placetype = "union territory"}}, -- 31,190,000
["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000
["Kolkata"] = {container = "West Bengal"}, -- 21,747,000
["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000
["Bengaluru"] = {alias_of = "Bangalore"},
["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000
["Hyderabad"] = {container = "Telangana"}, -- 9,797,000
["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000
["Pune"] = {container = "Maharashtra"}, -- 6,819,000
["Surat"] = {container = "Gujarat"}, -- 6,601,000
["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000
["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000
["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000
["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000
["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000
["Patna"] = {container = "Bihar"}, -- 3,331,000
["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000
["Kozhikode"] = {container = "Kerala"}, -- 3,049,000
["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000
["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000
["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000
["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000
["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000
["Prayagraj"] = {alias_of = "Allahabad"},
["Kochi"] = {container = "Kerala"}, -- 2,381,000
["Ludhiana"] = {container = "Punjab"}, -- 2,205,000
["Vadodara"] = {container = "Gujarat"}, -- 2,182,000
["Chandigarh"] = {container = {key = "Chandigarh, อินเดีย", placetype = "union territory"}}, -- 2,168,000
["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000
["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000
["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000
["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000
["Malappuram"] = {container = "Kerala"}, -- 1,868,000
["Nashik"] = {container = "Maharashtra"}, -- 1,810,000
["Asansol"] = {container = "West Bengal"}, -- 1,720,000
["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000
["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000
["Thrissur"] = {container = "Kerala"}, -- 1,578,000
["Kollam"] = {container = "Kerala"}, -- 1,576,000
["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000
["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000
["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000
["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000
["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"},
["Rajkot"] = {container = "Gujarat"}, -- 1,487,000
["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000
["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000
["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000
["Kannur"] = {container = "Kerala"}, -- 1,360,000
["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000
["Guwahati"] = {container = "Assam"}, -- 1,355,000
["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000
["Amritsar"] = {container = "Punjab"}, -- 1,313,000
["Mysore"] = {container = "Karnataka"}, -- 1,296,000
["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000
["Durg-Bhilainagar"] = {alias_of = "Bhilai"},
["Durg-Bhilai"] = {alias_of = "Bhilai"},
["Durg"] = {alias_of = "Bhilai"},
["Bhilainagar"] = {alias_of = "Bhilai"},
["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000
["Srinagar"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, -- 1,212,000
["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000
["Kota"] = {container = "Rajasthan"}, -- 1,172,000
["Jalandhar"] = {container = "Punjab"}, -- 1,165,000
["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000
["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000
["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000
["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000
["Jammu"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, -- 1,103,000
["Solapur"] = {container = "Maharashtra"}, -- 1,082,000
["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash
["Hubli"] = {alias_of = "Hubli-Dharwad"},
["Dharwad"] = {alias_of = "Hubli-Dharwad"},
["Puducherry"] = {container = {key = "Puducherry, อินเดีย", placetype = "union territory"}}, -- 1,024,000
["Pondicherry"] = {alias_of = "Puducherry", display = true},
-- satellite/secondary cities of metro area (none in citypopulation.de)
["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area
["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area
["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true},
["Kalyan"] = {alias_of = "Kalyan-Dombivli"},
["Dombivli"] = {alias_of = "Kalyan-Dombivli"},
["Dombivali"] = {alias_of = "Kalyan-Dombivli"},
["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area
["Vasai"] = {alias_of = "Vasai-Virar"},
["Virar"] = {alias_of = "Vasai-Virar"},
["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area
["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area
["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area
["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true},
}
export.india_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", อินเดีย", "รัฐ"),
default_placetype = "นคร",
data = export.india_cities,
}
export.indonesia_cities = {
-- cities where the city proper has more than 1,000,000 people as of the mid-2023 estimate
["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = {
{type = "ตำบล", container_parent_type = false},
}},
["Surabaya"] = {container = "East Java"},
["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area
["Bandung"] = {container = "West Java"},
["Medan"] = {container = "North Sumatra"},
["Depok"] = {container = "West Java"}, -- part of Jakarta metro area
["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Palembang"] = {container = "South Sumatra"},
["Semarang"] = {container = "Central Java"},
["Makassar"] = {container = "South Sulawesi"},
["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Batam"] = {container = "Riau Islands"},
["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area
["Pekanbaru"] = {container = "Riau"},
["Bandar Lampung"] = {container = "Lampung"},
-- other metro areas over 1,000,000 people
["Padang"] = {container = "West Sumatra"},
["Samarinda"] = {container = "East Kalimantan"},
["Malang"] = {container = "East Java"},
["Yogyakarta"] = {container = "Special Region of Yogyakarta"},
["Denpasar"] = {container = "Bali"},
["Cirebon"] = {container = "West Java"},
["Surakarta"] = {container = "Central Java"},
["Banjarmasin"] = {container = "South Kalimantan"},
["Tasikmalaya"] = {container = "West Java"},
}
export.indonesia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", อินโดนีเซีย", "จังหวัด"),
default_placetype = "นคร",
data = export.indonesia_cities,
}
export.italy_cities = {
-- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used
-- here, only gives estimates as of Jan 1, 2014.
["Milan"] = {container = "Lombardy"}, -- 6,623,798
["Naples"] = {container = "Campania"}, -- 5,294,546
["Rome"] = {container = "Lazio"}, -- 4,447,881
["Turin"] = {container = "Piedmont"}, -- 1,865,284
["Venice"] = {container = "Veneto"}, -- 1,645,900
["Florence"] = {container = "Tuscany"}, -- 1,485,030
["Bari"] = {container = "Apulia"}, -- 1,257,459
["Palermo"] = {container = "Sicily"}, -- 1,183,084
-- include a few just below 1,000,000 metro area that may be above it by now (depending on the definition).
["Catania"] = {container = "Sicily"}, -- 988,240
["Brescia"] = {container = "Lombardy"}, -- 924,090
["Genoa"] = {container = "Liguria"}, -- 861,318
}
export.italy_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Italy", "ภูมิภาค"),
default_placetype = "นคร",
data = export.italy_cities,
}
export.japan_cities = {
-- Population figures from [[w:List of cities in Japan]]. Metro areas from
-- [[w:List of metropolitan areas in Japan]].
["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])",
placetype = {"นคร", "prefecture"},
divs = {
{type = "special wards", container_parent_type = false},
{type = "cities", prep = "ใน"},
},
},
["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894
["Osaka"] = {container = "Osaka"}, -- 2,668,586
["Nagoya"] = {container = "Aichi"}, -- 2,283,289
-- FIXME, Hokkaido is handled specially.
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096
["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527
["Kobe"] = {container = "Hyōgo"}, -- 1,530,847
["Kyoto"] = {container = "Kyoto"}, -- 1,474,570
["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630
["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418
["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806
["Sendai"] = {container = "Miyagi"}, -- 1,029,552
-- the remaining cities are considered "central cities" in a 1,000,000+ metro area
-- (sometimes there is more than one central city in the area).
["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998
["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695
["Sakai"] = {container = "Osaka"}, -- 835,333
["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053
["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431
["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944
["Sagamihara"] = {container = "Kanagawa"}, -- 706,342
["Okayama"] = {container = "Okayama"}, -- 701,293
["Kumamoto"] = {container = "Kumamoto"}, -- 670,348
["Kagoshima"] = {container = "Kagoshima"}, -- 605,196
-- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka)
-- with population in the range 509k - 587k because not central cities in any
-- 1,000,000+ metro area.
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833
}
export.japan_cities_group = {
default_container = "ญี่ปุ่น",
canonicalize_key_container = make_canonicalize_key_container(", ญี่ปุ่น", "prefecture"),
default_placetype = "นคร",
data = export.japan_cities,
}
export.mexico_cities = {
["Mexico City"] = {}, -- its own state
["Monterrey"] = {container = "Nuevo León"},
["Guadalajara"] = {container = "Jalisco"},
["Puebla"] = {container = "Puebla", wp = "%l (city)"},
["Toluca"] = {container = "State of Mexico"},
["Tijuana"] = {container = "Baja California"},
-- Include the state in the category for León due to possible confusion with León, Spain.
["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"},
["León"] = {alias_of = "León, Guanajuato"},
["Leon"] = {alias_of = "León, Guanajuato", display = true},
["Querétaro"] = {container = "Querétaro", wp = "%l (city)"},
["Queretaro"] = {alias_of = "Querétaro", display = true},
["Ciudad Juárez"] = {container = "Chihuahua"},
["Juárez"] = {alias_of = "Ciudad Juárez"},
["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"},
["Torreón"] = {container = "Coahuila"},
["Torreon"] = {alias_of = "Torreón", display = true},
-- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or
-- Mérida, Venezuela.
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"},
["Mérida"] = {alias_of = "Mérida, Yucatán"},
["Merida"] = {alias_of = "Mérida, Yucatán", display = true},
["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"},
["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true},
["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"},
["Mexicali"] = {container = "Baja California"},
}
export.mexico_cities_group = {
default_container = "Mexico",
canonicalize_key_container = make_canonicalize_key_container(", Mexico", "รัฐ"),
default_placetype = "นคร",
data = export.mexico_cities,
}
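-- Illustration only (inert; kept inside a block comment so it does not affect the module). A minimal sketch of
-- how the three alias forms used in the tables above could be resolved, ASSUMING that a plain `alias_of` keeps
-- the name as written, `display = true` shows the canonical name instead, and a string `display` shows that
-- string. The function name `resolve_alias` and the sample table are hypothetical; the real lookup is done
-- elsewhere in the module and may differ.
--[==[
local sample = {
	["Ciudad Juárez"] = {container = "Chihuahua"},
	["Juárez"] = {alias_of = "Ciudad Juárez"},
	["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"},
	["Torreón"] = {container = "Coahuila"},
	["Torreon"] = {alias_of = "Torreón", display = true},
}

-- Returns the canonical key and the text to display for `name`, or nil if `name` is not in `data`.
local function resolve_alias(data, name)
	local spec = data[name]
	if not spec then
		return nil
	end
	if not spec.alias_of then
		return name, name -- already canonical
	end
	local display = name -- plain alias: keep what the editor wrote
	if spec.display == true then
		display = spec.alias_of -- display alias: show the canonical name
	elseif type(spec.display) == "string" then
		display = spec.display -- explicit display override
	end
	return spec.alias_of, display
end

print(resolve_alias(sample, "Torreon")) -- canonical key and display are both the accented form
print(resolve_alias(sample, "Juarez")) -- canonical key "Ciudad Juárez", displayed as "Juárez"
print(resolve_alias(sample, "Juárez")) -- canonical key "Ciudad Juárez", displayed as written
]==]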
export.nigeria_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability)
["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability)
["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability)
["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "federal territory"}}, -- 3,050,000 (unindicated; population of low reliability)
["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability)
["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability)
["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability)
["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability)
["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability)
["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability)
["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability)
["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; population of low reliability)
["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability)
["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability)
["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability)
}
export.nigeria_cities_group = {
default_container = "Nigeria",
canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "รัฐ"),
default_placetype = "นคร",
data = export.nigeria_cities,
}
export.pakistan_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area)
["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area)
["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad)
["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "federal territory"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi)
["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area)
["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area)
-- there is also Hyderabad in India (very confusing)
["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area)
["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"},
["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area)
["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area)
["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area)
["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area)
["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area)
}
export.pakistan_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "จังหวัด"),
default_placetype = "นคร",
data = export.pakistan_cities,
}
export.philippines_cities = {
-- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts.
-- Other cities outside Metro Manila skipped as not central city in their urban area.
["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}},
-- Don't display-canonicalize Foo to Foo City as it may make the display weird.
["Quezon"] = {alias_of = "Quezon City"},
["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}},
["Davao City"] = {container = "Davao del Sur"},
["Davao"] = {alias_of = "Davao City"},
["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}},
["Zamboanga City"] = {container = "Zamboanga del Sur"},
["Zamboanga"] = {alias_of = "Zamboanga City"},
["Cebu City"] = {container = "Cebu"},
["Cebu"] = {alias_of = "Cebu City"},
["Antipolo"] = {container = "Rizal"},
["Cagayan de Oro"] = {container = "Misamis Oriental"},
["Dasmariñas"] = {container = "Cavite"},
["Dasmarinas"] = {alias_of = "Dasmariñas", display = true},
["General Santos"] = {container = "South Cotabato"},
["San Jose del Monte"] = {container = "Bulacan"},
["Bacolod"] = {container = "Negros Occidental"},
["Calamba"] = {container = "Laguna", wp = "%l, %c"},
["Angeles"] = {container = "Pampanga", wp = "Angeles City"},
["Angeles City"] = {alias_of = "Angeles"},
["Iloilo City"] = {container = "Iloilo"},
["Iloilo"] = {alias_of = "Iloilo City"},
}
export.philippines_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Philippines", "จังหวัด"),
default_placetype = "นคร",
data = export.philippines_cities,
}
export.russia_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Moscow"] = {}, -- 18,800,000 (Agglomeration)
["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration)
["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration)
["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration)
["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration)
["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration)
["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration)
["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration)
["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true},
["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration)
["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration)
["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration)
["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration)
["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration)
["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration)
["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration)
["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration)
["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration)
}
export.russia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"),
default_container = "Russia",
default_placetype = "นคร",
data = export.russia_cities,
}
export.saudi_arabia_cities = {
-- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are
-- metro, urban or city proper figures.
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jedda"] = {alias_of = "Jeddah", display = true},
["Jiddah"] = {alias_of = "Jeddah", display = true},
["Jidda"] = {alias_of = "Jeddah", display = true},
["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Makkah"] = {alias_of = "Mecca", display = true},
["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City)
["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true},
}
export.saudi_arabia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "จังหวัด"),
default_placetype = "นคร",
data = export.saudi_arabia_cities,
}
export.south_korea_cities = {
-- None of the cities listed is associated with any county.
["Seoul"] = {},
["Busan"] = {},
["Incheon"] = {},
["Daegu"] = {},
["Daejeon"] = {},
["Gwangju"] = {},
["Ulsan"] = {},
}
export.south_korea_cities_group = {
default_container = "South Korea",
canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "จังหวัด"),
default_placetype = "นคร",
data = export.south_korea_cities,
}
export.spain_cities = {
["Madrid"] = {container = "Community of Madrid"},
["Barcelona"] = {container = "Catalonia"},
["Valencia"] = {container = "Valencia"},
["Seville"] = {container = "Andalusia"},
["Bilbao"] = {container = "Basque Country"},
}
export.spain_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"),
default_placetype = "นคร",
data = export.spain_cities,
}
export.taiwan_cities = {
["New Taipei City"] = {},
["New Taipei"] = {alias_of = "New Taipei City", display = true},
["Taichung"] = {},
["Kaohsiung"] = {wp = "%l, ไต้หวัน"},
["Taipei"] = {},
["Taoyuan"] = {},
["Tainan"] = {},
-- these last three are not special municipalities
["Chiayi"] = {placetype = "นคร"},
["Hsinchu"] = {placetype = "นคร"},
["Keelung"] = {placetype = "นคร"},
}
export.taiwan_cities_group = {
placename_to_key = false, -- don't add ", ไต้หวัน" to make the key
canonicalize_key_container = make_canonicalize_key_container(", ไต้หวัน", "เทศมณฑล"),
default_container = "ไต้หวัน",
default_placetype = {"special municipality", "เทศบาล", "นคร"},
default_is_city = true,
default_divs = {"อำเภอ"},
data = export.taiwan_cities,
}
-- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct,
-- everything else will be figured out.
export.united_kingdom_cities = {
["London"] = {container = "Greater London"},
["Manchester"] = {container = "Greater Manchester"},
["Birmingham"] = {container = "West Midlands"},
["Liverpool"] = {container = "Merseyside"},
["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}},
["Leeds"] = {container = "West Yorkshire"},
["Newcastle upon Tyne"] = {container = "Tyne and Wear"},
["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
["Bristol"] = {container = {key = "England", placetype = "constituent country"}},
["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}},
["Portsmouth"] = {container = "Hampshire"},
["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}},
-- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]]
["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}},
["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"},
}
export.united_kingdom_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", England", "เทศมณฑล"),
default_placetype = "นคร",
data = export.united_kingdom_cities,
}
export.united_states_cities = {
-- top 50 CSAs by population, with the top and sometimes 2nd or 3rd city listed
["New York City"] = {container = "New York", wp = "%l", divs = {
{type = "boroughs", container_parent_type = false},
}},
-- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York).
["New York"] = {alias_of = "New York City"},
["Newark"] = {container = "New Jersey"},
["Los Angeles"] = {container = "California", wp = "%l"},
["Long Beach"] = {container = "California"},
["Riverside"] = {container = "California"},
["Chicago"] = {container = "Illinois", wp = "%l"},
["Washington, D.C."] = {wp = "%l"},
["Washington, DC"] = {alias_of = "Washington, D.C.", display = true},
["Washington D.C."] = {alias_of = "Washington, D.C.", display = true},
["Washington DC"] = {alias_of = "Washington, D.C.", display = true},
-- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of
-- Columbia holonym).
["Washington"] = {alias_of = "Washington, D.C."},
["Baltimore"] = {container = "Maryland", wp = "%l"},
-- to avoid conflict with San Jose in Costa Rica
["San Jose, California"] = {container = "California"},
["San Jose"] = {alias_of = "San Jose, California"},
["San Francisco"] = {container = "California", wp = "%l"},
["Oakland"] = {container = "California"},
["Boston"] = {container = "Massachusetts", wp = "%l"},
["Providence"] = {container = "Rhode Island"},
["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Fort Worth"] = {container = "Texas"},
["Philadelphia"] = {container = "Pennsylvania", wp = "%l"},
["Houston"] = {container = "Texas", wp = "%l"},
["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"},
["Atlanta"] = {container = "Georgia", wp = "%l"},
["Detroit"] = {container = "Michigan", wp = "%l"},
["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"},
["Mesa"] = {container = "Arizona"},
["Seattle"] = {container = "Washington", wp = "%l"},
["Orlando"] = {container = "Florida"},
["Minneapolis"] = {container = "Minnesota", wp = "%l"},
["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"},
["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"},
["Portland"] = {container = "Oregon"},
["Tampa"] = {container = "Florida"},
["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"},
["Saint Louis"] = {alias_of = "St. Louis", display = true},
["Charlotte"] = {container = "North Carolina"},
["Sacramento"] = {container = "California"},
["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"},
["Salt Lake City"] = {container = "Utah", wp = "%l"},
["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Columbus"] = {container = "Ohio"},
["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"},
["Indianapolis"] = {container = "Indiana", wp = "%l"},
["Las Vegas"] = {container = "Nevada", wp = "%l"},
["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Austin"] = {container = "Texas"},
["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"},
["Raleigh"] = {container = "North Carolina"},
["Nashville"] = {container = "Tennessee"},
["Virginia Beach"] = {container = "Virginia"},
["Norfolk"] = {container = "Virginia"},
["Greensboro"] = {container = "North Carolina"},
["Winston-Salem"] = {container = "North Carolina"},
["Jacksonville"] = {container = "Florida"},
["New Orleans"] = {container = "Louisiana", wp = "%l"},
["Louisville"] = {container = "Kentucky"},
["Greenville"] = {container = "South Carolina"},
["Hartford"] = {container = "Connecticut"},
["Oklahoma City"] = {container = "Oklahoma", wp = "%l"},
["Grand Rapids"] = {container = "Michigan"},
["Memphis"] = {container = "Tennessee"},
["Birmingham, Alabama"] = {container = "Alabama"},
["Birmingham"] = {alias_of = "Birmingham, Alabama"},
["Fresno"] = {container = "California"},
["Richmond"] = {container = "Virginia"},
["Harrisburg"] = {container = "Pennsylvania"},
-- any major city of the top 50 MSAs missed by the previous list
["Buffalo"] = {container = "New York"},
-- any of the top 50 cities by city population missed by the previous lists
["El Paso"] = {container = "Texas"},
["Albuquerque"] = {container = "New Mexico"},
["Tucson"] = {container = "Arizona"},
["Colorado Springs"] = {container = "Colorado"},
["Omaha"] = {container = "Nebraska"},
["Tulsa"] = {container = "Oklahoma"},
-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}
export.united_states_cities_group = {
default_container = "สหรัฐอเมริกา",
canonicalize_key_container = make_canonicalize_key_container(", USA", "รัฐ"),
default_placetype = "นคร",
default_wp = "%l, %c",
data = export.united_states_cities,
}
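-- Illustration only (inert; kept inside a block comment so it does not affect the module). A minimal sketch of
-- how the `wp` / `default_wp` format strings above could be expanded, ASSUMING `%l` stands for the location
-- name and `%c` for (roughly) the container name. The helper name `expand_wp` is hypothetical; the actual
-- expansion happens elsewhere in the module and may treat containers differently.
--[==[
local function expand_wp(spec, location, container)
	-- Escape "%" in replacement text, since string.gsub treats it specially there.
	local function esc(s)
		return (s:gsub("%%", "%%%%"))
	end
	local result = spec:gsub("%%l", esc(location)):gsub("%%c", esc(container))
	return result
end

print(expand_wp("%l, %c", "Newark", "New Jersey")) --> Newark, New Jersey (group default_wp)
print(expand_wp("%l", "Chicago", "Illinois")) --> Chicago (per-city override)
]==]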
export.new_york_boroughs = {
["Bronx"] = {the = true, wp = "The Bronx"},
["Brooklyn"] = {},
["Manhattan"] = {},
["Queens"] = {},
["Staten Island"] = {},
}
export.new_york_boroughs_group = {
default_container = {key = "New York City", placetype = "นคร"},
default_placetype = "borough",
default_is_city = true,
data = export.new_york_boroughs,
}
export.vietnam_cities = {
-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
["Saigon"] = {alias_of = "Ho Chi Minh City"},
["Hanoi"] = {}, -- 7,350,000 (Agglomeration)
["Da Nang"] = {}, -- 1,500,000 (Agglomeration)
["Danang"] = {alias_of = "Da Nang", display = true},
["Haiphong"] = {}, -- 1,450,000 (Agglomeration)
["Hai Phong"] = {alias_of = "Haiphong", display = true},
-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
-- meaning it is directly under its province as opposed to being contained in a district.
["Bien Hoa"] = {placetype = "นคร", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia)
["Biên Hòa"] = {alias_of = "Bien Hoa", display = true},
["Biên Hoà"] = {alias_of = "Bien Hoa", display = true},
-- These two are not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are
-- both province-level municipalities and close to the 1,000,000 mark.
["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]]
["Cần Thơ"] = {alias_of = "Can Tho", display = true},
["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); capital [[Thuận Hóa district]]
["Huế"] = {alias_of = "Hue", display = true},
}
export.vietnam_cities_group = {
placename_to_key = false, -- don't add ", เวียดนาม" to make the key
default_container = "เวียดนาม",
canonicalize_key_container = make_canonicalize_key_container(", เวียดนาม", "จังหวัด"),
-- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of
-- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct
-- known locations.
default_placetype = {"เทศบาล", "นคร"},
default_is_city = true,
-- There may not be enough districts to subcategorize like this.
-- default_divs = "อำเภอ",
data = export.vietnam_cities,
}
export.misc_cities = {
------------------ Africa -------------------
-- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from
-- [[w:List of urban areas in Africa by population]].
["Algiers"] = {container = "แอลจีเรีย"}, -- 4,325,000 (Consolidated Urban Area)
["Oran"] = {container = "แอลจีเรีย"}, -- 1,640,000 (Consolidated Urban Area)
["Luanda"] = {container = "แองโกลา"}, -- 9,650,000 (Urban Area)
["Benguela"] = {container = "แองโกลา"}, -- 1,420,000 (Urban Area)
["Cotonou"] = {container = "เบนิน"}, -- 2,150,000 (Agglomeration)
["Ouagadougou"] = {container = "บูร์กินาฟาโซ"}, -- 3,425,000 (Agglomeration)
["Bobo-Dioulasso"] = {container = "บูร์กินาฟาโซ"}, -- 1,100,000 (Agglomeration)
["Bujumbura"] = {container = "บุรุนดี"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia)
["Yaoundé"] = {container = "แคเมอรูน"}, -- 3,975,000 (City)
["Yaounde"] = {alias_of = "Yaoundé", display = true},
["Douala"] = {container = "แคเมอรูน"}, -- 3,900,000 (City)
["Bangui"] = {container = "สาธารณรัฐแอฟริกากลาง"}, -- 1,680,000 (Agglomeration)
["N'Djamena"] = {container = "ชาด"}, -- 1,950,000 (City)
["Ndjamena"] = {alias_of = "N'Djamena", display = true},
["Kinshasa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 16,300,000 (City; population of low reliability)
["Lubumbashi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,875,000 (City; population of low reliability)
["Mbuji-Mayi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,500,000 (City; population of low reliability)
["Kananga"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,370,000 (City; population of low reliability)
["Kisangani"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,300,000 (City; population of low reliability)
["Bukavu"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,100,000 (City; population of low reliability)
["Goma"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,010,000 (City; population of low reliability)
["Tshikapa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de)
["Cairo"] = {container = "อียิปต์"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima)
["Alexandria"] = {container = "อียิปต์"}, -- 6,250,000 (Agglomeration)
["Giza"] = {container = "อียิปต์"}, -- 4,458,135 (2023 from citypopulation.de)
["Shubra El Kheima"] = {container = "อียิปต์"}, -- 1,240,239 (2021 from citypopulation.de)
["Asmara"] = {container = "เอริเทรีย"}, -- 1,090,000 (City; population of low reliability)
["Asmera"] = {alias_of = "Asmara", display = true},
["Addis Ababa"] = {container = "เอธิโอเปีย"}, -- 4,825,000 (Agglomeration)
["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration)
["Accra"] = {container = "กานา"}, -- 6,800,000 (Agglomeration)
["Kumasi"] = {container = "กานา"}, -- 2,900,000 (Agglomeration)
["Conakry"] = {container = "กินี"}, -- 2,975,000 (Consolidated Urban Area)
["Abidjan"] = {container = "โกตดิวัวร์"}, -- 7,050,000 (Agglomeration)
["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated)
["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City)
["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area)
["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City)
["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration)
["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City)
["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "ภูมิภาค"}}, -- 4,450,000 (Municipality (urban population))
["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "ภูมิภาค"}}, -- 2,125,000 (Municipality (urban population))
["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "ภูมิภาค"}}, -- 1,410,000 (Municipality (urban population))
["Tanger"] = {alias_of = "Tangier", display = true},
["Tangiers"] = {alias_of = "Tangier", display = true},
["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "ภูมิภาค"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population))
["Fes"] = {alias_of = "Fez", display = true},
["Fès"] = {alias_of = "Fez", display = true},
["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "ภูมิภาค"}}, -- 1,270,000 (Municipality (urban population))
["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "ภูมิภาค"}}, -- 1,140,000 (Municipality (urban population))
["Marrakech"] = {alias_of = "Marrakesh", display = true},
["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration)
["Niamey"] = {container = "Niger"}, -- 1,530,000 (City)
["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration)
["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City)
["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population))
["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration)
["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration)
["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration)
["Mogadishu"] = {container = "โซมาเลีย"}, -- 2,250,000 (unindicated; population of low reliability)
["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.)
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "จังหวัด"}}, -- 5,100,000 (Consolidated Urban Area)
["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "จังหวัด"}}, -- 3,900,000 (Consolidated Urban Area)
["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 2,921,488 (2011 census)
["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "จังหวัด"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area)
["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias
["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability)
["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration)
["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration)
["Mwanza City"] = {alias_of = "Mwanza", display = true},
["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration)
["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration)
["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated)
["Lome"] = {alias_of = "Lomé", display = true},
["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population))
["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population))
["Soussa"] = {alias_of = "Sousse", display = true},
["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated)
["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area)
["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration)
------------------ Asia -------------------
-- sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Kabul"] = {container = "อัฟกานิสถาน"}, -- 5,250,000 (Agglomeration)
["Baku"] = {container = "อาเซอร์ไบจาน"}, -- 3,725,000 (Administrative Area (urban population))
["Manama"] = {container = "บาห์เรน"}, -- 1,560,000 (unindicated)
["Dhaka"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 23,100,000 (Agglomeration)
["Dacca"] = {alias_of = "Dhaka", display = true},
["Chittagong"] = {container = {key = "Chittagong Division, บังกลาเทศ", placetype = "division"}}, -- 5,050,000 (Agglomeration)
["Gazipur"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
["Khulna"] = {container = {key = "Khulna Division, บังกลาเทศ", placetype = "division"}}, -- 1,210,000 (Agglomeration)
["Phnom Penh"] = {container = "กัมพูชา"}, -- 2,925,000 (Agglomeration)
["Tehran"] = {container = {key = "Tehran, อิหร่าน", placetype = "จังหวัด"}}, -- 16,800,000 (Agglomeration)
["Teheran"] = {alias_of = "Tehran", display = true},
["Mashhad"] = {container = {key = "Razavi Khorasan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,475,000 (Agglomeration)
["Mashad"] = {alias_of = "Mashhad", display = true},
["Meshhed"] = {alias_of = "Mashhad", display = true},
["Meshed"] = {alias_of = "Mashhad", display = true},
["Isfahan"] = {container = {key = "Isfahan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,425,000 (Agglomeration)
["Esfahan"] = {alias_of = "Isfahan", display = true},
["Tabriz"] = {container = {key = "East Azerbaijan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,970,000 (Agglomeration)
["Shiraz"] = {container = {key = "Fars, อิหร่าน", placetype = "จังหวัด"}}, -- 1,950,000 (Agglomeration)
["Ahvaz"] = {container = {key = "Khuzestan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,550,000 (Agglomeration)
["Qom"] = {container = {key = "Qom, อิหร่าน", placetype = "จังหวัด"}}, -- 1,450,000 (City)
["Kermanshah"] = {container = {key = "Kermanshah, อิหร่าน", placetype = "จังหวัด"}}, -- 1,130,000 (City)
["Baghdad"] = {container = "อิรัก"}, -- 7,800,000 (Administrative Area (urban population))
["Basra"] = {container = "อิรัก"}, -- 1,710,000 (Administrative Area (urban population))
["Mosul"] = {container = "อิรัก"}, -- 1,550,000 (Administrative Area (urban population))
["Erbil"] = {container = "อิรัก"}, -- 1,220,000 (Administrative Area (urban population))
["Kirkuk"] = {container = "อิรัก"}, -- 1,160,000 (Administrative Area (urban population))
["Najaf"] = {container = "อิรัก"}, -- 1,050,000 (Administrative Area (urban population))
["Tel Aviv"] = {container = "อิสราเอล"}, -- 3,000,000 (Agglomeration)
-- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a
-- [[w:corpus separatum]], so put the container as "เอเชีย" and list Israel and Palestine as additional parents for
-- categorization purposes.
["Jerusalem"] = {container = {key = "เอเชีย", placetype = "ทวีป"},
addl_parents = {"อิสราเอล", "Palestine"}}, -- 1,080,000 (Agglomeration)
["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated)
["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated)
["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration)
["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize
["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration)
["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration)
["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration)
["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration)
["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability)
-- Kuala Lumpur is a federal capital city, not in any state
["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration)
-- there are various George Towns and Georgetowns
["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = "รัฐ"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration)
["George Town"] = {alias_of = "George Town, Malaysia"},
["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City)
["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true},
["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population))
["Rangoon"] = {alias_of = "Yangon", display = true},
["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population))
["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration)
-- Pyongyang is a directly governed city, not in any province
["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population))
["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration)
["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated)
["Gaza City"] = {alias_of = "Gaza"},
["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration)
["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated)
["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability)
["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability)
["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City)
["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration)
-- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia
-- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]]
["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "จังหวัด"}},
["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "จังหวัด"}}, -- 1,570,000 (Agglomeration; including Pattaya)
-- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021;
-- second source is citypopulation.de reference date 2025-01-01.
["Istanbul"] = {placetype = {"นคร", "จังหวัด"}, divs = {"อำเภอ"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration)
["İstanbul"] = {alias_of = "Istanbul", display = true},
["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "จังหวัด"}}, -- 5.15 million; 5,200,000 (Agglomeration)
["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "จังหวัด"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration)
["İzmir"] = {alias_of = "Izmir", display = true},
["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "จังหวัด"}}, -- 2.02 million; 2,200,000 (Agglomeration)
["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "จังหวัด"}}, -- 1.77 million; 1,780,000 (Agglomeration)
["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "จังหวัด"}}, -- 1.71 million; 1,750,000 (Agglomeration)
["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "จังหวัด"}}, -- 1.3 million; 1,400,000 (Agglomeration)
["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "จังหวัด"}}, -- 1.35 million; 1,390,000 (Agglomeration)
["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "จังหวัด"}}, -- 1.07 million; 1,100,000 (Agglomeration)
-- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not
-- display-canonicalize to the Turkish form Diyarbakır.
["Diyarbakir"] = {alias_of = "Diyarbakır"},
["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "จังหวัด"}}, -- 1.03 million; 1,060,000 (Agglomeration)
["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration)
["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah)
["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City)
["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai)
["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated)
["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability)
["Sana'a"] = {alias_of = "Sanaa", display = true},
["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia)
------------------ Europe or Europe-like (Caucasus etc.) ---------------------
["Yerevan"] = {container = "อาร์มีเนีย"}, -- 1,520,000 (Agglomeration)
["Vienna"] = {container = "ออสเตรีย"}, -- 2,375,000 (Agglomeration)
["Minsk"] = {container = "เบลารุส"}, -- 2,100,000 (unindicated)
["Brussels"] = {container = "เบลเยียม"}, -- 2,800,000 (Consolidated Urban Area)
["Antwerp"] = {container = "เบลเยียม"}, -- 1,270,000 (Consolidated Urban Area)
["Sofia"] = {container = "บัลแกเรีย"}, -- 1,260,000 (Agglomeration)
["Zagreb"] = {container = "โครเอเชีย"},
["Prague"] = {container = "สาธารณรัฐเช็ก"}, -- 1,470,000 (Agglomeration)
["Brno"] = {container = "สาธารณรัฐเช็ก"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office)
["Olomouc"] = {container = "สาธารณรัฐเช็ก"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms)
["Copenhagen"] = {container = "เดนมาร์ก"}, -- 1,800,000 (Consolidated Urban Area)
["Helsinki"] = {container = {key = "Uusimaa, ฟินแลนด์", placetype = "ภูมิภาค"}}, -- 1,560,000 (Consolidated Urban Area)
["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 (Agglomeration)
["Athens"] = {container = "กรีซ"},
["Thessaloniki"] = {container = "กรีซ"},
["Budapest"] = {container = "ฮังการี"},
-- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region"
["Dublin"] = {container = {key = "County Dublin, ไอร์แลนด์", placetype = "เทศมณฑล"}},
["Riga"] = {container = "Latvia"},
["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "จังหวัด"}},
["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}},
["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}},
-- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it.
["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "ภูมิภาค"}},
["Oslo"] = {container = {key = "Oslo, Norway", placetype = "เทศมณฑล"}},
["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}},
["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Krakow" without accent.
["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"},
["Kraków"] = {alias_of = "Krakow", display = true},
["Cracow"] = {alias_of = "Krakow", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent.
["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}},
["Gdansk"] = {alias_of = "Gdańsk", display = true},
["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}},
["Poznan"] = {alias_of = "Poznań", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Lodz" without accents.
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"},
["Łódź"] = {alias_of = "Lodz", display = true},
["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}},
["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}},
["Oporto"] = {alias_of = "Porto", display = true},
["Bucharest"] = {container = "Romania"},
["Belgrade"] = {container = "Serbia"},
["Stockholm"] = {container = "Sweden"},
["Zurich"] = {container = "Switzerland"},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Zurich" without umlaut.
--- Even Wikipedia uses the form without umlaut.
["Zürich"] = {alias_of = "Zurich", display = true},
["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast
-- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common.
["Kiev"] = {alias_of = "Kyiv"},
["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}},
["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"},
-- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement.
["Odesa"] = {alias_of = "Odessa"},
------------------ North America, South America ---------------------
-- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01);
-- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data;
-- Wikipedia city limits figures from [[w:List of largest cities in the Americas]].
["Buenos Aires"] = {container = "อาร์เจนตินา"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia)
["Córdoba, Argentina"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia)
-- to avoid confusion with Córdoba in Spain
["Córdoba"] = {alias_of = "Córdoba, Argentina"},
["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"},
["Rosario"] = {container = "อาร์เจนตินา", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia)
["Mendoza"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area)
["San Miguel de Tucumán"] = {container = "อาร์เจนตินา"}, -- 1,110,000 (Consolidated Urban Area)
["Tucumán"] = {alias_of = "San Miguel de Tucumán"},
["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"},
["Santa Cruz de la Sierra"] = {container = "โบลิเวีย"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia)
["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"},
["La Paz"] = {container = "โบลิเวีย"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz)
["El Alto"] = {container = "โบลิเวีย"},
["Cochabamba"] = {container = "โบลิเวีย"}, -- 1,280,000 (Consolidated Urban Area)
["Santiago"] = {container = "ชิลี"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? per Wikipedia)
["Valparaíso"] = {container = "ชิลี"}, -- 1,060,000 (Consolidated Urban Area)
["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area)
["Bogotá"] = {container = "โคลอมเบีย"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia)
["Bogota"] = {alias_of = "Bogotá", display = true},
["Medellín"] = {container = "โคลอมเบีย"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia)
["Medellin"] = {alias_of = "Medellín", display = true},
["Cali"] = {container = "โคลอมเบีย"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia)
["Barranquilla"] = {container = "โคลอมเบีย"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia)
["Bucaramanga"] = {container = "โคลอมเบีย"}, -- 1,380,000 (Agglomeration)
["Cartagena, Colombia"] = {container = "โคลอมเบีย", wp = "%l, %c"}, -- 1,250,000 (Agglomeration)
-- to avoid confusion with Cartagena, Spain
["Cartagena"] = {alias_of = "Cartagena, Colombia"},
["Cúcuta"] = {container = "โคลอมเบีย"}, -- 1,130,000 (Agglomeration)
["Cucuta"] = {alias_of = "Cúcuta", display = true},
-- to avoid conflict with San Jose, California
["San José, Costa Rica"] = {container = "คอสตาริกา", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia)
["San José"] = {alias_of = "San José, Costa Rica"},
["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME
["Havana"] = {container = "คิวบา"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia)
["Santo Domingo"] = {container = "สาธารณรัฐโดมินิกัน"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia)
["Guayaquil"] = {container = "เอกวาดอร์"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia)
["Quito"] = {container = "เอกวาดอร์"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? per Wikipedia)
["San Salvador"] = {container = "เอลซัลวาดอร์"}, -- 1,580,000 (Municipality (urban population))
["Guatemala City"] = {container = "กัวเตมาลา"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia)
["Port-au-Prince"] = {container = "เฮติ"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia)
["San Pedro Sula"] = {container = "ฮอนดูรัส"}, -- 1,330,000 (Consolidated Urban Area)
["Tegucigalpa"] = {container = "ฮอนดูรัส"}, -- 1,220,000 (Urban Area)
["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area)
["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area)
["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population))
["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia)
["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration)
["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area)
["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia)
["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia)
["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? per Wikipedia)
-- to avoid confusion with Valencia (city and autonomous community of Spain)
["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area)
["Valencia"] = {alias_of = "Valencia, Venezuela"},
["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area)
["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area)
}
export.misc_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"),
default_placetype = "นคร",
data = export.misc_cities,
}
--[==[ var:
List of all known locations, in groups. The first group lists continents and continental regions, followed by three
groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and
dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities
(administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United
Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the
hundreds).
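
A minimal sketch of how such a list can be sanity-checked, assuming (as the module's introduction states) that canonical
keys must be unique across all groups while alias keys only need to be unique within their own group. The helper
`find_duplicate_canonical_keys` and the toy groups below are hypothetical and are not part of this module:

```lua
-- Toy groups; each has only the `data` field that the check needs.
local toy_english_cities = {data = {
	["Birmingham"] = {},
}}
local toy_us_cities = {data = {
	["Birmingham, Alabama"] = {container = "Alabama"},
	["Birmingham"] = {alias_of = "Birmingham, Alabama"}, -- alias key, not canonical
}}
local toy_clashing_group = {data = {
	["Birmingham"] = {}, -- second canonical use of the same key
}}

-- Hypothetical helper: return a sorted list of canonical keys that occur in more than one group.
local function find_duplicate_canonical_keys(locations)
	local seen = {} -- canonical key -> index of the first group that defines it
	local duplicates = {}
	for i, group in ipairs(locations) do
		for key, spec in pairs(group.data) do
			-- Alias specs are skipped: they may legitimately repeat across groups.
			if not spec.alias_of then
				if seen[key] then
					table.insert(duplicates, key)
				else
					seen[key] = i
				end
			end
		end
	end
	table.sort(duplicates)
	return duplicates
end
```

With only the first two toy groups the result is empty, because `Birmingham` is canonical in one group and merely an
alias in the other; adding the third toy group makes the helper report `Birmingham`.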
]==]
export.locations = {
export.continents_group,
export.countries_group,
export.country_like_entities_group,
export.former_countries_group,
export.australia_group,
export.austria_group,
export.bangladesh_group,
export.brazil_group,
export.canada_group,
export.china_group,
export.china_prefecture_level_cities_group,
export.china_prefecture_level_cities_group_2,
export.egypt_group,
export.finland_group,
export.france_group,
export.france_departments_group,
export.germany_group,
export.greece_group,
export.india_group,
export.indonesia_group,
export.iran_group,
export.ireland_group,
export.italy_group,
export.japan_group,
export.laos_group,
export.lebanon_group,
export.malaysia_group,
export.malta_group,
export.mexico_group,
export.moldova_group,
export.morocco_group,
export.netherlands_group,
export.new_zealand_group,
export.nigeria_group,
export.north_korea_group,
export.norway_group,
export.pakistan_group,
export.philippines_group,
export.poland_group,
export.portugal_group,
export.romania_group,
export.russia_group,
export.saudi_arabia_group,
export.south_africa_group,
export.south_korea_group,
export.spain_group,
export.taiwan_group,
export.thailand_group,
export.turkey_group,
export.ukraine_group,
export.united_kingdom_group,
export.united_states_group,
export.england_group,
export.northern_ireland_group,
export.scotland_group,
export.wales_group,
export.vietnam_group,
export.australia_cities_group,
export.brazil_cities_group,
export.canada_cities_group,
export.france_cities_group,
export.germany_cities_group,
export.india_cities_group,
export.indonesia_cities_group,
export.italy_cities_group,
export.japan_cities_group,
export.mexico_cities_group,
export.nigeria_cities_group,
export.pakistan_cities_group,
export.philippines_cities_group,
export.russia_cities_group,
export.saudi_arabia_cities_group,
export.south_korea_cities_group,
export.spain_cities_group,
export.taiwan_cities_group,
export.united_kingdom_cities_group,
export.united_states_cities_group,
export.new_york_boroughs_group,
export.vietnam_cities_group,
export.misc_cities_group,
}
return export
5720698
5720697
2026-04-21T01:41:27Z
OctraBot
3198
5720698
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true to force category generation even on non-mainspace pages
local m_table = require("Module:table")
local string_utilities_module = "Module:string utilities"
local en_utilities_module = "Module:en-utilities"
local insert = table.insert
local concat = table.concat
local dump = mw.dumpObject
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
--[==[ intro:
This module contains data on all known locations, along with some lower-level code to process them (higher-level
known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using
mw.loadData().
===Location data===
'''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]],
especially the section `More about known locations`.'''
The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations
and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are
states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table''
that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and
defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data
table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given
location is generally described by three values: (a) the group metadata table for the group the location is part of; (b)
the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all
locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location
and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()`
function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the
arguments to many functions.
In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must
be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases
for a given location and the alias keys only need to be unique within a particular group data table, not across all
groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another
group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations,
canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in
New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias
key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for
canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the
location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys
are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have
per-group defaults, but only global defaults.
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it
must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare
category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding
bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys:
* Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories)
and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified
placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and
placenames, which is critical to understand when working with location data.) This also applies to constituent
countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such
as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena,
Ascension and Tristan da Cunha).
* Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative
divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or
ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if
different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above.
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and
Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`,
`Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name
in Spain, even though none of those cities are large enough to be included as known locations in this module. (The
cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.)
* Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent
territories, use a qualified key that contains the name of the country or constituent country in it, e.g.
`Normandy, ฝรั่งเศส` (a region), `Calvados, ฝรั่งเศส` (a department in the region of Normandy), `Herefordshire, England`
(a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, ฟินแลนด์` (a region),
`Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, ไอร์แลนด์` (a county) and
`New York, USA` (a state). As shown in these various examples, (a) first and second-level divisions are sometimes both
included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent
country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States
or United States of America); (c) the word `the` is not normally included in the key even if the location is normally
preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this),
except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates
an apparent redundancy, as with `Central Finland, ฟินแลนด์`; and (e) sometimes the placetype is included in the key, as
with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several
other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on
per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in
Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "State", "Oblast" etc. in their name
because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and
counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County"
preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article
naming scheme for a given administrative division is a strong clue as to how the division is normally referred to,
and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and
Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. North and South
Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.)
As mentioned above, associated with canonical keys in the group data table are location specs, which are objects
containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''.
Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that
differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an
uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This
copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table
into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a
given location property. (The initialization process also does more transformations in a few cases, noted below.) Note
that the default value of a given property is stored under a key in the group metadata table that is preceded by the
string `default_`; for example, the default value corresponding to the `placetype` property of a given location is
specified in the `default_placetype` key in the group metadata table.
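The default-copying step described above can be sketched roughly as follows. This is a minimal illustration, not the
actual `initialize_spec()` code: the helper name `apply_group_defaults` and the `initialized` marker field are
assumptions made for the example, and the extra transformations performed by the real function are omitted.

```lua
-- Hypothetical sketch: copy group-level `default_*` values into a location spec.
local function apply_group_defaults(group, spec)
	if spec.initialized then
		return spec -- idempotent: an already-initialized spec is left alone
	end
	for group_key, value in pairs(group) do
		-- Only string keys of the form `default_<property>` are treated as defaults.
		local prop = type(group_key) == "string" and group_key:match("^default_(.+)$")
		if prop and spec[prop] == nil then
			spec[prop] = value -- the location did not override this property
		end
	end
	spec.initialized = true -- assumed marker field, for illustration only
	return spec
end

-- Usage: the group supplies the placetype and container; the location supplies `the`.
local group = {default_placetype = "state", default_container = "Brazil", data = {}}
local spec = apply_group_defaults(group, {the = true})
```

Checking `spec[prop] == nil` rather than truthiness matters here, because a location may deliberately set a property to
`false` to override a group-level default.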
The following are the properties of the location spec.
* `placetype`: String specifying the placetype of the location (e.g. "ประเทศ", "รัฐ", "province"). This can also be a
table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but
the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any
of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the
group level, or an error occurs.
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the
immediate ''container'' (or containers) of the given location. A container is another location which this location is
considered to be directly part of, either politically or (above the country level) geographically. Some locations
belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and
Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]])
of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed
the ''container trail'', and some functions compute and return this trail as part of their operation. When a location
spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a
list of canonicalized container structures, each of which is of the form
`{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location
key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if
there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the
placetype from the container structure.) The list of canonicalized container structures is stored into the
`.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec
form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The
canonicalization process is described in more detail below under [[#Container spec canonicalization]].
* `divs`: List of recognized political divisions; e.g. for the Netherlands, a specification of the form
`divs = {"จังหวัด", "เทศบาล"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]]
and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be
found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as
just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to
all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the
same format as `divs`. This is intended to be used in the situation where some division types are shared among all
locations in the group and others differ from location to location. An example where this is used is the United
States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have
census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties`
and `county seats` are specified in the group-level `default_divs` because not all states have counties and county
seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have
additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have
municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property
associated with the division type), any division type specified on a sub-country-level location must also be specified
on all containers up through the country. For example, since French departments specify `communes` and
`municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for
France itself.
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category
pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is
normal for locations) that computes the location description can also be given. This is used, for example, for
Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the
keydesc is replaced with the default value of the location description, which specifies the location's placename,
placetype, and the corresponding values for each container in the container trail, generally up through (but not
beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct
the full description of various categories, such as bare location categories, whose description generally reads
`"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the
specified or auto-constructed location description.
* `fulldesc`: String overriding the full description for the bare location category (but not for any other category).
This is currently used only for the location `Earth`, at the very top of the tree (because the standard
`people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent
inhabitants). FIXME: This should be renamed `bare_category_fulldesc`.
* `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories
generated based on the immediate container(s). For example, `Hawaii, USA` specifies `Polynesia` as an additional
parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category)
as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an
additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on
the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the
bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME:
  This should be renamed `bare_category_addl_parents`.
* `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent
to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how
to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the
elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the
default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named
e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is
Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase
`province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have
to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full
location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category
pages, are shown in the upper right of bare category pages.
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles
and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`.
It rarely needs to be specified because the category page and the article page almost always follow the same format.
* `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the
  Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and
`wpcat` and defaults to `wpcat`, which is usually (but not always) correct.
* `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in
category names such as [[:Category:Cities in the Northern Territory, ออสเตรเลีย]] and in old-style place descriptions
when the location occurs as the first holonym, such as the city [[Darwin]] described using
{{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean
properties is {nil}, which amounts to the same as {false}.
* `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only
affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as
[[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set
only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail
for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. The
general principle used in setting this is that all countries in Europe, all dependent territories of any such country,
all former British colonies, and any dependent territories of these former colonies, are assumed to use British
spelling, while all other countries and associated dependent territories are assumed to use American spelling. This
can potentially be modified on a case-by-case basis.
* `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for
city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire,
Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and
  (through a group-level setting) New York boroughs. In addition, it is set to `true` in `initialize_spec()` whenever
the group-level `default_placetype == "นคร"`, so that all cities get it set without explicitly needing to add a
group-level setting for this. Note that the condition `default_placetype == "นคร"` intentionally excludes Chinese
prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods,
but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to
categories like [[:Category:Rivers in Osaka, ญี่ปุ่น]] and [[:Category:Cities in Wuhan]] for holonyms that
are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like
  [[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities;
(c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location.
(Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those
that can occur with non-cities have a `generic_before_non_cities` setting.)
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such
places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more
generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`.
* `overriding_bare_label_parents`: Document me!
* `bare_category_parent_type`: Document me!
* `no_container_cat`: Document me!
* `no_container_parent`: Document me!
* `no_generic_place_cat`: Document me!
* `no_check_holonym_mismatch`: Document me!
* `no_auto_augment_container`: Document me!
* `no_include_container_in_desc`: Document me!
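As an illustration of the `wp`, `wpcat` and `commonscat` formatting directives described in the list above, the
following sketch expands `%l`, `%e` and `%c` in a spec string. The function name `format_wp_spec` and its argument
order are hypothetical; the real module may structure this differently.

```lua
-- Hypothetical helper: expand a `wp`-style spec into an article title.
local function format_wp_spec(wp, full_placename, elliptical_placename, container_placename)
	if wp == true then
		wp = "%l" -- `true` is documented as equivalent to "%l"
	end
	local values = {l = full_placename, e = elliptical_placename, c = container_placename}
	-- A nil return from the callback leaves an unrecognized directive untouched.
	return (wp:gsub("%%(%a)", function(directive)
		return values[directive]
	end))
end

-- Usage, following the examples given for US cities and Thai provinces.
local austin = format_wp_spec("%l, %c", "Austin", "Austin", "Texas")
local phuket = format_wp_spec("%e province", "Phuket Province", "Phuket", "Thailand")
```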
====Location divisions====
The `divs` field of a location describes the recognized political division types of that location. Specifying a given
division type causes any place defined as being of that division type, with the location as a holonym, to be
categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United
States has `"รัฐ"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under
[[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for
"generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a
`generic_before_cities` field if the location is a city); this includes things like cities, towns, villages,
neighbo(u)rhoods and rivers. A given element in the `divs` list is usually a string naming a plural placetype; the
placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular
plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field
(if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which
handles most cases correctly but has problems with `passes` and `fortresses`, which are singularized to `passe` and
`fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object
can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with
the placetype. An example of this is the `divs` list for Canada:
{
["แคนาดา"] = {divs = {
{type = "รัฐ", cat_as = "รัฐและดินแดน"},
{type = "ดินแดน", cat_as = "รัฐและดินแดน"},
"เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities",
"rural municipalities", "parishes",
"Indian reserves",
"census divisions",
{type = "townships", prep = "ใน"},
}, ...},
}
Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a
single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and
territories. Similar things are done for other countries that have more than one type of first-level administrative
division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the
table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and
territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for
use in categories and won't be recognized as the placetype field in a {{tl|place}} description). In addition, townships
are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be
[[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat
related to whether a given placetype is an official administrative or statistical division of the location in question
and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be
used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities
and towns.)
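The effect of `cat_as` and `prep` on the category name can be sketched as follows. This is only an illustration under
simplifying assumptions: the helper names are hypothetical, English plural placetypes and prepositions are used as in
the prose above, and only the string form of `cat_as` is handled.

```lua
-- Hypothetical helper: uppercase the first letter of a category label.
local function ucfirst(s)
	return (s:gsub("^%l", string.upper))
end

-- Hypothetical helper: build the category name for one `divs` entry of a location.
local function div_category(entry, location)
	if type(entry) == "string" then
		entry = {type = entry} -- a bare string is shorthand for {type = ...}
	end
	local label = entry.cat_as or entry.type -- `cat_as` redirects the category
	local prep = entry.prep or "of" -- the documented default preposition is `of`
	return ucfirst(label) .. " " .. prep .. " " .. location
end

-- Usage, mirroring the Canada example in English.
local provinces = div_category({type = "provinces", cat_as = "provinces and territories"}, "Canada")
local townships = div_category({type = "townships", prep = "in"}, "Canada")
local counties = div_category("counties", "Canada")
```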
Another more complex example is the divisions given for Quebec:
{
["Quebec, Canada"] = {divs = {
"เทศมณฑล",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
{type = "ภูมิภาค", container_parent_type = false},
{type = "townships", prep = "ใน"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}},
}, ...},
}
Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the
entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as
its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which
exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one
subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the
`container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be
[[:Category:Regional municipalities of Canada]]. Likewise, `regions` as administrative divisions (as opposed to mere
geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent
using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and
`village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize
`parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties,
just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "เทศมณฑล"}`
means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of
Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level
parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly,
`township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not''
[[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]].
====Container spec canonicalization====
A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'',
each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a
higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are
contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The
`placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of
initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and
removes the spec from `.container`. It works as follows:
# If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place.
For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies
`default_container = "บราซิล"`.
# A single string or canonicalized container object is allowed and made into a one-element list.
# If a list element is a string that did ''not'' come from `default_container`, and there is a group-level
`canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get
a canonicalized container object.
# Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to
`"ประเทศ"`.
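The four steps above can be sketched as follows. This is a rough illustration rather than the module's actual code:
the function name `canonicalize_containers` is hypothetical, the English placetype `"country"` stands in for the
placetype string used in the data, and `canonicalize_key_container` is assumed to return a canonicalized container
object.

```lua
-- Hypothetical sketch of container spec canonicalization.
local function canonicalize_containers(group, spec)
	local container = spec.container
	local from_default = false
	-- Step 1: fall back to the group-level default.
	if container == nil then
		container = group.default_container
		from_default = true
	end
	local containers = {}
	if container ~= nil then
		-- Step 2: wrap a single string or {key = ..., placetype = ...} object in a list.
		if type(container) == "string" or container.key then
			container = {container}
		end
		for _, item in ipairs(container) do
			if type(item) == "string" then
				if not from_default and group.canonicalize_key_container then
					-- Step 3: group-specific conversion of explicitly given strings.
					item = group.canonicalize_key_container(item)
				else
					-- Step 4: remaining strings are assumed to be countries.
					item = {key = item, placetype = "country"}
				end
			end
			containers[#containers + 1] = item
		end
	end
	spec.containers = containers
	spec.container = nil
	return containers
end

-- Usage: a Brazilian state that relies on the group-level default container.
local brazil_states = {default_container = "Brazil"}
local bahia = {}
canonicalize_containers(brazil_states, bahia)
```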
====Alias keys====
Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec
structure from canonical keys. This structure does not, in general, have defaults at the group level and is not
initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location
spec:
* `alias_of`: The canonical key of which this key is an alias. Required.
* `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the`
but does not pay attention to the value of `the` for the corresponding canonical key.
* `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be
converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise,
the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename
of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display
canonicalizing.
* `placetype`: The placetype of the alias. Rarely needs to be specified as it defaults to the canonical key's placetype,
and if that is unspecified, to the group-level default placetype.
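As a rough sketch of how an alias key might be followed to its canonical key, including the `placetype` fallback chain
just described, consider the following. The function name `resolve_alias` is hypothetical, the data is abbreviated,
and the English placetype `"city"` is used for illustration.

```lua
-- Hypothetical sketch: map a key in a group's data table to its canonical key and placetype.
local function resolve_alias(group, key)
	local spec = group.data[key]
	if not spec then
		return nil -- key is unknown in this group
	end
	if not spec.alias_of then
		-- Canonical key: its own placetype, else the group default.
		return key, spec.placetype or group.default_placetype
	end
	local canonical = group.data[spec.alias_of]
	-- Alias placetype, else the canonical key's placetype, else the group default.
	local placetype = spec.placetype
		or (canonical and canonical.placetype)
		or group.default_placetype
	return spec.alias_of, placetype
end

-- Usage: `Newcastle` as an alias within a group of English cities.
local english_cities = {
	default_placetype = "city",
	data = {
		["Newcastle upon Tyne"] = {},
		["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
	},
}
local canonical_key, alias_placetype = resolve_alias(english_cities, "Newcastle")
```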
====Location group metadata tables====
As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The
metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but
preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only
keys, which are mostly functions. The following are the possible group-only keys:
* `data`: This points to the group data table for the group, as described above.
* `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias)
into the full and elliptical placenames. The difference between full and elliptical placenames is described in the
documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g.
`Phuket Province, Thailand` or `County Mayo, ไอร์แลนด์`), in which case the full placename includes the placetype and
the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or
`Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the
elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is
`Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename
distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there
is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as
`State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs.
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key,
  and returns two values, the full and elliptical placenames, respectively. If left undefined, the default is to
chop off anything starting with a comma and return the result as both full and elliptical placename, and if
specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be
defined, it is best to use the helper function `make_key_to_placename`, if possible (or
`make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than
rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default
implementation and such) rather than directly calling the function in the `key_to_placename` field.
* `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be
either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this
(generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or
`make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to
`key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly
invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged
as the key. Otherwise, the default algorithm works as follows:
*# If the group-level `default_placetype == "นคร"`, use the placename unchanged as the key.
*# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma +
space and use the result as the key.
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and
`placetype` fields), and the `placetype` field is either `ประเทศ` (country) or `constituent country`, append the `key` field
to the placename after a comma + space and use the result as the key.
*# Otherwise, use the placename unchanged as the key.
* `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string,
to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to
construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own.
* `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived
from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the
location. See [[#Location divisions]] for more details.
]==]
-----------------------------------------------------------------------------------
-- Helper functions --
-----------------------------------------------------------------------------------
--[==[
Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to
format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the
error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like
this).
]==]
function export.process_error(fmt, ...)
local args = {...}
for i = 1, select("#", ...) do
args[i] = dump(args[i])
end
return error(string.format(fmt, unpack(args)))
end
--[==[
Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user
error triggered by bad input or a system error due to something like running out of memory or hitting a time limit).
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the
format string as if `fmt:format(...)` were called.
]==]
function export.internal_error(fmt, ...)
export.process_error("Internal error: " .. fmt, ...)
end
local internal_error = export.internal_error
-- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If
-- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item`
-- equals `list_or_element`.
local function list_or_element_contains(list_or_element, item)
if type(list_or_element) == "table" then
return m_table.contains(list_or_element, item) and true or false
end
return list_or_element == item
end
--[==[
Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full
`"County Durham"` vs. elliptical `"Durham"`). If the group does not define `key_to_placename`, both full and elliptical
placenames are computed by chopping off anything starting with a comma.
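For illustration, a minimal sketch of the three cases (the group tables here are hypothetical and not taken from the
location data; `make_key_to_placename` is the helper defined later in this module):
```
-- No `key_to_placename` field: chop off everything from the first comma onward.
export.key_to_placename({}, "Hampshire, England") -- "Hampshire", "Hampshire"
-- Field explicitly set to `false`: the key is used as-is for both values.
export.key_to_placename({key_to_placename = false}, "Hampshire, England")
-- "Hampshire, England", "Hampshire, England"
-- Field set to a function: that function's two return values are passed through.
local group = {key_to_placename = make_key_to_placename(", England$", "^County ")}
export.key_to_placename(group, "County Durham, England") -- "County Durham", "Durham"
```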
]==]
function export.key_to_placename(group, key)
if group.key_to_placename == false then
return key, key
end
if group.key_to_placename then
local full_placename, elliptical_placename = group.key_to_placename(key)
if type(full_placename) ~= "string" then
internal_error("Key %s returned a non-string full placename: %s", key, full_placename)
end
if type(elliptical_placename) ~= "string" then
internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename)
end
return full_placename, elliptical_placename
end
key = key:gsub(",.*", "")
return key, key
end
--[==[
Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames) and return the result. If `placename_to_key` exists with the value `false`,
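return the placename unchanged. If the group does not define `placename_to_key`, the placename is returned unchanged
when the group-level `default_placetype` is `"นคร"` (city). Otherwise, if the group defines a `default_container` that is
either a string or a canonical container object whose placetype is `"ประเทศ"` (country) or `constituent country`, the
container key is appended to the placename after a comma and a space. In all other cases the placename is returned
unchanged.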
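As a rough sketch of the default algorithm (used when there is no `placename_to_key` field), with hypothetical group
tables rather than real location data:
```
export.placename_to_key({default_placetype = "นคร"}, "Paris") -- "Paris" (city groups use the bare placename)
export.placename_to_key({default_container = "England"}, "Hampshire") -- "Hampshire, England"
export.placename_to_key({default_container = {key = "England", placetype = "constituent country"}}, "Hampshire")
-- "Hampshire, England"
export.placename_to_key({default_container = {key = "Ontario, Canada", placetype = "จังหวัด"}}, "Toronto")
-- "Toronto" (the container is neither a country nor a constituent country)
export.placename_to_key({placename_to_key = false, default_container = "England"}, "Hampshire") -- "Hampshire"
```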
]==]
function export.placename_to_key(group, placename)
if group.placename_to_key == false then
return placename
elseif group.placename_to_key then
local key = group.placename_to_key(placename)
if type(key) ~= "string" then
internal_error("Placename %s returned a non-string key: %s", placename, key)
end
return key
elseif group.default_placetype == "นคร" then
return placename
else
local defcon = group.default_container
if not defcon then
return placename
elseif type(defcon) == "string" then
return placename .. ", " .. defcon
elseif type(defcon) == "table" and (defcon.placetype == "ประเทศ" or
defcon.placetype == "constituent country") then
return placename .. ", " .. defcon.key
else
return placename
end
end
end
--[==[
Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't
specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and
`placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original
non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more
than one. Containers should be carefully distinguished from category parents. Generally the container is the first
category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents,
which indicate some sort of relation between the category parent and the location but not necessarily one of
containment.)
This function is idempotent in that nothing happens if called more than once on the same spec.
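A minimal sketch of the effect, assuming a hypothetical group with no `canonicalize_key_container` function (the names
below are illustrative, not real location data):
```
local group = {default_placetype = "รัฐ", default_container = "ออสเตรเลีย"}
local spec = {}
export.initialize_spec(group, "Victoria", spec)
-- spec.containers is now {{key = "ออสเตรเลีย", placetype = "ประเทศ"}} (string containers default to countries)
-- spec.container is nil, spec.placetype is "รัฐ", spec.is_city is false and spec.initialized is true
export.initialize_spec(group, "Victoria", spec) -- no-op, because `spec.initialized` is already set
```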
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables.
]==]
function export.initialize_spec(group, key, spec)
if spec.initialized then
return
end
local container = spec.container
local containers
local container_from_default
if not container then
container = group.default_container
container_from_default = true
end
if container then
if type(container) == "string" or container.key then
container = {container}
end
containers = {}
for _, cont in ipairs(container) do
if type(cont) == "string" then
if group.canonicalize_key_container and not container_from_default then
cont = group.canonicalize_key_container(cont)
else
cont = {key = cont, placetype = "ประเทศ"}
end
end
insert(containers, cont)
end
end
spec.containers = containers
spec.container = nil
local function value_with_default(val, default_val)
if val == nil then
return default_val
else
return val
end
end
local function set_or_default(prop)
spec[prop] = value_with_default(spec[prop], group["default_" .. prop])
end
set_or_default("placetype")
if not spec.placetype then
internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec)
end
set_or_default("divs")
spec.addl_divs = group.addl_divs
for _, prop in ipairs {
"keydesc",
"fulldesc",
"addl_parents",
"overriding_bare_label_parents",
"bare_category_parent_type",
"wp",
"wpcat",
"commonscat",
"british_spelling",
"the",
"no_container_cat",
"no_container_parent",
"no_generic_place_cat",
"no_check_holonym_mismatch",
"no_auto_augment_container",
"no_include_container_in_desc",
"is_city",
"is_former_place",
} do
set_or_default(prop)
end
-- `default_placetype == "นคร"` is correct; if `default_placetype` has something else like `prefecture-level city`
-- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as
-- is_city.
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "นคร")
spec.initialized = true
end
--[=[
Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group
with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values:
the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object,
which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default
property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the
property in question).
`alias_resolution` controls how aliases are resolved. Normally (i.e. when it is unspecified or {"all"}), both display and
category aliases are followed, and the returned key will reflect the canonical location key. However, if
`alias_resolution` is {"none"}, no alias following
happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical
location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not
copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal
case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by
looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"}
except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key,
and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
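As a sketch of the different `alias_resolution` modes, assuming a hypothetical group (the table below is illustrative
only and is not part of the real location data):
```
local group = {
	default_placetype = "ประเทศ",
	data = {
		["Cabo Verde"] = {},
		["Cape Verde"] = {alias_of = "Cabo Verde", display = true},
	},
}
find_matching_key_in_group(group, "ประเทศ", "Cape Verde") -- "Cabo Verde", initialized canonical spec
find_matching_key_in_group(group, "ประเทศ", "Cape Verde", "none") -- "Cape Verde", uninitialized alias spec
find_matching_key_in_group(group, "ประเทศ", "Cape Verde", "display") -- "Cabo Verde", uninitialized alias spec
find_matching_key_in_group(group, "ทวีป", "Cape Verde") -- nil (the placetype does not match)
```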
]=]
local function find_matching_key_in_group(group, placetypes, key, alias_resolution)
if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and
alias_resolution ~= "all" then
internal_error("Bad value for 'alias_resolution': %s", alias_resolution)
end
local spec = group.data[key]
if not spec then
return nil
end
local function check_correct_placetype(placetype)
if type(placetype) == "table" then
for _, pt in ipairs(placetype) do
if list_or_element_contains(placetypes, pt) then
return true
end
end
return false
else
return list_or_element_contains(placetypes, placetype)
end
end
if spec.alias_of then
local resolved_key = spec.alias_of
local resolved_spec = group.data[resolved_key]
if not resolved_spec then
internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key)
elseif resolved_spec.alias_of then
internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed",
key, resolved_key)
end
if alias_resolution == "none" or alias_resolution == "display" then
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " ..
"`default_placetype`", key, spec, resolved_spec)
end
if not check_correct_placetype(placetype) then
return nil
end
if alias_resolution == "display" then
if spec.display == true then
key = resolved_key
elseif spec.display then
key = spec.display
end
end
return key, spec
end
key = resolved_key
spec = resolved_spec
end
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
end
if not check_correct_placetype(placetype) then
return nil
end
export.initialize_spec(group, key, spec)
return key, spec
end
--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists
in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one
of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the
corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys.
`alias_resolution` is as in `find_matching_key_in_group()`.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
local key = export.placename_to_key(group, placename)
return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end
--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec.
If no such key exists, return {nil}. This throws an internal error if two locations with the same key are found.
]==]
function export.find_canonical_key(key)
local found_locations = {}
for _, group in ipairs(export.locations) do
local spec = group.data[key]
if not spec then
-- do nothing
elseif spec.alias_of then
mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of))
else
insert(found_locations, {group, spec})
end
end
if not found_locations[1] then
return nil
elseif found_locations[2] then
internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations)
else
local group, spec = unpack(found_locations[1])
export.initialize_spec(group, key, spec)
return group, spec
end
end
--[==[
Iterator that returns all locations matching a given description, where the description consists of either a placename
or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator
returns three values at each iteration: the location group, canonical key by which the location is known and the spec
object describing the location. `data` contains the following possible fields:
* `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string
specifying a placetype, which must match one of the location's placetypes. This must be specified.
* `placename`: The placename of the location. Either this or `key` must be specified.
* `key`: The key of the location. Either this or `placename` must be specified.
* `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`.
The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if
`alias_resolution` is given and the specified key or placename is an alias; see the documentation for
`find_matching_key_in_group`).
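Typical usage is a generic `for` loop. In this sketch the placename and placetype are only illustrative; whether
anything matches depends on the contents of `export.locations`:
```
for group, key, spec in export.iterate_matching_location {placetypes = {"ประเทศ"}, placename = "จีน"} do
	mw.log(key, spec.placetype)
end
```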
]==]
function export.iterate_matching_location(data)
local i = 0
local n = #export.locations
return function()
while true do
i = i + 1
if i > n then
break
end
local group = export.locations[i]
local key, spec
if data.placename then
key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
data.alias_resolution)
else
if not data.key then
internal_error("'.placename' or '.key' must be defined: %s", data)
end
key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
end
if key then
return group, key, spec
end
end
end
end
--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark` (in New Jersey, not Delaware)).
]==]
function export.get_matching_location(data)
local all_found = {}
for group, key, spec in export.iterate_matching_location(data) do
insert(all_found, {group, key, spec})
end
if not all_found[1] then
internal_error("Couldn't find matching location for data %s", data)
elseif all_found[2] then
internal_error("Found multiple matching locations for data %s: %s", data, all_found)
else
return unpack(all_found[1])
end
end
--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An
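internal error happens if a container loop is detected. This function returns an iterator; each call to the iterator
returns the list of not-yet-seen containers at the next level up, as location objects each of which contains `group`,
`key` and `spec` fields, or {nil} once no further containers remain.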
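For example, to visit every direct and indirect container of a location (a sketch; `group`, `key` and `spec` are
assumed to come from an earlier lookup such as `get_matching_location`):
```
for containers in export.iterate_containers(group, key, spec) do
	-- `containers` holds all not-yet-seen containers one level further up.
	for _, container in ipairs(containers) do
		mw.log(container.key)
	end
end
```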
]==]
function export.iterate_containers(group, key, spec)
local keys_seen = {}
keys_seen[key] = true
local iterations = 0
local last_iteration_containers = {{group = group, key = key, spec = spec}}
return function()
iterations = iterations + 1
if iterations > 10 then
internal_error("Probable loop in containers when processing key %s", key)
end
local next_iteration_containers = {}
for _, location in ipairs(last_iteration_containers) do
local containers = location.spec.containers
if containers then
for _, container in ipairs(containers) do
local container_group, container_key, container_spec = export.get_matching_location {
placetypes = container.placetype,
key = container.key,
}
if not keys_seen[container_key] then
insert(next_iteration_containers, {
group = container_group, key = container_key, spec = container_spec
})
keys_seen[container_key] = true
end
end
end
end
if not next_iteration_containers[1] then
return nil
end
last_iteration_containers = next_iteration_containers
return next_iteration_containers
end
end
--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
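For instance (a sketch with hypothetical spec tables):
```
export.construct_linked_placename({}, "Durham") -- "[[Durham]]"
export.construct_linked_placename({}, "Durham", "County Durham") -- "[[Durham|County Durham]]"
export.construct_linked_placename({the = true}, "Bahamas") -- "the [[Bahamas]]"
```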
]==]
function export.construct_linked_placename(spec, placename, display_form)
local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename,
display_form) or ("[[%s]]"):format(placename)
if spec.the then
linked_placename = "the " .. linked_placename
end
return linked_placename
end
--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical
placenames.) `container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of
the key, which will be used to remove that container. If multiple patterns are specified, each one is tried until one
matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full
placename. If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match
and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain
countries (such as South Korean and North Korean counties, which include the word "เทศมณฑล" in the key). The resulting
chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped
and the full and elliptical placenames are the same.
Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
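The generated functions then behave as follows (a sketch derived from the code below; the keys are illustrative):
```
make_key_to_placename(", England$")("Hampshire, England") -- "Hampshire", "Hampshire"
make_key_to_placename(", South Korea$", " County$")("Gangwon County, South Korea") -- "Gangwon County", "Gangwon"
-- With a list of patterns, each is tried in turn until one matches:
make_key_to_placename({", Ireland$", ", Northern Ireland$"})("County Antrim, Northern Ireland")
-- "County Antrim", "County Antrim"
```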
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
if type(container_patterns) == "string" then
container_patterns = {container_patterns}
end
if type(divtype_patterns) == "string" then
divtype_patterns = {divtype_patterns}
end
return function(key)
local full_placename = key
if container_patterns then
for _, container_pattern in ipairs(container_patterns) do
local nsubs
full_placename, nsubs = full_placename:gsub(container_pattern, "")
if nsubs > 0 then
break
end
end
end
local elliptical_placename = full_placename
if divtype_patterns then
for _, divtype_pattern in ipairs(divtype_patterns) do
local nsubs
elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
if nsubs > 0 then
break
end
end
end
return full_placename, elliptical_placename
end
end
--[=[
This is typically used to define `placename_to_key`. It generates a function that adds a string to a given
placename to get the key (see the definition of `placename_to_key` above in the documentation under "Location group
tables"). Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have
special meaning in Lua patterns) to be added first to the placename; if already present, it is not added again.
`container_suffix` is then added if given (unconditionally, with no check for whether it is already present). NOTE: In
this localized copy of the module, both strings are prepended to the placename rather than appended (see the lines
marked `--th` in the code), whereas the usage examples that follow describe the upstream appending behavior. Typical
usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`)
or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
return function(placename)
local key = placename
if divtype_suffix then
if not key:find("^" .. divtype_suffix) then -- th: changed to prepend instead of append
key = divtype_suffix .. key --th
end
end
if container_suffix then
key = container_suffix .. key --th
end
return key
end
end
--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do
the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil}
or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left
as-is. Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "จังหวัด"}`.
]=]
local function make_canonicalize_key_container(suffix, placetype)
return function(container)
if type(container) == "string" then
return {key = container .. (suffix or ""), placetype = placetype}
else
return container
end
end
end
-----------------------------------------------------------------------------------
-- Top-level tables --
-----------------------------------------------------------------------------------
export.continents = {
["โลก"] = {the = true, placetype = "ดาวเคราะห์", addl_parents = {"ธรรมชาติ"},
fulldesc = "=the planet [[Earth]] and the features found on it"},
["แอฟริกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
["อเมริกา"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"},
keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined",
wp = "Americas"},
["อเมริกาส์"] = {alias_of = "อเมริกา", the = true},
["อเมริกาเหนือ"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
["แคริบเบียน"] = {the = true, placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
["อเมริกากลาง"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
["อเมริกาใต้"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
["แอนตาร์กติกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"},
fulldesc = "=the territory of [[Antarctica]]"},
["ยูเรเชีย"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"},
keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
["เอเชีย"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
["ยุโรป"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
["โอเชียเนีย"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
["เมลานีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
["ไมโครนีเชีย (ภูมิภาค)"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}}, -- name clash: region vs. federated states
["พอลินีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
}
export.continents_group = {
default_overriding_bare_label_parents = {}, -- container parents should be used
default_divs = {{type = "ประเทศ", prep = "ใน"}},
-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
default_no_include_container_in_desc = true,
default_no_container_cat = true,
default_no_container_parent = true,
default_no_auto_augment_container = true,
default_no_generic_place_cat = true,
-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
-- this level. We also run into problems with supercontinents, which have "ทวีป" as the fallback and cause
-- mismatches.
default_no_check_holonym_mismatch = true,
data = export.continents,
}
-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = {
["อัฟกานิสถาน"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["แอลเบเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล", "communes",
{type = "administrative units", cat_as = "communes"},
}, british_spelling = true},
["แอลจีเรีย"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes", "อำเภอ", "เทศบาล"}},
["อันดอร์รา"] = {container = "ยุโรป", divs = {"parishes"}, british_spelling = true},
["แองโกลา"] = {container = "แอฟริกา", divs = {"จังหวัด", "เทศบาล"}},
["แอนทีกาและบาร์บิวดา"] = {container = "แคริบเบียน", divs = {"จังหวัด"}, british_spelling = true},
["อาร์เจนตินา"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}},
["อาร์มีเนีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ", "เทศบาล"},
british_spelling = true},
["สาธารณรัฐอาร์มีเนีย"] = {alias_of = "อาร์มีเนีย", the = true}, -- differs in "the"
-- Both a country and continent
["ออสเตรเลีย"] = {container = "โอเชียเนีย", divs = {
{type = "รัฐ", cat_as = "states and territories"},
{type = "ดินแดน", cat_as = "states and territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"},
"local government areas", "dependent territories",
}, british_spelling = true},
["ออสเตรีย"] = {container = "ยุโรป", divs = {"รัฐ", "อำเภอ", "เทศบาล"}, british_spelling = true},
["อาเซอร์ไบจาน"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ", "เทศบาล"}, british_spelling = true},
["บาฮามาส"] = {the = true, container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true, wp = "The %l"},
["บาห์เรน"] = {container = "เอเชีย", divs = {"governorates"}},
["บังกลาเทศ"] = {container = "เอเชีย", divs = {"divisions", "อำเภอ", "เทศบาล"}, british_spelling = true},
["บาร์เบโดส"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["เบลารุส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["เบลเยียม"] = {container = "ยุโรป", divs = {"ภูมิภาค", "จังหวัด", "เทศบาล"}, british_spelling = true},
["เบลีซ"] = {container = "อเมริกากลาง", divs = {"อำเภอ"}, british_spelling = true},
["เบนิน"] = {container = "แอฟริกา", divs = {"departments", "communes"}},
["ภูฏาน"] = {container = "เอเชีย", divs = {"อำเภอ", "gewogs"}},
["โบลิเวีย"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}},
["บอสเนียและเฮอร์เซโกวีนา"] = {container = "ยุโรป", divs = {"entities", "cantons", "เทศบาล"}, british_spelling = true},
--["Bosnia and Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
["บอสเนีย-เฮอร์เซโกวีนา"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
--["Bosnia-Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
["บอสเนีย"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true},
["บอตสวานา"] = {container = "แอฟริกา", divs = {"อำเภอ", "ตำบล"}, british_spelling = true},
["บราซิล"] = {container = "อเมริกาใต้", divs = {
"รัฐ", "เทศบาล", "macroregions",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["บรูไน"] = {container = "เอเชีย", divs = {"อำเภอ", "mukims"}, british_spelling = true},
["บัลแกเรีย"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศบาล"}, british_spelling = true},
["บูร์กินาฟาโซ"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments", "จังหวัด"}},
["บุรุนดี"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes"}},
["กัมพูชา"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["แคเมอรูน"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["แคนาดา"] = {container = "อเมริกาเหนือ", divs = {
{type = "รัฐ", cat_as = "รัฐและดินแดน"}, -- following thwiki
{type = "ดินแดน", cat_as = "รัฐและดินแดน"},
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of รัฐและดินแดน"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of รัฐและดินแดน"},
"เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities",
"rural municipalities", "parishes",
-- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless
-- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is
-- still at [[w:Indian reserves]]).
"Indian reserves",
"census divisions",
{type = "townships", prep = "ใน"},
},
british_spelling = true},
["กาบูเวร์ดี"] = {container = "แอฟริกา", divs = {"เทศบาล", "parishes"}},
["เคปเวิร์ด"] = {alias_of = "กาบูเวร์ดี", display = true},
["สาธารณรัฐแอฟริกากลาง"] = {the = true, container = "แอฟริกา", divs = {"prefectures", "subprefectures"}},
["CAR"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true},
["C.A.R"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true},
["ชาด"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["ชิลี"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "communes"}},
["จีน"] = {container = "เอเชีย", divs = {
{type = "มณฑล", cat_as = "provinces and autonomous regions"}, -- following thwiki (Thai Wikipedia) usage
{type = "autonomous regions", cat_as = "provinces and autonomous regions"},
{type = "FORMER provinces", cat_as = "former provinces"},
"special administrative regions",
"จังหวัด", -- following thwiki (Thai Wikipedia) usage
{type = "FORMER prefectures", cat_as = "former prefectures"},
"prefecture-level cities",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
{type = "FORMER counties", cat_as = "former counties and county-level cities"},
{type = "FORMER county-level cities", cat_as = "former counties and county-level cities"},
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities.
"อำเภอ",
{type = "FORMER districts", cat_as = "former districts"},
"ตำบล",
"townships",
"เทศบาล",
{type = "direct-administered municipalities", cat_as = "เทศบาล"},
}},
["สาธารณรัฐประชาชนจีน"] = {alias_of = "จีน", the = true}, -- differs in "the"
["โคลอมเบีย"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}},
["คอโมโรส"] = {the = true, container = "แอฟริกา", divs = {"autonomous islands"}},
["คอสตาริกา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "cantons"}},
["โครเอเชีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true},
["คิวบา"] = {container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"}},
["ไซปรัส"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, british_spelling = true},
["สาธารณรัฐเช็ก"] = {the = true, container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ", "เทศบาล"}, british_spelling = true},
["เช็กเกีย"] = {alias_of = "สาธารณรัฐเช็ก"}, -- differs in "the"
["สาธารณรัฐประชาธิปไตยคองโก"] = {the = true, container = "แอฟริกา", divs = {"จังหวัด", "ดินแดน"}},
["คองโก"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true},
["DRC"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true},
["D.R.C"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true},
["เดนมาร์ก"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "dependent territories"},
british_spelling = true,
-- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country)
},
["จิบูตี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}},
["ดอมินีกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["สาธารณรัฐโดมินิกัน"] = {the = true, container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"},
keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"},
["ติมอร์-เลสเต"] = {container = "เอเชีย", divs = {"เทศบาล"}, wp = "ติมอร์-เลสเต"},
["ติมอร์ตะวันออก"] = {alias_of = "ติมอร์-เลสเต", display = true},
["เอกวาดอร์"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "cantons"}},
["อียิปต์"] = {container = "แอฟริกา", divs = {"governorates", "ภูมิภาค"}, british_spelling = true},
["เอลซัลวาดอร์"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}},
["อิเควทอเรียลกินี"] = {container = "แอฟริกา", divs = {"จังหวัด"}},
["เอริเทรีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "subregions"}},
["เอสโตเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true},
["เอสวาตินี"] = {container = "แอฟริกา", british_spelling = true},
["สวาซีแลนด์"] = {alias_of = "เอสวาตินี", display = true},
["เอธิโอเปีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "zones"}},
["สหพันธรัฐไมโครนีเชีย"] = {the = true, container = "ไมโครนีเชีย", divs = {"รัฐ"}},
["ไมโครนีเชีย"] = {alias_of = "สหพันธรัฐไมโครนีเชีย"}, -- name collision: the region vs. the federated states
["ฟีจี"] = {container = "เมลานีเชีย", divs = {"divisions", "จังหวัด"}, british_spelling = true},
["ฟินแลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true},
["ฝรั่งเศส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "cantons", "collectivities",
"communes",
{type = "เทศบาล", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
"dependent territories", "ดินแดน", "จังหวัด",
}, british_spelling = true},
["กาบอง"] = {container = "แอฟริกา", divs = {"จังหวัด", "departments"}},
["แกมเบีย"] = {the = true, container = "แอฟริกา", divs = {"divisions", "อำเภอ"}, british_spelling = true, wp = "The %l"},
["จอร์เจีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"ภูมิภาค", "อำเภอ"},
keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"},
["เยอรมนี"] = {container = "ยุโรป", divs = {
"รัฐ",
-- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but
-- there aren't really enough of them to categorize per state.
"ภูมิภาค",
"เทศบาล", "อำเภอ"}, british_spelling = true},
["กานา"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["กรีซ"] = {container = "ยุโรป", divs = {"ภูมิภาค", "regional units", "เทศบาล",
{type = "peripheries", cat_as = {"ภูมิภาค"}},
}, british_spelling = true},
["กรีเนดา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["กัวเตมาลา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "เทศบาล"}},
["กินี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures"}},
["กินี-บิสเซา"] = {container = "แอฟริกา", divs = {"ภูมิภาค"}},
["กายอานา"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค"}, british_spelling = true},
["เฮติ"] = {container = "แคริบเบียน", divs = {"departments", "arrondissements"}},
["ฮอนดูรัส"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}},
["ฮังการี"] = {container = "ยุโรป", divs = {"เทศมณฑล", "อำเภอ"}, british_spelling = true},
["ไอซ์แลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "เทศมณฑล"}, british_spelling = true},
["อินเดีย"] = {container = "เอเชีย", divs = {
{type = "รัฐ", cat_as = "states and union territories"},
{type = "union territories", cat_as = "states and union territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"},
{type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"},
"divisions", "อำเภอ", "เทศบาล",
}, british_spelling = true},
["อินโดนีเซีย"] = {container = "เอเชีย", divs = {"regencies", "จังหวัด",
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"},
}},
["อิหร่าน"] = {container = "เอเชีย", divs = {"จังหวัด", "เทศมณฑล"}},
["อิรัก"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["ไอร์แลนด์"] = {container = "ยุโรป", addl_parents = {"British Isles"},
divs = {"เทศมณฑล", "อำเภอ", "จังหวัด"}, british_spelling = true, wp = "Republic of %l"},
["สาธารณรัฐไอร์แลนด์"] = {alias_of = "ไอร์แลนด์", the = true}, -- differs in "the"
["อิสราเอล"] = {container = "เอเชีย", divs = {"อำเภอ"}},
["อิตาลี"] = {container = "ยุโรป", divs = {
"ภูมิภาค", "จังหวัด", "metropolitan cities", "เทศบาล",
{type = "autonomous regions", cat_as = "ภูมิภาค"},
}, british_spelling = true},
["โกตดิวัวร์"] = {container = "แอฟริกา", divs = {"อำเภอ", "ภูมิภาค"}},
-- We should really be using Ivory Coast (common name) but there are political ramifications to the use of
-- Côte d'Ivoire so don't make it a display alias.
["ไอวอรีโคสต์"] = {alias_of = "โกตดิวัวร์"},
["จาเมกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["ญี่ปุ่น"] = {container = "เอเชีย", divs = {"จังหวัด", "กิ่งจังหวัด", "เทศบาล"}},
["จอร์แดน"] = {container = "เอเชีย", divs = {"governorates"}},
["คาซัคสถาน"] = {container = {"เอเชีย", "ยุโรป"}, divs = {"ภูมิภาค", "อำเภอ"}},
["เคนยา"] = {container = "แอฟริกา", divs = {"เทศมณฑล"}, british_spelling = true},
["Kiribati"] = {container = "ไมโครนีเชีย", british_spelling = true},
["Kosovo"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล"}, british_spelling = true},
["Kuwait"] = {container = "เอเชีย", divs = {"governorates", "areas"}},
["Kyrgyzstan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}},
["Laos"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["Latvia"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true},
["Lebanon"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["Lesotho"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true},
["Liberia"] = {container = "แอฟริกา", divs = {"เทศมณฑล", "อำเภอ"}},
["Libya"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศบาล"}},
["Liechtenstein"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true},
["Lithuania"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true},
["Luxembourg"] = {container = "ยุโรป", divs = {"cantons", "อำเภอ"}, british_spelling = true},
["Madagascar"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}},
["Malawi"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["Malaysia"] = {container = "เอเชีย", divs = {"รัฐ", "federal territories", "อำเภอ"}, british_spelling = true},
["Maldives"] = {the = true, container = "เอเชีย", divs = {"จังหวัด", "administrative atolls"}, british_spelling = true},
["Mali"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "cercles"}},
["Malta"] = {container = "ยุโรป", divs = {"ภูมิภาค", "local councils"}, british_spelling = true},
["Marshall Islands"] = {the = true, container = "ไมโครนีเชีย", divs = {"เทศบาล"}},
["Mauritania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["Mauritius"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true},
["Mexico"] = {container = "อเมริกาเหนือ", addl_parents = {"อเมริกากลาง"}, divs = {
"รัฐ", "เทศบาล",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["Moldova"] = {container = "ยุโรป", divs = {
{type = "อำเภอ", cat_as = "districts and autonomous territorial units"},
{type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"},
"communes", "เทศบาล",
}, british_spelling = true},
["Monaco"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป",
-- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we
-- want its parent to be "countries in Europe".
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"},
is_city = true, british_spelling = true},
["Mongolia"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["Montenegro"] = {container = "ยุโรป", divs = {"เทศบาล"}},
["Morocco"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures", "จังหวัด"}},
["Mozambique"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}},
["Myanmar"] = {container = "เอเชีย",
divs = {"ภูมิภาค", "รัฐ", "union territories",
{type = "self-administered zones", cat_as = "self-administered areas"},
{type = "self-administered divisions", cat_as = "self-administered areas"},
"อำเภอ"}},
["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations
["Namibia"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "constituencies"}, british_spelling = true},
["Nauru"] = {container = "ไมโครนีเชีย", divs = {"อำเภอ"}, british_spelling = true},
["Nepal"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}},
["เนเธอร์แลนด์"] = {the = true, placetype = {"ประเทศ", "constituent country"}, container = "ยุโรป",
divs = {"จังหวัด", "เทศบาล",
{type = "FORMER municipalities", cat_as = "former municipalities"},
"dependent territories", "constituent countries"}, british_spelling = true,
-- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]]
-- (country)
},
["New Zealand"] = {container = "พอลินีเชีย", divs = {
"ภูมิภาค", "dependent territories", "territorial authorities",
{type = "อำเภอ", cat_as = "territorial authorities"},
},
british_spelling = true},
["Nicaragua"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}},
["Niger"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["Nigeria"] = {container = "แอฟริกา", divs = {
"รัฐ",
-- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize
-- everything under 'states and territories' but that seems a bit pointless.
{type = "federal territories", cat_as = "รัฐ"},
"local government areas",
}, british_spelling = true},
["North Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล"}},
["North Macedonia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true},
["Macedonia"] = {alias_of = "North Macedonia", display = true},
["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Norway"] = {container = "ยุโรป",
divs = {"เทศมณฑล", "เทศบาล", "dependent territories", "อำเภอ", "unincorporated areas"},
british_spelling = true},
["Oman"] = {container = "เอเชีย", divs = {"governorates", "จังหวัด"}},
["Pakistan"] = {container = "เอเชีย", divs = {
{type = "จังหวัด", cat_as = "provinces and territories"},
{type = "administrative territories", cat_as = "provinces and territories"},
{type = "federal territories", cat_as = "provinces and territories"},
{type = "ดินแดน", cat_as = "provinces and territories"},
"divisions", "อำเภอ",
}, british_spelling = true},
["Palau"] = {container = "ไมโครนีเชีย", divs = {"รัฐ"}},
["Palestine"] = {container = "เอเชีย", divs = {"governorates"}},
["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the"
["Panama"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "อำเภอ"}},
["Papua New Guinea"] = {container = "เมลานีเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Paraguay"] = {container = "อเมริกาใต้", divs = {"departments", "อำเภอ"}},
["Peru"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ"}},
["Philippines"] = {the = true, container = "เอเชีย", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ", "เทศบาล", "barangays"}},
["Poland"] = {divs = {"voivodeships", "เทศมณฑล",
{type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}},
}, container = "ยุโรป", british_spelling = true},
["Portugal"] = {container = "ยุโรป", divs = {
{type = "autonomous regions", cat_as = "districts and autonomous regions"},
{type = "อำเภอ", cat_as = "districts and autonomous regions"},
"จังหวัด", "เทศบาล"}, british_spelling = true},
["Qatar"] = {container = "เอเชีย", divs = {"เทศบาล", "zones"}},
["Republic of the Congo"] = {the = true, container = "แอฟริกา", divs = {"departments", "อำเภอ"}},
["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true},
["Romania"] = {container = "ยุโรป", divs = {
"ภูมิภาค", "เทศมณฑล", "communes",
{type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"},
}, british_spelling = true},
["Russia"] = {container = {"ยุโรป", "เอเชีย"}, divs = {
"federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities",
"อำเภอ", "federal districts"},
british_spelling = true},
["Rwanda"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}},
["Saint Kitts and Nevis"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["Saint Kitts"] = {alias_of = "Saint Kitts and Nevis", display = true},
["Saint Lucia"] = {container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true},
["Saint Vincent and the Grenadines"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true},
["Saint Vincent"] = {alias_of = "Saint Vincent and the Grenadines", display = true},
["SVG"] = {alias_of = "Saint Vincent and the Grenadines", display = true},
["S.V.G"] = {alias_of = "Saint Vincent and the Grenadines", display = true},
["Samoa"] = {container = "พอลินีเชีย", divs = {"อำเภอ"}, british_spelling = true},
["San Marino"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true},
["São Tomé and Príncipe"] = {container = "แอฟริกา", divs = {"อำเภอ"}},
["São Tome and Principe"] = {alias_of = "São Tomé and Príncipe", display = true},
["São Tomé"] = {alias_of = "São Tomé and Príncipe", display = true},
["São Tome"] = {alias_of = "São Tomé and Príncipe", display = true},
["Saudi Arabia"] = {container = "เอเชีย", divs = {"จังหวัด", "governorates"}},
["Senegal"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}},
["Serbia"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล", "autonomous provinces"}},
["Seychelles"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true},
["Sierra Leone"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Singapore"] = {container = "เอเชีย", divs = {"อำเภอ", "ภูมิภาค"}, british_spelling = true},
["Slovakia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["Slovenia"] = {container = "ยุโรป", divs = {"statistical regions", "เทศบาล"}, british_spelling = true},
-- Note: While the official name does not include "the" at the beginning,
-- it sounds strange in English to leave it out and it's commonly included.
["Solomon Islands"] = {the = true, container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true},
["โซมาเลีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}},
["South Africa"] = {container = "แอฟริกา", divs = {
"จังหวัด",
"อำเภอ",
{type = "district municipalities", cat_as = "อำเภอ"},
{type = "metropolitan municipalities", cat_as = "อำเภอ"},
"เทศบาล",
}, british_spelling = true},
["South Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล", "อำเภอ"}},
["South Sudan"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "รัฐ", "เทศมณฑล"}, british_spelling = true},
["Spain"] = {container = "ยุโรป", divs = {"autonomous communities", "จังหวัด", "เทศบาล",
"comarcas", "autonomous cities"},
british_spelling = true},
["Sri Lanka"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Sudan"] = {container = "แอฟริกา", divs = {"รัฐ", "อำเภอ"}, british_spelling = true},
["Suriname"] = {container = "อเมริกาใต้", divs = {"อำเภอ"}},
["Sweden"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศมณฑล", "เทศบาล"}, british_spelling = true},
["Switzerland"] = {container = "ยุโรป", divs = {"cantons", "เทศบาล", "อำเภอ"}, british_spelling = true},
["Syria"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["ไต้หวัน"] = {container = "เอเชีย", divs = {"เทศมณฑล", "อำเภอ", "townships", "special municipalities"}},
["สาธารณรัฐจีน"] = {alias_of = "ไต้หวัน", the = true}, -- differs in "the", different political connotations
["Tajikistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}},
["Tanzania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true},
["ไทย"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "ตำบล"}},
["Togo"] = {container = "แอฟริกา", divs = {"จังหวัด", "prefectures"}},
["Tonga"] = {container = "พอลินีเชีย", divs = {"divisions"}, british_spelling = true},
["Trinidad and Tobago"] = {container = "แคริบเบียน", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true},
["Tunisia"] = {container = "แอฟริกา", divs = {"governorates", "delegations"}},
["Turkey"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ"}},
-- Foreign names generally get display-canonicalized.
["Türkiye"] = {alias_of = "Turkey", display = true},
["Turkmenistan"] = {container = "เอเชีย", divs = {
-- The 5 regions are often also called provinces
"ภูมิภาค", {type = "จังหวัด", cat_as = "ภูมิภาค"}, "อำเภอ"},
},
["Tuvalu"] = {container = "พอลินีเชีย", divs = {"atolls"}, british_spelling = true},
["Uganda"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศมณฑล"}, british_spelling = true},
["Ukraine"] = {container = "ยุโรป", divs = {
{type = "oblasts", cat_as = "oblasts and autonomous republics"},
{type = "autonomous republics", cat_as = "oblasts and autonomous republics"},
"raions", "hromadas",
}, british_spelling = true},
["United Arab Emirates"] = {the = true, container = "เอเชีย", divs = {"emirates"}},
-- Abbreviations get display-canonicalized.
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true},
["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true},
["สหราชอาณาจักร"] = {the = true, container = "ยุโรป", addl_parents = {"British Isles"},
divs = {"constituent countries", "เทศมณฑล", "อำเภอ", "boroughs", "ดินแดน", "dependent territories",
"traditional counties"},
keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true},
-- Abbreviations get display-canonicalized.
["UK"] = {alias_of = "สหราชอาณาจักร", display = true, the = true},
["U.K."] = {alias_of = "สหราชอาณาจักร", display = true, the = true},
["สหรัฐอเมริกา"] = {the = true, container = "อเมริกาเหนือ",
divs = {"เทศมณฑล", "county seats", "รัฐ", "ดินแดน", "dependent territories",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
{type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"},
{type = "NICKNAME_FOR states", cat_as = "nicknames for states"},
{type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"},
{type = "boroughs", prep = "ใน"}, -- exist in Pennsylvania and New Jersey
"เทศบาล", -- these exist politically at least in Colorado and Connecticut
{type = "census-designated places", prep = "ใน"},
{type = "unincorporated communities", prep = "ใน"},
-- Don't change the following to something more politically correct until/unless the US government makes a
-- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at
-- [[w:Indian reservations]]).
"Indian reservations",
}},
-- Abbreviations and long forms (when possible) get display-canonicalized.
["US"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["U.S."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["USA"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["U.S.A."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["สหรัฐ"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true},
["Uruguay"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}},
["Uzbekistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}},
["Vanuatu"] = {container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true},
["Vatican City"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป",
-- We want the first placetype to be 'city-state' so the description of Vatican City says it's a city-state, but
-- we want its parent to be "countries in Europe".
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"},
addl_parents = {"Rome"}, is_city = true, british_spelling = true},
["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the"
["Venezuela"] = {container = "อเมริกาใต้", divs = {"รัฐ", "เทศบาล"}},
["เวียดนาม"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "เทศบาล"}},
["Western Sahara"] = {placetype = {"ดินแดน", "ประเทศ"}, container = "แอฟริกา",
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"},
},
-- Not display-canonicalizable, both because of the difference in 'the' and because of the sovereignty dispute over
-- Western Sahara.
["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true},
["SADR"] = {alias_of = "Sahrawi Arab Democratic Republic", display = true, the = true},
["Yemen"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}},
["Zambia"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
["Zimbabwe"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true},
}
-- Canonicalize the `container` value of a country entry. A non-string value is assumed to be already canonicalized
-- and is returned unchanged; a string must be a key in `export.continents` and is expanded into a table holding that
-- key and the continent's placetype. Any other string triggers an internal error.
local function canonicalize_continent_container(key)
if type(key) ~= "string" then
return key
end
if export.continents[key] then
return {key = key, placetype = export.continents[key].placetype}
end
internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end
export.countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"+++", "ประเทศ"},
default_placetype = "ประเทศ",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.countries,
}
-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
-- British Overseas Territory
["Akrotiri and Dhekelia"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"ไซปรัส", "ยุโรป", "เอเชีย"},
british_spelling = true,
},
-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
-- [[w:List of sovereign states and dependent territories by continent]].
-- unincorporated territory of the United States
["American Samoa"] = {
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"พอลินีเชีย"},
},
-- British Overseas Territory
["Anguilla"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["Abkhazia"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"จอร์เจีย", "ยุโรป", "เอเชีย"},
divs = {"อำเภอ"},
keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- Australian external territory
["Ashmore and Cartier Islands"] = {
the = true,
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"เอเชีย"},
},
-- constituent country of the Netherlands
["Aruba"] = {
placetype = {"constituent country", "ประเทศ"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- British Overseas Territory
["Bermuda"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"อเมริกาเหนือ"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Bonaire"] = {
placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
is_city = true,
british_spelling = true,
},
-- British Overseas Territory
["British Indian Ocean Territory"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"เอเชีย"},
british_spelling = true,
},
-- British Overseas Territory
["British Virgin Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- Norwegian dependent territory
["Bouvet Island"] = {
placetype = {"dependent territory", "ดินแดน"},
container = "Norway",
addl_parents = {"แอฟริกา"},
british_spelling = true,
},
-- British Overseas Territory
["Cayman Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- Australian external territory
["Christmas Island"] = {
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"เอเชีย"},
british_spelling = true,
},
-- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the
-- French Southern and Antarctic Lands.
["Clipperton Island"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "ฝรั่งเศส",
addl_parents = {"อเมริกาเหนือ"},
},
-- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands
["Cocos Islands"] = {
the = true,
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"เอเชีย"},
wp = "Cocos (Keeling) Islands",
british_spelling = true,
},
["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
-- self-governing but in free association with New Zealand
["Cook Islands"] = {
the = true,
placetype = {"ประเทศ"},
container = "New Zealand",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- constituent country of the Netherlands
["Curaçao"] = {
placetype = {"constituent country", "ประเทศ"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- special territory of Chile
["Easter Island"] = {
placetype = {"special territory", "ดินแดน"},
container = "ชิลี",
addl_parents = {"พอลินีเชีย"},
},
-- British Overseas Territory
["Falkland Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"อเมริกาใต้"},
british_spelling = true,
},
-- autonomous territory of Denmark
["Faroe Islands"] = {
the = true,
placetype = {"autonomous territory", "ดินแดน"},
container = "เดนมาร์ก",
addl_parents = {"ยุโรป"},
british_spelling = true,
},
-- overseas department and region of France
["French Guiana"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"อเมริกาใต้"},
british_spelling = true,
},
-- overseas collectivity of France
["French Polynesia"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- French overseas territory
["French Southern and Antarctic Lands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "ฝรั่งเศส",
addl_parents = {"แอฟริกา"},
},
-- British Overseas Territory
["Gibraltar"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"ยุโรป"},
is_city = true,
british_spelling = true,
},
-- autonomous territory of Denmark
["Greenland"] = {
placetype = {"autonomous territory", "ดินแดน"},
container = "เดนมาร์ก",
addl_parents = {"อเมริกาเหนือ"},
divs = {"เทศบาล"},
british_spelling = true,
},
-- overseas department and region of France
["Guadeloupe"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
addl_parents = {"แคริบเบียน"},
divs = {"communes"},
british_spelling = true,
},
-- unincorporated territory of the United States
["Guam"] = {
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"ไมโครนีเชีย"},
},
-- self-governing British Crown dependency; technically called the Bailiwick of Guernsey
["Guernsey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"British Isles", "ยุโรป"},
british_spelling = true,
wp = "Bailiwick of %l",
},
["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true},
-- Australian external territory
["Heard Island and McDonald Islands"] = {
the = true,
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"แอฟริกา"},
},
-- special administrative region of China
["Hong Kong"] = {
placetype = {"special administrative region", "นคร"},
container = "จีน",
is_city = true,
british_spelling = true,
},
-- self-governing British Crown dependency
["Isle of Man"] = {
the = true,
placetype = {"crown dependency", "dependency", "dependent territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"British Isles", "ยุโรป"},
british_spelling = true,
},
-- Norwegian unincorporated area
["Jan Mayen"] = {
placetype = {"unincorporated area", "dependent territory", "ดินแดน", "เกาะ"},
container = "Norway",
addl_parents = {"ยุโรป"},
british_spelling = true,
},
-- self-governing British Crown dependency; technically called the Bailiwick of Jersey
["Jersey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"British Isles", "ยุโรป"},
british_spelling = true,
},
["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true},
-- special administrative region of China
["Macau"] = {
placetype = {"special administrative region", "นคร"},
container = "จีน",
is_city = true,
british_spelling = true,
},
-- overseas department and region of France
["Martinique"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- overseas department and region of France
["Mayotte"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"แอฟริกา"},
british_spelling = true,
},
-- British Overseas Territory
["Montserrat"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- special collectivity of France
["New Caledonia"] = {
placetype = {"special collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"เมลานีเชีย"},
british_spelling = true,
},
-- dependent territory of New Zealand
["New Zealand Subantarctic Islands"] = {
the = true,
placetype = {"dependent territory", "ดินแดน"},
container = "New Zealand",
addl_parents = {"แอนตาร์กติกา"},
british_spelling = true,
},
-- self-governing but in free association with New Zealand
["Niue"] = {
placetype = {"ประเทศ"},
container = "New Zealand",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- Australian external territory
["Norfolk Island"] = {
placetype = {"external territory", "ดินแดน"},
container = "ออสเตรเลีย",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Cyprus
["Northern Cyprus"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"ไซปรัส", "Turkey", "ยุโรป", "เอเชีย"},
divs = {"อำเภอ"},
keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]",
british_spelling = true,
},
-- commonwealth, unincorporated territory of the United States
["Northern Mariana Islands"] = {
the = true,
placetype = {"commonwealth", "unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"ไมโครนีเชีย"},
},
-- British Overseas Territory
["Pitcairn Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- commonwealth of the United States
["Puerto Rico"] = {
placetype = {"commonwealth", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"แคริบเบียน"},
divs = {"เทศบาล"},
},
-- overseas department and region of France
["Réunion"] = {
placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"แอฟริกา"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Saba"] = {
placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
is_city = true,
british_spelling = true,
},
-- overseas collectivity of France
["Saint Barthélemy"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- British Overseas Territory
["Saint Helena, Ascension and Tristan da Cunha"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
divs = {{type = "constituent parts", container_parent_type = false}},
addl_parents = {"มหาสมุทรแอตแลนติก", "แอฟริกา"},
british_spelling = true,
},
-- constituent parts of the combined overseas territory
["Ascension Island"] = {
placetype = {"constituent part", "ดินแดน", "เกาะ"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"มหาสมุทรแอตแลนติก"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Saint Helena"] = {
placetype = {"constituent part", "ดินแดน", "เกาะ"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"มหาสมุทรแอตแลนติก"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Tristan da Cunha"] = {
placetype = {"constituent part", "ดินแดน", "archipelago"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"มหาสมุทรแอตแลนติก"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
-- overseas collectivity of France
["Saint Martin"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- overseas collectivity of France
["Saint Pierre and Miquelon"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
divs = {"communes"},
addl_parents = {"อเมริกาเหนือ"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Sint Eustatius"] = {
placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
is_city = true,
british_spelling = true,
},
-- constituent country of the Netherlands
["Sint Maarten"] = {
placetype = {"constituent country", "ประเทศ"},
container = "เนเธอร์แลนด์",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Somalia
["Somaliland"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"โซมาเลีย", "แอฟริกา"},
keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]",
british_spelling = true,
},
-- British Overseas Territory
-- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for
-- "Saint Helena, Ascension and Tristan da Cunha".
["South Georgia"] = {
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"มหาสมุทรแอตแลนติก"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["South Ossetia"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"Georgia", "ยุโรป", "เอเชีย"},
keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- British Overseas Territory
["South Sandwich Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"มหาสมุทรแอตแลนติก"},
wp = true,
wpcat = "South Georgia and the South Sandwich Islands",
british_spelling = true,
},
-- Norwegian unincorporated area
["Svalbard"] = {
placetype = {"unincorporated area", "dependent territory", "ดินแดน", "archipelago"},
container = "Norway",
addl_parents = {"ยุโรป"},
british_spelling = true,
},
-- dependent territory of New Zealand
["Tokelau"] = {
placetype = {"dependent territory", "ดินแดน"},
container = "New Zealand",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Moldova
["Transnistria"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"Moldova", "ยุโรป"},
keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]",
british_spelling = true,
},
-- British Overseas Territory
["Turks and Caicos Islands"] = {
the = true,
placetype = {"overseas territory", "ดินแดน"},
container = "สหราชอาณาจักร",
addl_parents = {"แคริบเบียน"},
british_spelling = true,
},
-- unincorporated territory of the United States
["United States Minor Outlying Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"เกาะ", "ไมโครนีเชีย", "พอลินีเชีย", "แคริบเบียน"},
},
-- FIXME: We should add entries for the other minor outlying islands.
-- Baker Island (Oceania)
-- Howland Island (Oceania)
-- Jarvis Island (Oceania)
-- Johnston Atoll (Oceania)
-- Kingman Reef (Oceania)
-- Midway Atoll (Oceania)
-- Navassa Island (Caribbean)
-- Palmyra Atoll (Oceania)
-- Wake Island (Oceania)
["Wake Island"] = {
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"ไมโครนีเชีย"},
},
-- unincorporated territory of the United States
["United States Virgin Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "ดินแดน"},
container = "สหรัฐอเมริกา",
addl_parents = {"แคริบเบียน"},
},
["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
-- overseas collectivity of France
["Wallis and Futuna"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "ฝรั่งเศส",
addl_parents = {"พอลินีเชีย"},
british_spelling = true,
},
}
export.country_like_entities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Saint Helena, Ascension and Tristan da Cunha".
key_to_placename = false,
placename_to_key = false,
canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"),
default_overriding_bare_label_parents = {"country-like entities"},
default_no_container_cat = true,
default_no_container_parent = true,
-- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas
-- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village
-- in Europe.
default_no_auto_augment_container = true,
data = export.country_like_entities,
}
-- Former countries and such; we don't create "Cities in ..." categories because they don't exist anymore
export.former_countries = {
-- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan
-- (also known as Nagorno-Karabakh)
-- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out.
["Artsakh"] = {
placetype = {"unrecognized country", "ประเทศ"},
addl_parents = {"อาเซอร์ไบจาน", "ยุโรป", "เอเชีย"},
keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]",
british_spelling = true,
},
["Nagorno-Karabakh"] = {alias_of = "Artsakh"},
["Czechoslovakia"] = {container = "ยุโรป", british_spelling = true},
["East Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true},
["เวียดนามเหนือ"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}},
["เปอร์เซีย"] = {placetype = {"จักรวรรดิ", "ประเทศ"}, container = "เอเชีย", divs = {"จังหวัด"}},
["Byzantine Empire"] = {
the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"},
addl_parents = {"Ancient Europe", "Ancient Near East"},
divs = {
"จังหวัด", "themes",
}},
["Roman Empire"] = {
the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"}, addl_parents = {"Rome"},
divs = {
"จังหวัด",
{type = "FORMER provinces", cat_as = "จังหวัด"},
}},
["เวียดนามใต้"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}},
["Soviet Union"] = {
the = true, container = {"ยุโรป", "เอเชีย"}, divs = {"republics", "autonomous republics"},
british_spelling = true},
["West Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true},
["Yugoslavia"] = {container = "ยุโรป", divs = {"อำเภอ"},
keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true},
}
export.former_countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"former countries and country-like entities"},
default_is_former_place = true,
default_placetype = "ประเทศ",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.former_countries,
}
-----------------------------------------------------------------------------------
-- Subpolity tables --
-----------------------------------------------------------------------------------
export.australia_states_and_territories = {
["Australian Capital Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"},
["Jervis Bay Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"},
["New South Wales, ออสเตรเลีย"] = {},
["Northern Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"},
["Queensland, ออสเตรเลีย"] = {},
["South Australia, ออสเตรเลีย"] = {},
["Tasmania, ออสเตรเลีย"] = {},
["Victoria, ออสเตรเลีย"] = {},
["Western Australia, ออสเตรเลีย"] = {},
}
-- states and territories of Australia
export.australia_group = {
default_container = "ออสเตรเลีย",
default_placetype = "รัฐ",
default_divs = "local government areas",
data = export.australia_states_and_territories,
}
export.austria_states = {
["Vienna, ออสเตรีย"] = {},
["Lower Austria, ออสเตรีย"] = {},
["Upper Austria, ออสเตรีย"] = {},
["Styria, ออสเตรีย"] = {},
["Tyrol, ออสเตรีย"] = {wp = "Tyrol (รัฐ)"},
["Carinthia, ออสเตรีย"] = {},
["Salzburg, ออสเตรีย"] = {wp = "Salzburg (รัฐ)"},
["Vorarlberg, ออสเตรีย"] = {},
["Burgenland, ออสเตรีย"] = {},
}
-- states of Austria
export.austria_group = {
default_container = "ออสเตรีย",
default_placetype = "รัฐ",
default_divs = "เทศบาล",
data = export.austria_states,
}
export.bangladesh_divisions = {
["Barisal Division, บังกลาเทศ"] = {},
["Chittagong Division, บังกลาเทศ"] = {},
["Dhaka Division, บังกลาเทศ"] = {},
["Khulna Division, บังกลาเทศ"] = {},
["Mymensingh Division, บังกลาเทศ"] = {},
["Rajshahi Division, บังกลาเทศ"] = {},
["Rangpur Division, บังกลาเทศ"] = {},
["Sylhet Division, บังกลาเทศ"] = {},
}
-- divisions of Bangladesh
export.bangladesh_group = {
key_to_placename = make_key_to_placename(", บังกลาเทศ$", " Division$"),
placename_to_key = make_placename_to_key(", บังกลาเทศ", " Division"),
default_container = "บังกลาเทศ",
default_placetype = "division",
default_divs = "อำเภอ",
data = export.bangladesh_divisions,
}
export.brazil_states = {
["Acre, บราซิล"] = {wp = "%l (รัฐ)"},
["Alagoas, บราซิล"] = {},
["Amapá, บราซิล"] = {},
["Amazonas, บราซิล"] = {wp = "%l (Brazilian state)"},
["Bahia, บราซิล"] = {},
["Ceará, บราซิล"] = {},
["Distrito Federal, บราซิล"] = {wp = "Federal District (Brazil)"},
["Espírito Santo, บราซิล"] = {},
["Goiás, บราซิล"] = {},
["Maranhão, บราซิล"] = {},
["Mato Grosso, บราซิล"] = {},
["Mato Grosso do Sul, บราซิล"] = {},
["Minas Gerais, บราซิล"] = {},
["Pará, บราซิล"] = {},
["Paraíba, บราซิล"] = {},
["Paraná, บราซิล"] = {wp = "%l (รัฐ)"},
["Pernambuco, บราซิล"] = {},
["Piauí, บราซิล"] = {},
["Rio de Janeiro, บราซิล"] = {wp = "%l (รัฐ)"},
["Rio Grande do Norte, บราซิล"] = {},
["Rio Grande do Sul, บราซิล"] = {},
["Rondônia, บราซิล"] = {},
["Roraima, บราซิล"] = {},
["Santa Catarina, บราซิล"] = {wp = "%l (รัฐ)"},
["São Paulo, บราซิล"] = {wp = "%l (รัฐ)"},
["Sergipe, บราซิล"] = {},
["Tocantins, บราซิล"] = {},
}
-- states of Brazil
export.brazil_group = {
default_container = "บราซิล",
default_placetype = "รัฐ",
default_divs = "เทศบาล",
data = export.brazil_states,
}
export.canada_provinces_and_territories = {
["Alberta, แคนาดา"] = {divs = {
{type = "municipal districts", container_parent_type = "rural municipalities"},
}},
["British Columbia, แคนาดา"] = {divs = {
{type = "regional districts", container_parent_type = false},
"regional municipalities",
}},
["Manitoba, แคนาดา"] = {divs = {"rural municipalities"}},
["New Brunswick, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
["Newfoundland and Labrador, แคนาดา"] = {},
["Northwest Territories, แคนาดา"] = {the = true, placetype = "ดินแดน"},
["Nova Scotia, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities"}},
["Nunavut, แคนาดา"] = {placetype = "ดินแดน"},
["Ontario, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities", {type = "townships", prep = "ใน"}}},
["Prince Edward Island, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", "rural municipalities"}},
["Saskatchewan, แคนาดา"] = {divs = {"rural municipalities"}},
["Quebec, แคนาดา"] = {divs = {
"เทศมณฑล",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
-- administrative regions have an official (but non-governmental) function but there don't appear to be any
-- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping
{type = "ภูมิภาค", container_parent_type = false},
{type = "townships", prep = "ใน"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}},
}},
["Yukon, แคนาดา"] = {placetype = "ดินแดน"},
["Yukon Territory, แคนาดา"] = {alias_of = "Yukon, แคนาดา", the = true},
}
-- provinces and territories of Canada
export.canada_group = {
default_container = "แคนาดา",
default_placetype = "รัฐ", -- following thwiki (Thai Wikipedia) usage
data = export.canada_provinces_and_territories,
}
export.china_provinces_and_autonomous_regions = {
-- direct-administered municipalities are not here but below under prefecture-level cities
["Anhui, จีน"] = {},
["Fujian, จีน"] = {},
["Fuchien, จีน"] = {alias_of = "Fujian, จีน", display = true},
["Gansu, จีน"] = {},
["Guangdong, จีน"] = {},
["Guangxi, จีน"] = {placetype = "autonomous region"},
["Guizhou, จีน"] = {},
["Hainan, จีน"] = {},
["Hebei, จีน"] = {},
["Heilongjiang, จีน"] = {},
["Henan, จีน"] = {},
["Hubei, จีน"] = {},
["Hunan, จีน"] = {},
["Inner Mongolia, จีน"] = {placetype = "autonomous region"},
["Jiangsu, จีน"] = {},
["Jiangxi, จีน"] = {},
["Jilin, จีน"] = {},
["Liaoning, จีน"] = {},
["Ningxia, จีน"] = {placetype = "autonomous region"},
["Qinghai, จีน"] = {},
["Shaanxi, จีน"] = {},
["Shandong, จีน"] = {},
["Shanxi, จีน"] = {},
["Sichuan, จีน"] = {},
["Tibet, จีน"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"},
["Xinjiang, จีน"] = {placetype = "autonomous region"},
["Yunnan, จีน"] = {},
["Zhejiang, จีน"] = {},
}
-- provinces and autonomous regions of China
export.china_group = {
default_container = "จีน",
default_placetype = "มณฑล",
default_divs = {
"จังหวัด", "prefecture-level cities",
"อำเภอ", "ตำบล", "townships",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_provinces_and_autonomous_regions,
}
export.china_prefecture_level_cities = {
-- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an
-- administrative unit smaller than a province but bigger than a county, which is administratively controlled by
-- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior
-- to the mid-1980s, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the
-- western portion of China) have not yet been converted. Generally a given province is entirely tiled by
-- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se.
-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, again which bears
-- the same name as the county-level city).
--
-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
-- province level.
--
-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
-- metro area separated by suburban/exurban or rural land.
-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
-- a lot of obscure cities.
--
-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
-- was >= 1 million. These are sorted below by the urban agglomeration value (which is generally of the "adm-urb" =
-- "administrative area (urban population)" type), which sometimes groups nearby cities into a single agglomeration.
-- The most notable case is the Pearl River Delta, grouped under Guangzhou with an agglomeration population of
-- 72,700,000 that includes a large number of nearby large cities (although for some reason not Hong Kong, maybe
-- due to the administrative issues involved). In addition, citypopulation.de includes divisions
-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
-- subdistricts, towns and townships.
["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Zhongshan"] = {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shanghai"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de
["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: Not to be confused with Cangzhou in Hebei
["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants
["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Beijing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de
["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de
["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de
["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de
["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration
["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de
["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration
["Chongqing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de
["Wuhan"] = {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de
["Tianjin"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de
["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de
-- Changsha County -- 1.024 urban per citypopulation.de
["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration
["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de
["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de
["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de
["Fushun"] = {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration
["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de
["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de
["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de
["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de
["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
-- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria
["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de
-- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core).
["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration
["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de
-- includes Láiwú city
["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de
-- includes Xīnjí city
["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de
["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de
["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de
["Nanning"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de
["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de
["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de
["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de
["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de
["Ürümqi"] = {container = {key = "Xinjiang, จีน", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de
["Urumqi"] = {alias_of = "Ürümqi", display = true},
["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de
["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de
["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de
["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de
["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de
["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de
["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de
["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de
["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures
["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de
["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de
["Hohhot"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de
["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de
["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de
["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de
["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de
["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de
["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de
["Taizhou"] = {alias_of = "Taizhou, Zhejiang"},
["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de
["Yinchuan"] = {container = {key = "Ningxia, จีน", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de
["Liuzhou"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de
["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de
["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de
["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de
["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de
-- includes Dìngzhōu city and Xióngān Xīnqū
["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de
["Baotou"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de
["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de
["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de
["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de
["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de
["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de
["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de
["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de
["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de
["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de
["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de
["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de
["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de
["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de
["Guilin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de
["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de
["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de
["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de
["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de
["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de
["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de
["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de
["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de
["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de
["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de
["Jilin"] = {alias_of = "Jilin City"},
["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de
["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de
["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de
["Yulin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de
["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de
["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de
-- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash
["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de
["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de
["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de
["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de
["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de
["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de
["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de
["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de
["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de
["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de
["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de
["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de
["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de
["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de
["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de
["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de
["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de
["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de
["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de
["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de
["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de
["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de
["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de
["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de
["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de
["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de
-- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper.
["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"ตำบล", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de
["Ulanhad"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de
["Chifeng"] = {alias_of = "Ulanhad"},
["Ulankhad"] = {alias_of = "Ulanhad", display = true},
["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de
["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de
["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de
["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de
-- Shuyang is a "เทศมณฑล" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core).
-- The county itself is 37 miles by 34 miles.
["Shuyang"] = {placetype = "เทศมณฑล", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de
-- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core).
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de
["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de
["Beihai"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de
["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de
["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de
["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de
["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de
["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de
["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de
["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de
["Guigang"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de
-- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core).
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de
-- NOTE: Not to be confused with Changzhou in Jiangsu
["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de
["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de
["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de
["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de
["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de
-- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core).
["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de
-- 3 extra cities that were added in earlier versions of this list and aren't found on the "major agglomerations of the world" page (https://citypopulation.de/en/world/agglomerations/, reference date 2025-01-01)
["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de
["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de
["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de
}
export.china_prefecture_level_cities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Zhejiang" or "Suzhou, Anhui".
key_to_placename = false,
placename_to_key = false, -- don't add ", จีน" to make the key
default_container = "จีน",
canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "นคร"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"อำเภอ", "ตำบล", "townships",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities,
}
-- Separate table needed to avoid key clashes: there are two prefecture-level cities named Taizhou and two named Suzhou.
export.china_prefecture_level_cities_2 = {
-- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang.
["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census
["Taizhou"] = {alias_of = "Taizhou, Jiangsu"},
-- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu.
["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census
-- hopefully this will work because we also have Suzhou as a key by itself for the larger, better-known Suzhou in Jiangsu
["Suzhou"] = {alias_of = "Suzhou, Anhui"},
}
export.china_prefecture_level_cities_group_2 = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Jiangsu" or "Suzhou, Anhui".
key_to_placename = false,
placename_to_key = false, -- don't add ", จีน" to make the key
default_container = "จีน",
canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "นคร"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"อำเภอ", "ตำบล", "townships",
{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities_2,
}
export.finland_regions = {
["Lapland, ฟินแลนด์"] = {wp = "%l (%c)"},
["North Ostrobothnia, ฟินแลนด์"] = {},
["Northern Ostrobothnia, ฟินแลนด์"] = {alias_of = "North Ostrobothnia, ฟินแลนด์", display = true},
["Kainuu, ฟินแลนด์"] = {},
["North Karelia, ฟินแลนด์"] = {},
["Northern Savonia, ฟินแลนด์"] = {},
["North Savo, ฟินแลนด์"] = {alias_of = "Northern Savonia, ฟินแลนด์", display = true},
["Southern Savonia, ฟินแลนด์"] = {},
["South Savo, ฟินแลนด์"] = {alias_of = "Southern Savonia, ฟินแลนด์", display = true},
["South Karelia, ฟินแลนด์"] = {},
["Central Finland, ฟินแลนด์"] = {},
["South Ostrobothnia, ฟินแลนด์"] = {},
["Southern Ostrobothnia, ฟินแลนด์"] = {alias_of = "South Ostrobothnia, ฟินแลนด์", display = true},
["Ostrobothnia, ฟินแลนด์"] = {wp = "%l (ภูมิภาค)"},
["Central Ostrobothnia, ฟินแลนด์"] = {},
["Pirkanmaa, ฟินแลนด์"] = {},
["Satakunta, ฟินแลนด์"] = {},
["Päijänne Tavastia, ฟินแลนด์"] = {},
["Päijät-Häme, ฟินแลนด์"] = {alias_of = "Päijänne Tavastia, ฟินแลนด์", display = true},
["Tavastia Proper, ฟินแลนด์"] = {},
["Kanta-Häme, ฟินแลนด์"] = {alias_of = "Tavastia Proper, ฟินแลนด์", display = true},
["Kymenlaakso, ฟินแลนด์"] = {},
["Uusimaa, ฟินแลนด์"] = {},
["Southwest Finland, ฟินแลนด์"] = {},
["Åland Islands, ฟินแลนด์"] = {the = true, wp = "Åland"},
["Åland, ฟินแลนด์"] = {alias_of = "Åland Islands, ฟินแลนด์"}, -- differs in "the"
}
-- regions of Finland
export.finland_group = {
default_container = "ฟินแลนด์",
default_placetype = "ภูมิภาค",
default_divs = "เทศบาล",
data = export.finland_regions,
}
export.france_administrative_regions = {
["Auvergne-Rhône-Alpes, ฝรั่งเศส"] = {},
["Bourgogne-Franche-Comté, ฝรั่งเศส"] = {},
["Brittany, ฝรั่งเศส"] = {wp = "%l (administrative region)"},
["Centre-Val de Loire, ฝรั่งเศส"] = {},
["Corsica, ฝรั่งเศส"] = {},
-- overseas departments are handled in `export.country_like_entities`
-- ["French Guiana"] = {},
["Grand Est, ฝรั่งเศส"] = {},
-- ["Guadeloupe"] = {},
["Hauts-de-France, ฝรั่งเศส"] = {},
["Île-de-France, ฝรั่งเศส"] = {},
-- ["Martinique"] = {},
-- ["Mayotte"] = {},
["Normandy, ฝรั่งเศส"] = {wp = "%l (administrative region)"},
["Nouvelle-Aquitaine, ฝรั่งเศส"] = {},
["Occitania, ฝรั่งเศส"] = {wp = "%l (administrative region)"},
["Occitanie, ฝรั่งเศส"] = {alias_of = "Occitania, ฝรั่งเศส", display = true},
["Pays de la Loire, ฝรั่งเศส"] = {},
["Provence-Alpes-Côte d'Azur, ฝรั่งเศส"] = {},
-- ["Réunion"] = {},
}
-- administrative regions of France
export.france_group = {
default_container = "ฝรั่งเศส",
-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
-- to 'region').
default_placetype = "ภูมิภาค",
default_divs = {
"communes",
{type = "เทศบาล", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
},
data = export.france_administrative_regions,
}
export.france_departments = {
["Ain, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 01
["Aisne, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 02
["Allier, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 03
["Alpes-de-Haute-Provence, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04
["Hautes-Alpes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05
["Alpes-Maritimes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06
["Ardèche, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 07
["Ardennes, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 08
["Ariège, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 09
["Aube, ฝรั่งเศส"] = {container = "Grand Est"}, -- 10
["Aude, ฝรั่งเศส"] = {container = "Occitania"}, -- 11
["Aveyron, ฝรั่งเศส"] = {container = "Occitania"}, -- 12
["Bouches-du-Rhône, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13
["Calvados, ฝรั่งเศส"] = {container = "Normandy", wp = "%l (department)"}, -- 14
["Cantal, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 15
["Charente, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 16
["Charente-Maritime, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 17
["Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18
["Corrèze, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 19
["Corse-du-Sud, ฝรั่งเศส"] = {container = "Corsica"}, -- 2A
["Haute-Corse, ฝรั่งเศส"] = {container = "Corsica"}, -- 2B
["Côte-d'Or, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 21
["Côte d'Or, ฝรั่งเศส"] = {alias_of = "Côte-d'Or, ฝรั่งเศส", display = true},
["Côtes-d'Armor, ฝรั่งเศส"] = {container = "Brittany"}, -- 22
["Côtes d'Armor, ฝรั่งเศส"] = {alias_of = "Côtes-d'Armor, ฝรั่งเศส", display = true},
["Creuse, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 23
["Dordogne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 24
["Doubs, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 25
["Drôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 26
["Eure, ฝรั่งเศส"] = {container = "Normandy"}, -- 27
["Eure-et-Loir, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 28
["Finistère, ฝรั่งเศส"] = {container = "Brittany"}, -- 29
["Gard, ฝรั่งเศส"] = {container = "Occitania"}, -- 30
["Haute-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 31
["Gers, ฝรั่งเศส"] = {container = "Occitania"}, -- 32
["Gironde, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 33
["Hérault, ฝรั่งเศส"] = {container = "Occitania"}, -- 34
["Ille-et-Vilaine, ฝรั่งเศส"] = {container = "Brittany"}, -- 35
["Indre, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 36
["Indre-et-Loire, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 37
["Isère, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 38
["Jura, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39
["Landes, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40
["Loir-et-Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 41
["Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42
["Haute-Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 43
["Loire-Atlantique, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 44
["Loiret, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 45
["Lot, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 46
["Lot-et-Garonne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 47
["Lozère, ฝรั่งเศส"] = {container = "Occitania"}, -- 48
["Maine-et-Loire, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 49
["Manche, ฝรั่งเศส"] = {container = "Normandy"}, -- 50
["Marne, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 51
["Haute-Marne, ฝรั่งเศส"] = {container = "Grand Est"}, -- 52
["Mayenne, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 53
["Meurthe-et-Moselle, ฝรั่งเศส"] = {container = "Grand Est"}, -- 54
["Meuse, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 55
["Morbihan, ฝรั่งเศส"] = {container = "Brittany"}, -- 56
["Moselle, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 57
["Nièvre, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 58
["Nord, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59
["Oise, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 60
["Orne, ฝรั่งเศส"] = {container = "Normandy"}, -- 61
["Pas-de-Calais, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 62
["Puy-de-Dôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 63
["Pyrénées-Atlantiques, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 64
["Hautes-Pyrénées, ฝรั่งเศส"] = {container = "Occitania"}, -- 65
["Pyrénées-Orientales, ฝรั่งเศส"] = {container = "Occitania"}, -- 66
["Bas-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 67
["Haut-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 68
["Rhône, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D
["Metropolis of Lyon, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M
["Lyon Metropolis, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"},
["Lyon, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"},
["Haute-Saône, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 70
["Saône-et-Loire, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 71
["Sarthe, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 72
["Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 73
["Haute-Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 74
["Paris, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 75
["Seine-Maritime, ฝรั่งเศส"] = {container = "Normandy"}, -- 76
["Seine-et-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 77
["Yvelines, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 78
["Deux-Sèvres, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 79
["Somme, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80
["Tarn, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 81
["Tarn-et-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 82
["Var, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83
["Vaucluse, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84
["Vendée, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 85
["Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86
["Haute-Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 87
["Vosges, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 88
["Yonne, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 89
["Territoire de Belfort, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 90
["Essonne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 91
["Hauts-de-Seine, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 92
["Seine-Saint-Denis, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 93
["Val-de-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 94
["Val-d'Oise, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 95
--["Guadeloupe"] = {container = "Guadeloupe"}, -- 971
--["Martinique"] = {container = "Martinique"}, -- 972
--["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973
--["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974
--["Mayotte"] = {container = "Mayotte"}, -- 976
}
export.france_departments_group = {
placename_to_key = make_placename_to_key(", ฝรั่งเศส"),
canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"),
default_placetype = "department",
default_divs = {
"communes",
{type = "เทศบาล", cat_as = "communes"},
},
data = export.france_departments,
}
export.germany_states = {
["Baden-Württemberg, เยอรมนี"] = {},
["Bavaria, เยอรมนี"] = {},
-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
-- the default_divs setting. Better not to include them at all since they're included as cities down below.
-- ["Berlin"] = {divs = {}},
["Brandenburg, เยอรมนี"] = {},
-- ["Bremen"] = {divs = {}},
-- ["Hamburg"] = {divs = {}},
["Hesse, เยอรมนี"] = {},
["Lower Saxony, เยอรมนี"] = {},
["Mecklenburg-Vorpommern, เยอรมนี"] = {},
["Mecklenburg-Western Pomerania, เยอรมนี"] = {alias_of = "Mecklenburg-Vorpommern, เยอรมนี", display = true},
["North Rhine-Westphalia, เยอรมนี"] = {},
["Rhineland-Palatinate, เยอรมนี"] = {},
["Saarland, เยอรมนี"] = {},
["Saxony, เยอรมนี"] = {},
["Saxony-Anhalt, เยอรมนี"] = {},
["Schleswig-Holstein, เยอรมนี"] = {},
["Thuringia, เยอรมนี"] = {},
}
-- states of Germany
export.germany_group = {
default_container = "เยอรมนี",
default_placetype = "รัฐ",
default_divs = {"อำเภอ", "เทศบาล"},
data = export.germany_states,
}
export.greece_regions = {
["Attica, กรีซ"] = {wp = "%l (ภูมิภาค)"},
["Central Greece, กรีซ"] = {wp = "%l (administrative region)"},
["Central Macedonia, กรีซ"] = {},
["Crete, กรีซ"] = {},
["Eastern Macedonia and Thrace, กรีซ"] = {},
["Epirus, กรีซ"] = {wp = "%l (ภูมิภาค)"},
["Ionian Islands, กรีซ"] = {the = true, wp = "%l (ภูมิภาค)"},
["North Aegean, กรีซ"] = {the = true},
-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
-- has "the" in it.
["Peloponnese, กรีซ"] = {wp = "%l (ภูมิภาค)"},
["South Aegean, กรีซ"] = {the = true},
["Thessaly, กรีซ"] = {},
["Western Greece, กรีซ"] = {},
["Western Macedonia, กรีซ"] = {},
["Mount Athos, กรีซ"] = {placetype = {"autonomous region", "ภูมิภาค"}, wp = "Monastic community of Mount Athos"},
}
-- regions of Greece
export.greece_group = {
default_container = "กรีซ",
default_placetype = "ภูมิภาค",
data = export.greece_regions,
}
local india_polity_with_divisions = {"divisions", "อำเภอ"}
local india_polity_without_divisions = {"อำเภอ"}
-- States and union territories of India. Only some of them are divided into divisions.
export.india_states_and_union_territories = {
["Andaman and Nicobar Islands, อินเดีย"] =
{the = true, placetype = "union territory", divs = india_polity_without_divisions},
["Andhra Pradesh, อินเดีย"] = {divs = india_polity_without_divisions},
["Arunachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Assam, อินเดีย"] = {divs = india_polity_with_divisions},
["Bihar, อินเดีย"] = {divs = india_polity_with_divisions},
["Chandigarh, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Chhattisgarh, อินเดีย"] = {divs = india_polity_with_divisions},
["Dadra and Nagar Haveli and Daman and Diu, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Delhi, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Goa, อินเดีย"] = {divs = india_polity_without_divisions},
["Gujarat, อินเดีย"] = {divs = india_polity_without_divisions},
["Haryana, อินเดีย"] = {divs = india_polity_with_divisions},
["Himachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Jammu and Kashmir, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions,
wp = "%l (union territory)"},
["Jharkhand, อินเดีย"] = {divs = india_polity_with_divisions},
["Karnataka, อินเดีย"] = {divs = india_polity_with_divisions},
["Kerala, อินเดีย"] = {divs = india_polity_without_divisions},
["Ladakh, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Lakshadweep, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Madhya Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Maharashtra, อินเดีย"] = {divs = india_polity_with_divisions},
["Manipur, อินเดีย"] = {divs = india_polity_without_divisions},
["Meghalaya, อินเดีย"] = {divs = india_polity_with_divisions},
["Mizoram, อินเดีย"] = {divs = india_polity_without_divisions},
["Nagaland, อินเดีย"] = {divs = india_polity_with_divisions},
["Odisha, อินเดีย"] = {divs = india_polity_with_divisions},
["Puducherry, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions,
wp = "%l (union territory)"},
["Pondicherry, อินเดีย"] = {alias_of = "Puducherry, อินเดีย", display = true},
["Punjab, อินเดีย"] = {divs = india_polity_with_divisions, wp = "%l, %c"},
["Rajasthan, อินเดีย"] = {divs = india_polity_with_divisions},
["Sikkim, อินเดีย"] = {divs = india_polity_without_divisions},
["Tamil Nadu, อินเดีย"] = {divs = india_polity_without_divisions},
["Telangana, อินเดีย"] = {divs = india_polity_without_divisions},
["Tripura, อินเดีย"] = {divs = india_polity_without_divisions},
["Uttar Pradesh, อินเดีย"] = {divs = india_polity_with_divisions},
["Uttarakhand, อินเดีย"] = {divs = india_polity_with_divisions},
["West Bengal, อินเดีย"] = {divs = india_polity_with_divisions},
}
-- states and union territories of India
export.india_group = {
default_container = "อินเดีย",
default_placetype = "รัฐ",
data = export.india_states_and_union_territories,
}
export.indonesia_provinces = {
["Aceh, อินโดนีเซีย"] = {},
["Bali, อินโดนีเซีย"] = {},
["Bangka Belitung Islands, อินโดนีเซีย"] = {the = true},
["Banten, อินโดนีเซีย"] = {},
["Bengkulu, อินโดนีเซีย"] = {},
["Central Java, อินโดนีเซีย"] = {},
["Central Kalimantan, อินโดนีเซีย"] = {},
["Central Papua, อินโดนีเซีย"] = {},
["Central Sulawesi, อินโดนีเซีย"] = {},
["East Java, อินโดนีเซีย"] = {},
["East Kalimantan, อินโดนีเซีย"] = {},
["East Nusa Tenggara, อินโดนีเซีย"] = {},
["Gorontalo, อินโดนีเซีย"] = {},
["Highland Papua, อินโดนีเซีย"] = {wp = "%l"},
["Special Capital Region of Jakarta, อินโดนีเซีย"] = {the = true, wp = "Jakarta"},
["Jakarta, อินโดนีเซีย"] = {alias_of = "Special Capital Region of Jakarta, อินโดนีเซีย"},
["Jambi, อินโดนีเซีย"] = {},
["Lampung, อินโดนีเซีย"] = {},
["Maluku, อินโดนีเซีย"] = {},
["North Kalimantan, อินโดนีเซีย"] = {},
["North Maluku, อินโดนีเซีย"] = {},
["North Sulawesi, อินโดนีเซีย"] = {},
["North Papua, อินโดนีเซีย"] = {},
["North Sumatra, อินโดนีเซีย"] = {},
["Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
["Riau, อินโดนีเซีย"] = {},
["Riau Islands, อินโดนีเซีย"] = {the = true},
["Southeast Sulawesi, อินโดนีเซีย"] = {},
["South Kalimantan, อินโดนีเซีย"] = {},
["South Papua, อินโดนีเซีย"] = {},
["South Sulawesi, อินโดนีเซีย"] = {},
["South Sumatra, อินโดนีเซีย"] = {},
["Southwest Papua, อินโดนีเซีย"] = {},
["West Java, อินโดนีเซีย"] = {},
["West Kalimantan, อินโดนีเซีย"] = {},
["West Nusa Tenggara, อินโดนีเซีย"] = {},
["West Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
["West Sulawesi, อินโดนีเซีย"] = {},
["West Sumatra, อินโดนีเซีย"] = {},
["Special Region of Yogyakarta, อินโดนีเซีย"] = {the = true},
["Yogyakarta, อินโดนีเซีย"] = {alias_of = "Special Region of Yogyakarta, อินโดนีเซีย"},
}
-- provinces of Indonesia
export.indonesia_group = {
default_container = "อินโดนีเซีย",
default_placetype = "จังหวัด",
-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
-- spellings.
data = export.indonesia_provinces,
}
export.iran_provinces = {
["Alborz, อิหร่าน"] = {}, -- abbreviation AL, capital [[w:Karaj]]
["Ardabil, อิหร่าน"] = {}, -- abbreviation AR, capital [[w:Ardabil]]
["Bushehr, อิหร่าน"] = {}, -- abbreviation BU, capital [[w:Bushehr]]
["Chaharmahal and Bakhtiari, อิหร่าน"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]]
["East Azerbaijan, อิหร่าน"] = {}, -- abbreviation EA, capital [[w:Tabriz]]
["Fars, อิหร่าน"] = {}, -- abbreviation FA, capital [[w:Shiraz]]
["Pars, อิหร่าน"] = {alias_of = "Fars, อิหร่าน", display = true},
["Gilan, อิหร่าน"] = {}, -- abbreviation GN, capital [[w:Rasht]]
["Golestan, อิหร่าน"] = {}, -- abbreviation GO, capital [[w:Gorgan]]
["Hamadan, อิหร่าน"] = {}, -- abbreviation HA, capital [[w:Hamadan]]
["Hormozgan, อิหร่าน"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]]
["Ilam, อิหร่าน"] = {}, -- abbreviation IL, capital [[w:Ilam, Iran|Ilam]]
["Isfahan, อิหร่าน"] = {}, -- abbreviation IS, capital [[w:Isfahan]]
["Kerman, อิหร่าน"] = {}, -- abbreviation KN, capital [[w:Kerman]]
["Kermanshah, อิหร่าน"] = {}, -- abbreviation KE, capital [[w:Kermanshah]]
["Khuzestan, อิหร่าน"] = {}, -- abbreviation KH, capital [[w:Ahvaz]]
["Kohgiluyeh and Boyer-Ahmad, อิหร่าน"] = {}, -- abbreviation KB, capital [[w:Yasuj]]
["Kurdistan, อิหร่าน"] = {}, -- abbreviation KU, capital [[w:Sanandaj]]
["Lorestan, อิหร่าน"] = {}, -- abbreviation LO, capital [[w:Khorramabad]]
["Markazi, อิหร่าน"] = {}, -- abbreviation MA, capital [[w:Arak, Iran|Arak]]
["Mazandaran, อิหร่าน"] = {}, -- abbreviation MN, capital [[w:Sari, Iran|Sari]]
["North Khorasan, อิหร่าน"] = {}, -- abbreviation NK, capital [[w:Bojnord]]
["Qazvin, อิหร่าน"] = {}, -- abbreviation QA, capital [[w:Qazvin]]
["Qom, อิหร่าน"] = {}, -- abbreviation QM, capital [[w:Qom]]
["Razavi Khorasan, อิหร่าน"] = {}, -- abbreviation RK, capital [[w:Mashhad]]
["Semnan, อิหร่าน"] = {}, -- abbreviation SE, capital [[w:Semnan, Iran|Semnan]]
["Sistan and Baluchestan, อิหร่าน"] = {}, -- abbreviation SB, capital [[w:Zahedan]]
["South Khorasan, อิหร่าน"] = {}, -- abbreviation SK, capital [[w:Birjand]]
["Tehran, อิหร่าน"] = {}, -- abbreviation TE, capital [[w:Tehran]]
["West Azerbaijan, อิหร่าน"] = {}, -- abbreviation WA, capital [[w:Urmia]]
["Yazd, อิหร่าน"] = {}, -- abbreviation YA, capital [[w:Yazd]]
["Zanjan, อิหร่าน"] = {}, -- abbreviation ZA, capital [[w:Zanjan, Iran|Zanjan]]
}
-- provinces of Iran
export.iran_group = {
key_to_placename = make_key_to_placename(", อิหร่าน$"),
placename_to_key = make_placename_to_key(", อิหร่าน"),
default_container = "อิหร่าน",
default_placetype = "จังหวัด",
-- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them
-- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]],
-- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].)
-- default_divs = "เทศมณฑล",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "จังหวัด%e",
data = export.iran_provinces,
}
export.ireland_counties = {
["County Carlow, ไอร์แลนด์"] = {},
["County Cavan, ไอร์แลนด์"] = {},
["County Clare, ไอร์แลนด์"] = {},
["County Cork, ไอร์แลนด์"] = {},
["County Donegal, ไอร์แลนด์"] = {},
["County Dublin, ไอร์แลนด์"] = {},
["County Galway, ไอร์แลนด์"] = {},
["County Kerry, ไอร์แลนด์"] = {},
["County Kildare, ไอร์แลนด์"] = {},
["County Kilkenny, ไอร์แลนด์"] = {},
["County Laois, ไอร์แลนด์"] = {},
["County Leitrim, ไอร์แลนด์"] = {},
["County Limerick, ไอร์แลนด์"] = {},
["County Longford, ไอร์แลนด์"] = {},
["County Louth, ไอร์แลนด์"] = {},
["County Mayo, ไอร์แลนด์"] = {},
["County Meath, ไอร์แลนด์"] = {},
["County Monaghan, ไอร์แลนด์"] = {},
["County Offaly, ไอร์แลนด์"] = {},
["County Roscommon, ไอร์แลนด์"] = {},
["County Sligo, ไอร์แลนด์"] = {},
["County Tipperary, ไอร์แลนด์"] = {},
["County Waterford, ไอร์แลนด์"] = {},
["County Westmeath, ไอร์แลนด์"] = {},
["County Wexford, ไอร์แลนด์"] = {},
["County Wicklow, ไอร์แลนด์"] = {},
}
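-- Build a key_to_placename function for Irish-style counties: strip the container suffix matching
-- `container_pattern`, then return both the full placename ("County Cork") and the elliptical form ("Cork").
local function make_irish_type_key_to_placename(container_pattern)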
return function(key)
key = key:gsub(container_pattern, "")
local elliptical_key = key:gsub("^County ", "")
return key, elliptical_key
end
end
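-- Build a placename_to_key function for Irish-style counties: prefix "County " unless the placename already begins
-- with "County " or "City ", then append `container_suffix`.
local function make_irish_type_placename_to_key(container_suffix)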
return function(placename)
if not placename:find("^County ") and not placename:find("^City ") then
placename = "County " .. placename
end
return placename .. container_suffix
end
end
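--[[
Illustrative sketch of how the two helpers above round-trip. The results are hand-traced from the code, and the
local names `to_name` and `to_key` are hypothetical (they appear nowhere else in this module):
	local to_name = make_irish_type_key_to_placename(", ไอร์แลนด์$")
	local to_key = make_irish_type_placename_to_key(", ไอร์แลนด์")
	to_name("County Cork, ไอร์แลนด์") --> "County Cork", "Cork"
	to_key("Cork")                    --> "County Cork, ไอร์แลนด์"
	to_key("County Cork")             --> "County Cork, ไอร์แลนด์"
]]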
-- counties of Ireland
export.ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", ไอร์แลนด์$"),
placename_to_key = make_irish_type_placename_to_key(", ไอร์แลนด์"),
default_container = "ไอร์แลนด์",
default_placetype = "เทศมณฑล",
data = export.ireland_counties,
}
export.italy_administrative_regions = {
["Abruzzo, Italy"] = {},
["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Apulia, Italy"] = {},
["Basilicata, Italy"] = {},
["Calabria, Italy"] = {},
["Campania, Italy"] = {},
["Emilia-Romagna, Italy"] = {},
["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Lazio, Italy"] = {},
["Liguria, Italy"] = {},
["Lombardy, Italy"] = {},
["Marche, Italy"] = {},
["Molise, Italy"] = {},
["Piedmont, Italy"] = {},
["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
["Tuscany, Italy"] = {},
["Umbria, Italy"] = {},
["Veneto, Italy"] = {},
}
-- administrative regions of Italy
export.italy_group = {
default_container = "อิตาลี",
default_placetype = "ภูมิภาค",
data = export.italy_administrative_regions,
}
-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = {
["ไอจิ, ญี่ปุ่น"] = {},
["อากิตะ, ญี่ปุ่น"] = {},
["อาโอโมริ, ญี่ปุ่น"] = {},
["จิบะ, ญี่ปุ่น"] = {},
["เอฮิเมะ, ญี่ปุ่น"] = {},
["ฟูกูอิ, ญี่ปุ่น"] = {},
["ฟูกูโอกะ, ญี่ปุ่น"] = {},
["ฟูกูชิมะ, ญี่ปุ่น"] = {},
["กิฟุ, ญี่ปุ่น"] = {},
["กุมมะ, ญี่ปุ่น"] = {},
["ฮิโรชิมะ, ญี่ปุ่น"] = {},
["ฮกไกโด, ญี่ปุ่น"] = {divs = "กิ่งจังหวัด", wp = "ฮกไกโด"},
["เฮียวโงะ, ญี่ปุ่น"] = {},
--["Hyogo, ญี่ปุ่น"] = {alias_of = "เฮียวโงะ, ญี่ปุ่น", display = true},
["อิบารากิ, ญี่ปุ่น"] = {},
["อิชิกาวะ, ญี่ปุ่น"] = {},
["อิวาเตะ, ญี่ปุ่น"] = {},
["คางาวะ, ญี่ปุ่น"] = {},
["คาโงชิมะ, ญี่ปุ่น"] = {},
["คานางาวะ, ญี่ปุ่น"] = {},
["โคจิ, ญี่ปุ่น"] = {},
--["Kochi, ญี่ปุ่น"] = {alias_of = "โคจิ, ญี่ปุ่น", display = true},
["คูมาโมโตะ, ญี่ปุ่น"] = {},
["เกียวโต, ญี่ปุ่น"] = {},
["มิเอะ, ญี่ปุ่น"] = {},
["มิยางิ, ญี่ปุ่น"] = {},
["มิยาซากิ, ญี่ปุ่น"] = {},
["นางาโนะ, ญี่ปุ่น"] = {},
["นางาซากิ, ญี่ปุ่น"] = {},
["นาระ, ญี่ปุ่น"] = {},
["นีงาตะ, ญี่ปุ่น"] = {},
["โออิตะ, ญี่ปุ่น"] = {},
--["Oita, ญี่ปุ่น"] = {alias_of = "โออิตะ, ญี่ปุ่น", display = true},
["โอกายามะ, ญี่ปุ่น"] = {},
["โอกินาวะ, ญี่ปุ่น"] = {},
["โอซากะ, ญี่ปุ่น"] = {},
["ซางะ, ญี่ปุ่น"] = {},
["ไซตามะ, ญี่ปุ่น"] = {},
["ชิงะ, ญี่ปุ่น"] = {},
["ชิมาเนะ, ญี่ปุ่น"] = {},
["ชิซูโอกะ, ญี่ปุ่น"] = {},
["โทจิงิ, ญี่ปุ่น"] = {},
["โทกูชิมะ, ญี่ปุ่น"] = {},
["ทตโตริ, ญี่ปุ่น"] = {},
["โทยามะ, ญี่ปุ่น"] = {},
["วากายามะ, ญี่ปุ่น"] = {},
["ยามางาตะ, ญี่ปุ่น"] = {},
["ยามางูจิ, ญี่ปุ่น"] = {},
["ยามานาชิ, ญี่ปุ่น"] = {},
}
-- prefectures of Japan
export.japan_group = {
key_to_placename = make_key_to_placename(", ญี่ปุ่น$"),
placename_to_key = make_placename_to_key(", ญี่ปุ่น"),
default_container = "ญี่ปุ่น",
default_placetype = "จังหวัด",
default_wp = "จังหวัด%e",
data = export.japan_prefectures,
}
export.laos_provinces = {
["Attapeu Province, Laos"] = {},
["Bokeo Province, Laos"] = {},
["Bolikhamxai Province, Laos"] = {},
["Champasak Province, Laos"] = {},
["Houaphanh Province, Laos"] = {},
["Khammouane Province, Laos"] = {},
["Luang Namtha Province, Laos"] = {},
["Luang Prabang Province, Laos"] = {},
["Oudomxay Province, Laos"] = {},
["Phongsaly Province, Laos"] = {},
["Salavan Province, Laos"] = {},
["Savannakhet Province, Laos"] = {},
["Vientiane Province, Laos"] = {},
["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
["Sainyabuli Province, Laos"] = {},
["Sekong Province, Laos"] = {},
["Xaisomboun Province, Laos"] = {},
["Xiangkhouang Province, Laos"] = {},
}
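-- Convert a Laos placename to its key. "Vientiane Prefecture" and names already ending in " Province" only get the
-- container appended; any other name is assumed to be a province and gets " Province" added first.
local function laos_placename_to_key(placename)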
if placename == "Vientiane Prefecture" then
return placename .. ", Laos"
end
if placename:find(" Province$") then
return placename .. ", Laos"
end
return placename .. " Province, Laos"
end
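--[[
Illustrative calls, hand-traced from laos_placename_to_key above (a sketch only; nothing here is executed):
	laos_placename_to_key("Champasak")            --> "Champasak Province, Laos"
	laos_placename_to_key("Champasak Province")   --> "Champasak Province, Laos"
	laos_placename_to_key("Vientiane")            --> "Vientiane Province, Laos"
	laos_placename_to_key("Vientiane Prefecture") --> "Vientiane Prefecture, Laos"
Note that a bare "Vientiane" resolves to the province, not the prefecture.
]]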
-- provinces of Laos
export.laos_group = {
key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
placename_to_key = laos_placename_to_key,
default_container = "Laos",
default_placetype = "จังหวัด",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.laos_provinces,
}
export.lebanon_governorates = {
["Akkar Governorate, Lebanon"] = {},
["Baalbek-Hermel Governorate, Lebanon"] = {},
["Beirut Governorate, Lebanon"] = {},
["Beqaa Governorate, Lebanon"] = {},
["Keserwan-Jbeil Governorate, Lebanon"] = {},
["Mount Lebanon Governorate, Lebanon"] = {},
["Nabatieh Governorate, Lebanon"] = {},
-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
-- `gov/South Governorate` with `c/Lebanon`.
["North Governorate, Lebanon"] = {no_auto_augment_container = true},
["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}
-- governorates of Lebanon
export.lebanon_group = {
key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
default_container = "Lebanon",
default_placetype = "governorate",
data = export.lebanon_governorates,
}
export.malaysia_states = {
["Johor, Malaysia"] = {},
["Kedah, Malaysia"] = {},
["Kelantan, Malaysia"] = {},
["Malacca, Malaysia"] = {},
["Negeri Sembilan, Malaysia"] = {},
["Pahang, Malaysia"] = {},
["Penang, Malaysia"] = {},
["Perak, Malaysia"] = {},
["Perlis, Malaysia"] = {},
["Sabah, Malaysia"] = {},
["Sarawak, Malaysia"] = {},
["Selangor, Malaysia"] = {},
["Terengganu, Malaysia"] = {},
}
-- states of Malaysia
export.malaysia_group = {
default_container = "Malaysia",
default_placetype = "รัฐ",
default_wp = "%l, %c",
data = export.malaysia_states,
}
export.malta_regions = {
-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
-- `r/Northern Region` with `c/Malta`. In particular:
-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
-- El Salvador;
-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
-- Serbia and Uganda;
-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
["Eastern Region, Malta"] = {no_auto_augment_container = true},
["Gozo Region, Malta"] = {wp = "%l"},
["Northern Region, Malta"] = {no_auto_augment_container = true},
["Port Region, Malta"] = {},
["Southern Region, Malta"] = {no_auto_augment_container = true},
["Western Region, Malta"] = {no_auto_augment_container = true},
}
-- regions of Malta
export.malta_group = {
key_to_placename = make_key_to_placename(", Malta$", " Region$"),
placename_to_key = make_placename_to_key(", Malta", " Region"),
default_container = "Malta",
default_placetype = "ภูมิภาค",
default_wp = "%l, %c",
default_the = true,
data = export.malta_regions,
}
export.mexico_states = {
["Aguascalientes, Mexico"] = {},
["Baja California, Mexico"] = {},
-- not display-canonicalizing because the "Norte" could be for emphasis
["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"},
["Baja California Sur, Mexico"] = {},
["Campeche, Mexico"] = {},
["Chiapas, Mexico"] = {},
["Chihuahua, Mexico"] = {wp = "%l (รัฐ)"},
["Coahuila, Mexico"] = {},
["Colima, Mexico"] = {},
["Durango, Mexico"] = {},
["Guanajuato, Mexico"] = {},
["Guerrero, Mexico"] = {},
["Hidalgo, Mexico"] = {wp = "%l (รัฐ)"},
["Jalisco, Mexico"] = {},
["State of Mexico, Mexico"] = {the = true},
["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
-- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city
["Michoacán, Mexico"] = {},
["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true},
["Morelos, Mexico"] = {},
["Nayarit, Mexico"] = {},
["Nuevo León, Mexico"] = {},
["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true},
["Oaxaca, Mexico"] = {},
["Puebla, Mexico"] = {},
["Querétaro, Mexico"] = {},
["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true},
["Quintana Roo, Mexico"] = {},
["San Luis Potosí, Mexico"] = {},
["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true},
["Sinaloa, Mexico"] = {},
["Sonora, Mexico"] = {},
["Tabasco, Mexico"] = {},
["Tamaulipas, Mexico"] = {},
["Tlaxcala, Mexico"] = {},
["Veracruz, Mexico"] = {},
["Yucatán, Mexico"] = {},
["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true},
["Zacatecas, Mexico"] = {},
}
-- Mexican states
export.mexico_group = {
default_container = "Mexico",
default_placetype = "รัฐ",
data = export.mexico_states,
}
export.moldova_districts_and_autonomous_territorial_units = {
["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]]
["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]]
["Briceni District, Moldova"] = {}, -- capital [[Briceni]]
["Cahul District, Moldova"] = {}, -- capital [[Cahul]]
["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]]
["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]]
["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]]
["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]]
["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]]
["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]]
["Drochia District, Moldova"] = {}, -- capital [[Drochia]]
["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]]
["Edineț District, Moldova"] = {}, -- capital [[Edineț]]
["Fălești District, Moldova"] = {}, -- capital [[Fălești]]
["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]]
["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]]
["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]]
["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]]
["Leova District, Moldova"] = {}, -- capital [[Leova]]
["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]]
["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]]
["Orhei District, Moldova"] = {}, -- capital [[Orhei]]
["Rezina District, Moldova"] = {}, -- capital [[Rezina]]
["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]]
["Sîngerei District, Moldova"] = {}, -- capital [[Sîngerei]]
["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
["Chișinău, Moldova"] = {placetype = "เทศบาล"},
["Bălți, Moldova"] = {placetype = "เทศบาล"},
["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Comrat]]
-- the remainder are under the de-facto control of the unrecognized state of Transnistria
["Bender, Moldova"] = {placetype = "เทศบาล"},
["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Tiraspol]]
["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}
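-- Convert a Moldovan placename to its key. If the placename plus ", Moldova" is already a key (municipalities,
-- autonomous units, aliases, full district names), return it unchanged; otherwise add " District" if it is missing.
local function moldova_placename_to_key(placename)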
local elliptical_key = placename .. ", Moldova"
if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
return elliptical_key
end
if placename:find(" District$") then
return placename .. ", Moldova"
end
return placename .. " District, Moldova"
end
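--[[
Illustrative calls, hand-traced from moldova_placename_to_key above against the table as defined in this module
(a sketch only; nothing here is executed):
	moldova_placename_to_key("Chișinău")       --> "Chișinău, Moldova" (existing key, returned as-is)
	moldova_placename_to_key("Tighina")        --> "Tighina, Moldova" (alias key, returned as-is)
	moldova_placename_to_key("Cahul")          --> "Cahul District, Moldova"
	moldova_placename_to_key("Cahul District") --> "Cahul District, Moldova"
The function does not resolve aliases itself; "Tighina, Moldova" is left for alias_of handling elsewhere.
]]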
-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
key_to_placename = make_key_to_placename(", Moldova$", " District$"),
placename_to_key = moldova_placename_to_key,
default_container = "Moldova",
default_placetype = {"district", "raion"},
default_divs = "communes",
data = export.moldova_districts_and_autonomous_territorial_units,
}
export.morocco_regions = {
["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
["Oriental, Morocco"] = {wp = "%l (%c)"},
["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
["Fez-Meknes, Morocco"] = {},
["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
["Casablanca-Settat, Morocco"] = {},
["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
["Souss-Massa, Morocco"] = {},
["Guelmim-Oued Noun, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]"
},
["Laayoune-Sakia El Hamra, Morocco"] = {
wp = "Laâyoune-Sakia El Hamra",
keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
},
["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
["Dakhla-Oued Ed-Dahab, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]",
},
}
-- regions of Morocco
export.morocco_group = {
default_container = "Morocco",
default_placetype = "ภูมิภาค",
data = export.morocco_regions,
}
export.egypt_governorates = {
["Cairo Governorate, Egypt"] = {},
["Giza Governorate, Egypt"] = {},
["Sharqia Governorate, Egypt"] = {},
["Dakahlia Governorate, Egypt"] = {},
["Beheira Governorate, Egypt"] = {},
["Minya Governorate, Egypt"] = {},
["Qalyubia Governorate, Egypt"] = {},
["Sohag Governorate, Egypt"] = {},
["Alexandria Governorate, Egypt"] = {},
["Gharbia Governorate, Egypt"] = {},
["Asyut Governorate, Egypt"] = {},
["Monufia Governorate, Egypt"] = {},
["Faiyum Governorate, Egypt"] = {},
["Kafr El Sheikh Governorate, Egypt"] = {},
["Qena Governorate, Egypt"] = {},
["Beni Suef Governorate, Egypt"] = {},
["Damietta Governorate, Egypt"] = {},
["Aswan Governorate, Egypt"] = {},
["Ismailia Governorate, Egypt"] = {},
["Luxor Governorate, Egypt"] = {},
["Suez Governorate, Egypt"] = {},
["Port Said Governorate, Egypt"] = {},
["Matrouh Governorate, Egypt"] = {},
["North Sinai Governorate, Egypt"] = {},
["Red Sea Governorate, Egypt"] = {},
["New Valley Governorate, Egypt"] = {},
["South Sinai Governorate, Egypt"] = {},
}
-- governorates of Egypt
export.egypt_group = {
key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"),
placename_to_key = make_placename_to_key(", Egypt", " Governorate"),
default_container = "อียิปต์",
default_placetype = "governorate",
data = export.egypt_governorates,
}
export.netherlands_provinces = {
["Drenthe, Netherlands"] = {},
["Flevoland, Netherlands"] = {},
["Friesland, Netherlands"] = {},
["Gelderland, Netherlands"] = {},
["Groningen, Netherlands"] = {wp = "%l (จังหวัด)"},
["Limburg, Netherlands"] = {wp = "%l (%c)"},
["North Brabant, Netherlands"] = {},
-- Foreign forms get display-canonicalized.
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true},
["North Holland, Netherlands"] = {},
["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true},
["Overijssel, Netherlands"] = {},
["South Holland, Netherlands"] = {},
["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true},
["Utrecht, Netherlands"] = {wp = "%l (จังหวัด)"},
["Zeeland, Netherlands"] = {},
}
-- provinces of the Netherlands
export.netherlands_group = {
default_container = "เนเธอร์แลนด์",
default_placetype = "จังหวัด",
default_divs = "เทศบาล",
data = export.netherlands_provinces,
}
export.new_zealand_regions = {
-- North Island regions
["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]]
["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]]
["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]]
["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]]
["Gisborne, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]]
["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]]
["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]]
["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]]
["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]]
-- South Island regions
["Tasman, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]]
["Nelson, New Zealand"] = {placetype = {"ภูมิภาค", "นคร"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]]
["Marlborough, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]]
["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]]
["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]]
["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]]
["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]]
}
-- regions of New Zealand
export.new_zealand_group = {
default_container = "New Zealand",
default_placetype = "ภูมิภาค",
data = export.new_zealand_regions,
}
export.nigeria_states = {
["Abia State, Nigeria"] = {},
["Adamawa State, Nigeria"] = {},
["Akwa Ibom State, Nigeria"] = {},
["Anambra State, Nigeria"] = {},
["Bauchi State, Nigeria"] = {},
["Bayelsa State, Nigeria"] = {},
["Benue State, Nigeria"] = {},
["Borno State, Nigeria"] = {},
["Cross River State, Nigeria"] = {},
["Delta State, Nigeria"] = {},
["Ebonyi State, Nigeria"] = {},
["Edo State, Nigeria"] = {},
["Ekiti State, Nigeria"] = {},
["Enugu State, Nigeria"] = {},
["Federal Capital Territory, Nigeria"] = {
-- not a state but allow it to be referenced as one in holonyms
placetype = {"federal territory", "ดินแดน", "รัฐ"}, the = true, wp = "%l (%c)",
},
["Gombe State, Nigeria"] = {},
["Imo State, Nigeria"] = {},
["Jigawa State, Nigeria"] = {},
["Kaduna State, Nigeria"] = {},
["Kano State, Nigeria"] = {},
["Katsina State, Nigeria"] = {},
["Kebbi State, Nigeria"] = {},
["Kogi State, Nigeria"] = {},
["Kwara State, Nigeria"] = {},
["Lagos State, Nigeria"] = {},
["Nasarawa State, Nigeria"] = {},
["Niger State, Nigeria"] = {},
["Ogun State, Nigeria"] = {},
["Ondo State, Nigeria"] = {},
["Osun State, Nigeria"] = {},
["Oyo State, Nigeria"] = {},
["Plateau State, Nigeria"] = {},
["Rivers State, Nigeria"] = {},
["Sokoto State, Nigeria"] = {},
["Taraba State, Nigeria"] = {},
["Yobe State, Nigeria"] = {},
["Zamfara State, Nigeria"] = {},
}
-- states of Nigeria
export.nigeria_group = {
key_to_placename = make_key_to_placename(", Nigeria$", " State$"),
placename_to_key = make_placename_to_key(", Nigeria", " State"),
default_container = "Nigeria",
default_placetype = "รัฐ",
data = export.nigeria_states,
}
export.north_korea_provinces = {
["Chagang Province, North Korea"] = {},
["North Hamgyong Province, North Korea"] = {},
["South Hamgyong Province, North Korea"] = {},
["North Hwanghae Province, North Korea"] = {},
["South Hwanghae Province, North Korea"] = {},
["Kangwon Province, North Korea"] = {wp = "%l (%c)"},
["North Pyongan Province, North Korea"] = {},
["South Pyongan Province, North Korea"] = {},
["Ryanggang Province, North Korea"] = {},
}
-- provinces of North Korea
export.north_korea_group = {
key_to_placename = make_key_to_placename(", North Korea$", " Province$"),
placename_to_key = make_placename_to_key(", North Korea", " Province"),
default_container = "North Korea",
default_placetype = "จังหวัด",
data = export.north_korea_provinces,
}
export.norwegian_counties = {
["Oslo, Norway"] = {},
["Rogaland, Norway"] = {},
["Møre og Romsdal, Norway"] = {},
["Nordland, Norway"] = {},
["Østfold, Norway"] = {},
["Akershus, Norway"] = {},
["Buskerud, Norway"] = {},
-- the following two were merged into Innlandet
-- ["Hedmark, Norway"] = {},
-- ["Oppland, Norway"] = {},
["Innlandet, Norway"] = {},
["Vestfold, Norway"] = {},
["Telemark, Norway"] = {},
-- the following two were merged into Agder
-- ["Aust-Agder, Norway"] = {},
-- ["Vest-Agder, Norway"] = {},
["Agder, Norway"] = {},
-- the following two were merged into Vestland
-- ["Hordaland, Norway"] = {},
-- ["Sogn og Fjordane, Norway"] = {},
["Vestland, Norway"] = {},
["Trøndelag, Norway"] = {},
["Troms, Norway"] = {},
["Finnmark, Norway"] = {},
}
-- counties of Norway
export.norway_group = {
default_container = "Norway",
default_placetype = "เทศมณฑล",
data = export.norwegian_counties,
}
export.pakistan_provinces_and_territories = {
["Azad Kashmir, Pakistan"] = {
placetype = {"administrative territory", "autonomous territory", "ดินแดน"},
},
["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true},
["Balochistan, Pakistan"] = {wp = "%l, %c"},
["Gilgit-Baltistan, Pakistan"] = {
placetype = {"administrative territory", "ดินแดน"},
},
["Islamabad Capital Territory, Pakistan"] = {
the = true,
divs = {}, -- no divisions
placetype = {"federal territory", "administrative territory", "ดินแดน"},
},
-- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes
["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"},
["Khyber Pakhtunkhwa, Pakistan"] = {},
["Punjab, Pakistan"] = {wp = "%l, %c"},
["Sindh, Pakistan"] = {},
}
-- provinces and territories of Pakistan
export.pakistan_group = {
default_container = "Pakistan",
default_placetype = "จังหวัด",
default_divs = "divisions",
data = export.pakistan_provinces_and_territories,
}
export.philippines_provinces = {
["Abra, Philippines"] = {wp = "%l (จังหวัด)"},
["Agusan del Norte, Philippines"] = {},
["Agusan del Sur, Philippines"] = {},
["Aklan, Philippines"] = {},
["Albay, Philippines"] = {},
["Antique, Philippines"] = {wp = "%l (จังหวัด)"},
["Apayao, Philippines"] = {},
["Aurora, Philippines"] = {wp = "%l (จังหวัด)"},
["Basilan, Philippines"] = {},
["Bataan, Philippines"] = {},
["Batanes, Philippines"] = {},
["Batangas, Philippines"] = {},
["Benguet, Philippines"] = {},
["Biliran, Philippines"] = {},
["Bohol, Philippines"] = {},
["Bukidnon, Philippines"] = {},
["Bulacan, Philippines"] = {},
["Cagayan, Philippines"] = {},
["Camarines Norte, Philippines"] = {},
["Camarines Sur, Philippines"] = {},
["Camiguin, Philippines"] = {},
["Capiz, Philippines"] = {},
["Catanduanes, Philippines"] = {},
["Cavite, Philippines"] = {},
["Cebu, Philippines"] = {},
["Cotabato, Philippines"] = {},
["Davao de Oro, Philippines"] = {},
["Davao del Norte, Philippines"] = {},
["Davao del Sur, Philippines"] = {},
["Davao Occidental, Philippines"] = {},
["Davao Oriental, Philippines"] = {},
["Dinagat Islands, Philippines"] = {the = true},
["Eastern Samar, Philippines"] = {},
["Guimaras, Philippines"] = {},
["Ifugao, Philippines"] = {},
["Ilocos Norte, Philippines"] = {},
["Ilocos Sur, Philippines"] = {},
["Iloilo, Philippines"] = {},
["Isabela, Philippines"] = {wp = "%l (จังหวัด)"},
["Kalinga, Philippines"] = {wp = "%l (จังหวัด)"},
["La Union, Philippines"] = {},
["Laguna, Philippines"] = {wp = "%l (จังหวัด)"},
["Lanao del Norte, Philippines"] = {},
["Lanao del Sur, Philippines"] = {},
["Leyte, Philippines"] = {wp = "%l (จังหวัด)"},
["Maguindanao del Norte, Philippines"] = {},
["Maguindanao del Sur, Philippines"] = {},
["Marinduque, Philippines"] = {},
["Masbate, Philippines"] = {},
["Misamis Occidental, Philippines"] = {},
["Misamis Oriental, Philippines"] = {},
["Mountain Province, Philippines"] = {},
["Negros Occidental, Philippines"] = {},
["Negros Oriental, Philippines"] = {},
["Northern Samar, Philippines"] = {},
["Nueva Ecija, Philippines"] = {},
["Nueva Vizcaya, Philippines"] = {},
["Occidental Mindoro, Philippines"] = {},
["Oriental Mindoro, Philippines"] = {},
["Palawan, Philippines"] = {},
["Pampanga, Philippines"] = {},
["Pangasinan, Philippines"] = {},
["Quezon, Philippines"] = {},
["Quirino, Philippines"] = {},
["Rizal, Philippines"] = {wp = "%l (จังหวัด)"},
["Romblon, Philippines"] = {},
["Samar, Philippines"] = {wp = "%l (จังหวัด)"},
["Sarangani, Philippines"] = {},
["Siquijor, Philippines"] = {},
["Sorsogon, Philippines"] = {},
["South Cotabato, Philippines"] = {},
["Southern Leyte, Philippines"] = {},
["Sultan Kudarat, Philippines"] = {},
["Sulu, Philippines"] = {},
["Surigao del Norte, Philippines"] = {},
["Surigao del Sur, Philippines"] = {},
["Tarlac, Philippines"] = {},
["Tawi-Tawi, Philippines"] = {},
["Zambales, Philippines"] = {},
["Zamboanga del Norte, Philippines"] = {},
["Zamboanga del Sur, Philippines"] = {},
["Zamboanga Sibugay, Philippines"] = {},
-- not a province but treated as one; allow it to be referred to as a province in holonyms
["Metro Manila, Philippines"] = {placetype = {"ภูมิภาค", "จังหวัด"}},
}
-- provinces of the Philippines
export.philippines_group = {
default_container = "Philippines",
default_placetype = "จังหวัด",
default_divs = {"เทศบาล", "barangays"},
data = export.philippines_provinces,
}
export.poland_voivodeships = {
["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław
["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal)
["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin
["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal)
["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź
["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true},
["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków
["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw
["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole
["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów
["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok
["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk
["Silesian Voivodeship, Poland"] = {}, -- abbr SL, code 24, capital Katowice
["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn
["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań
["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin
}
-- voivodeships of Poland
export.poland_group = {
key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
default_container = "Poland",
default_placetype = "voivodeship",
default_divs = {
-- "เทศมณฑล", -- not enough of them currently
{type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}},
},
data = export.poland_voivodeships,
}
export.portugal_districts_and_autonomous_regions = {
["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "ภูมิภาค"}},
["Aveiro District, Portugal"] = {},
["Beja District, Portugal"] = {},
["Braga District, Portugal"] = {},
["Bragança District, Portugal"] = {},
["Castelo Branco District, Portugal"] = {},
["Coimbra District, Portugal"] = {},
["Évora District, Portugal"] = {},
["Faro District, Portugal"] = {},
["Guarda District, Portugal"] = {},
["Leiria District, Portugal"] = {},
["Lisbon District, Portugal"] = {},
["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true},
["Madeira, Portugal"] = {placetype = {"autonomous region", "ภูมิภาค"}},
["Portalegre District, Portugal"] = {},
["Porto District, Portugal"] = {},
["Santarém District, Portugal"] = {},
["Setúbal District, Portugal"] = {},
["Viana do Castelo District, Portugal"] = {},
["Vila Real District, Portugal"] = {},
["Viseu District, Portugal"] = {},
}
local function portugal_placename_to_key(placename)
if placename == "Azores" or placename == "Madeira" then
return placename .. ", Portugal"
end
if placename:find(" District$") then
return placename .. ", Portugal"
end
return placename .. " District, Portugal"
end
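-- Worked examples for portugal_placename_to_key(), traced by hand from the function above (illustrative only):
--   portugal_placename_to_key("Azores")          --> "Azores, Portugal"
--   portugal_placename_to_key("Madeira")         --> "Madeira, Portugal"
--   portugal_placename_to_key("Lisbon District") --> "Lisbon District, Portugal"
--   portugal_placename_to_key("Lisbon")          --> "Lisbon District, Portugal"
-- The function does not check its result against the data table, so an unknown name such as "Foo" (a made-up
-- example) still yields "Foo District, Portugal".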
-- districts and autonomous regions of Portugal
export.portugal_group = {
key_to_placename = make_key_to_placename(", Portugal$", " District$"),
placename_to_key = portugal_placename_to_key,
default_container = "Portugal",
default_placetype = "district",
default_divs = "เทศบาล",
data = export.portugal_districts_and_autonomous_regions,
}
export.romania_counties = {
["Alba County, Romania"] = {},
["Arad County, Romania"] = {},
["Argeș County, Romania"] = {},
["Bacău County, Romania"] = {},
["Bihor County, Romania"] = {},
["Bistrița-Năsăud County, Romania"] = {},
["Botoșani County, Romania"] = {},
["Brașov County, Romania"] = {},
["Brăila County, Romania"] = {},
-- Bucharest: not in a county
["Buzău County, Romania"] = {},
["Caraș-Severin County, Romania"] = {},
["Cluj County, Romania"] = {},
["Constanța County, Romania"] = {},
["Covasna County, Romania"] = {},
["Călărași County, Romania"] = {},
["Dolj County, Romania"] = {},
["Dâmbovița County, Romania"] = {},
["Galați County, Romania"] = {},
["Giurgiu County, Romania"] = {},
["Gorj County, Romania"] = {},
["Harghita County, Romania"] = {},
["Hunedoara County, Romania"] = {},
["Ialomița County, Romania"] = {},
["Iași County, Romania"] = {},
["Ilfov County, Romania"] = {},
["Maramureș County, Romania"] = {},
["Mehedinți County, Romania"] = {},
["Mureș County, Romania"] = {},
["Neamț County, Romania"] = {},
["Olt County, Romania"] = {},
["Prahova County, Romania"] = {},
["Satu Mare County, Romania"] = {},
["Sibiu County, Romania"] = {},
["Suceava County, Romania"] = {},
["Sălaj County, Romania"] = {},
["Teleorman County, Romania"] = {},
["Timiș County, Romania"] = {},
["Tulcea County, Romania"] = {},
["Vaslui County, Romania"] = {},
["Vrancea County, Romania"] = {},
["Vâlcea County, Romania"] = {},
}
-- counties of Romania
export.romania_group = {
key_to_placename = make_key_to_placename(", Romania$", " County$"),
placename_to_key = make_placename_to_key(", Romania", " County"),
default_container = "Romania",
default_placetype = "เทศมณฑล",
default_divs = "communes",
data = export.romania_counties,
}
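-- Build the spec shared by Russian federal subjects of one type. `spectype` is the singular placetype (e.g. "krai");
-- the plural used for the parent category is formed by appending "s". `use_the` may be any truthy value and is
-- normalized to a boolean. `wp`, if given, is stored unchanged in the `wp` field.
local function make_russia_federal_subject_spec(spectype, use_the, wp)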
return {
placetype = spectype,
the = not not use_the,
bare_category_parent_type = {"federal subjects", spectype .. "s"},
wp = wp,
}
end
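-- Worked examples for make_russia_federal_subject_spec(), traced by hand from the function above (illustrative
-- only):
--   make_russia_federal_subject_spec("krai")
--     --> {placetype = "krai", the = false, bare_category_parent_type = {"federal subjects", "krais"}}
--   make_russia_federal_subject_spec("republic", "use the")
--     --> {placetype = "republic", the = true, bare_category_parent_type = {"federal subjects", "republics"}}
--   make_russia_federal_subject_spec("republic", nil, "Komi Republic")
--     --> same as plain "republic" (the = false), plus wp = "Komi Republic"
-- Each spec stored in a shared local below (e.g. `russia_krai`) is a single table referenced by many keys, so any
-- code that mutated the spec through one key would change it for all of them.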
local russia_autonomous_okrug_no_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}}
local russia_autonomous_okrug_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"},
the = true}
local russia_krai = make_russia_federal_subject_spec("krai")
local russia_oblast = make_russia_federal_subject_spec("oblast")
local russia_republic_the = make_russia_federal_subject_spec("republic", "use the")
local russia_republic_no_the = make_russia_federal_subject_spec("republic")
export.russia_federal_subjects = {
-- autonomous oblasts
["Jewish Autonomous Oblast, Russia"] =
{the = true, placetype = {"autonomous oblast", "oblast"},
bare_category_parent_type = {"federal subjects", "autonomous oblasts"}},
-- autonomous okrugs
["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"},
["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous Okrug, Russia"},
["Yamalo-Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"},
-- krais
["Altai Krai, Russia"] = russia_krai,
["Kamchatka Krai, Russia"] = russia_krai,
["Khabarovsk Krai, Russia"] = russia_krai,
["Krasnodar Krai, Russia"] = russia_krai,
["Krasnoyarsk Krai, Russia"] = russia_krai,
["Perm Krai, Russia"] = russia_krai,
["Primorsky Krai, Russia"] = russia_krai,
["Stavropol Krai, Russia"] = russia_krai,
["Zabaykalsky Krai, Russia"] = russia_krai,
-- oblasts
["Amur Oblast, Russia"] = russia_oblast,
["Arkhangelsk Oblast, Russia"] = russia_oblast,
["Astrakhan Oblast, Russia"] = russia_oblast,
["Belgorod Oblast, Russia"] = russia_oblast,
["Bryansk Oblast, Russia"] = russia_oblast,
["Chelyabinsk Oblast, Russia"] = russia_oblast,
["Irkutsk Oblast, Russia"] = russia_oblast,
["Ivanovo Oblast, Russia"] = russia_oblast,
["Kaliningrad Oblast, Russia"] = russia_oblast,
["Kaluga Oblast, Russia"] = russia_oblast,
["Kemerovo Oblast, Russia"] = russia_oblast,
["Kirov Oblast, Russia"] = russia_oblast,
["Kostroma Oblast, Russia"] = russia_oblast,
["Kurgan Oblast, Russia"] = russia_oblast,
["Kursk Oblast, Russia"] = russia_oblast,
["Leningrad Oblast, Russia"] = russia_oblast,
["Lipetsk Oblast, Russia"] = russia_oblast,
["Magadan Oblast, Russia"] = russia_oblast,
["Moscow Oblast, Russia"] = russia_oblast,
["Murmansk Oblast, Russia"] = russia_oblast,
["Nizhny Novgorod Oblast, Russia"] = russia_oblast,
["Novgorod Oblast, Russia"] = russia_oblast,
["Novosibirsk Oblast, Russia"] = russia_oblast,
["Omsk Oblast, Russia"] = russia_oblast,
["Orenburg Oblast, Russia"] = russia_oblast,
["Oryol Oblast, Russia"] = russia_oblast,
["Penza Oblast, Russia"] = russia_oblast,
["Pskov Oblast, Russia"] = russia_oblast,
["Rostov Oblast, Russia"] = russia_oblast,
["Ryazan Oblast, Russia"] = russia_oblast,
["Sakhalin Oblast, Russia"] = russia_oblast,
["Samara Oblast, Russia"] = russia_oblast,
["Saratov Oblast, Russia"] = russia_oblast,
["Smolensk Oblast, Russia"] = russia_oblast,
["Sverdlovsk Oblast, Russia"] = russia_oblast,
["Tambov Oblast, Russia"] = russia_oblast,
["Tomsk Oblast, Russia"] = russia_oblast,
["Tula Oblast, Russia"] = russia_oblast,
["Tver Oblast, Russia"] = russia_oblast,
["Tyumen Oblast, Russia"] = russia_oblast,
["Ulyanovsk Oblast, Russia"] = russia_oblast,
["Vladimir Oblast, Russia"] = russia_oblast,
["Volgograd Oblast, Russia"] = russia_oblast,
["Vologda Oblast, Russia"] = russia_oblast,
["Voronezh Oblast, Russia"] = russia_oblast,
["Yaroslavl Oblast, Russia"] = russia_oblast,
-- republics
--
-- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where
-- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by
-- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence
-- of "the".
["Adygea, Russia"] = russia_republic_no_the,
["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true},
["Bashkortostan, Russia"] = russia_republic_no_the,
["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true},
["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"},
["Buryatia, Russia"] = russia_republic_no_the,
["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true},
["Dagestan, Russia"] = russia_republic_no_the,
["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true},
["Ingushetia, Russia"] = russia_republic_no_the,
["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true},
["Kalmykia, Russia"] = russia_republic_no_the,
["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true},
["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"),
["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true},
["Khakassia, Russia"] = russia_republic_no_the,
["Republic of Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true},
["Mordovia, Russia"] = russia_republic_no_the,
["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true},
["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash
["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true},
["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Tatarstan, Russia"] = russia_republic_no_the,
["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true},
["Altai Republic, Russia"] = russia_republic_the,
["Chechnya, Russia"] = russia_republic_no_the,
["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true},
["Chuvashia, Russia"] = russia_republic_no_the,
["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true},
["Kabardino-Balkaria, Russia"] = russia_republic_no_the,
["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true},
["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true},
["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia",
display = "Kabardino-Balkarian Republic, Russia", the = true},
["Karachay-Cherkessia, Russia"] = russia_republic_no_the,
["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia", the = true},
["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"),
["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true},
["Mari El, Russia"] = russia_republic_no_the,
["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true},
["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"),
["Sakha Republic, Russia"] = {alias_of = "Sakha, Russia", the = true},
["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia",
the = true},
["Tuva, Russia"] = russia_republic_no_the,
["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
["Udmurtia, Russia"] = russia_republic_no_the,
["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
-- Not included due to being unrecognized and only partly controlled:
-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
-- There are also federal cities (not included because they're cities):
-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}
local function russia_key_to_placename(key)
key = key:gsub(",.*", "")
local full_placename = key
if key == "Jewish Autonomous Oblast" then
return full_placename, full_placename
end
local elliptical_placename
for _, suffix in ipairs({"Krai", "Oblast"}) do
elliptical_placename = key:match("^(.*) " .. suffix .. "$")
if elliptical_placename then
return full_placename, elliptical_placename
end
end
return full_placename, full_placename
end
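-- Worked examples for russia_key_to_placename(), traced by hand from the function above (illustrative only). It
-- returns the full placename and the elliptical placename; only a trailing "Krai" or "Oblast" is dropped:
--   russia_key_to_placename("Samara Oblast, Russia")            --> "Samara Oblast", "Samara"
--   russia_key_to_placename("Altai Krai, Russia")               --> "Altai Krai", "Altai"
--   russia_key_to_placename("Altai Republic, Russia")           --> "Altai Republic", "Altai Republic"
--   russia_key_to_placename("Tatarstan, Russia")                --> "Tatarstan", "Tatarstan"
--   russia_key_to_placename("Jewish Autonomous Oblast, Russia") --> "Jewish Autonomous Oblast", "Jewish Autonomous Oblast"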
local function russia_placename_to_key(placename)
local key = placename .. ", Russia"
if export.russia_federal_subjects[key] then
return key
end
-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
for _, suffix in ipairs({"Krai", "Oblast"}) do
local suffixed_key = placename .. " " .. suffix .. ", Russia"
if export.russia_federal_subjects[suffixed_key] then
return suffixed_key
end
end
return placename .. ", Russia"
end
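-- Worked examples for russia_placename_to_key(), traced by hand against the table above (illustrative only):
--   russia_placename_to_key("Samara Oblast") --> "Samara Oblast, Russia" (exact key exists)
--   russia_placename_to_key("Samara")        --> "Samara Oblast, Russia" (found by trying the "Oblast" suffix)
--   russia_placename_to_key("Altai")         --> "Altai Krai, Russia" (there is no "Altai, Russia" key and "Krai" is
--                                                tried first; the republic needs the full name "Altai Republic")
--   russia_placename_to_key("Yugra")         --> "Yugra, Russia" (an alias key, returned unchanged; resolving
--                                                `alias_of` is assumed to happen in the caller)
--   russia_placename_to_key("Foo")           --> "Foo, Russia" (made-up name; no lookup succeeds, so this is the
--                                                fallback)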
local function construct_russia_federal_subject_keydesc(group, key, spec)
local placename = key:gsub(",.*", "")
local linked_placename = export.construct_linked_placename(spec, placename)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if placetype == "oblast" then
-- Hack: Oblasts generally don't have entries under "Foo Oblast" but just under "Foo", so move " Oblast" outside
-- the link. This doesn't apply to the Jewish Autonomous Oblast, whose first placetype is "autonomous oblast"
-- rather than "oblast".
linked_placename = linked_placename:gsub(" Oblast%]%]", "]] Oblast")
end
return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end
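-- Sketch of the descriptions built by construct_russia_federal_subject_keydesc(). This ASSUMES that
-- export.construct_linked_placename() (defined elsewhere, not shown here) returns a plain wikilink such as
-- "[[Samara Oblast]]"; its real output may differ (e.g. it may add an article):
--   key "Samara Oblast, Russia" --> "[[Samara]] Oblast, a [[federal subject]] ([[oblast]]) of [[Russia]]"
--   key "Altai Krai, Russia"    --> "[[Altai Krai]], a [[federal subject]] ([[krai]]) of [[Russia]]"
-- For specs whose placetype is a list, only the first element is used in the description, e.g.
-- "autonomous okrug" for the autonomous okrugs.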
-- federal subjects of Russia
export.russia_group = {
key_to_placename = russia_key_to_placename,
placename_to_key = russia_placename_to_key,
default_container = "Russia",
default_keydesc = construct_russia_federal_subject_keydesc,
default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
data = export.russia_federal_subjects,
}
export.saudi_arabia_provinces = {
["Riyadh Province, Saudi Arabia"] = {},
["Mecca Province, Saudi Arabia"] = {},
-- Name is too generic to assume it's in Saudi Arabia if not specified.
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"},
["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"},
["Aseer Province, Saudi Arabia"] = {wp = "Asir"},
["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true},
["Jazan Province, Saudi Arabia"] = {},
["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"},
["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true},
["Tabuk Province, Saudi Arabia"] = {},
["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"},
["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"},
["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true},
["Najran Province, Saudi Arabia"] = {},
["Northern Borders Province, Saudi Arabia"] = {},
["Al-Bahah Province, Saudi Arabia"] = {},
}
-- provinces of Saudi Arabia
export.saudi_arabia_group = {
key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"),
placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"),
default_container = "Saudi Arabia",
default_placetype = "จังหวัด",
data = export.saudi_arabia_provinces,
}
export.south_africa_provinces = {
["Eastern Cape, South Africa"] = {the = true},
["Free State, South Africa"] = {the = true, wp = "%l (จังหวัด)"},
["Gauteng, South Africa"] = {},
["KwaZulu-Natal, South Africa"] = {},
["Limpopo, South Africa"] = {},
["Mpumalanga, South Africa"] = {},
-- per Wikipedia and other sources, `North West` doesn't normally have `the` before it
["North West, South Africa"] = {wp = "%l (South African province)"},
["Northern Cape, South Africa"] = {the = true},
["Western Cape, South Africa"] = {the = true},
}
-- provinces of South Africa
export.south_africa_group = {
default_container = "South Africa",
default_placetype = "จังหวัด",
default_divs = "เทศบาล",
data = export.south_africa_provinces,
}
export.south_korea_provinces = {
["North Chungcheong Province, South Korea"] = {},
["South Chungcheong Province, South Korea"] = {},
["Gangwon Province, South Korea"] = {wp = "%l, %c"},
["Gyeonggi Province, South Korea"] = {},
["North Gyeongsang Province, South Korea"] = {},
["South Gyeongsang Province, South Korea"] = {},
["North Jeolla Province, South Korea"] = {},
["South Jeolla Province, South Korea"] = {},
["Jeju Province, South Korea"] = {},
}
-- provinces of South Korea
export.south_korea_group = {
key_to_placename = make_key_to_placename(", South Korea$", " Province$"),
placename_to_key = make_placename_to_key(", South Korea", " Province"),
default_container = "South Korea",
default_placetype = "จังหวัด",
data = export.south_korea_provinces,
}
export.spain_autonomous_communities = {
["Andalusia, Spain"] = {},
["Aragon, Spain"] = {},
["Asturias, Spain"] = {},
["Balearic Islands, Spain"] = {the = true},
["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"},
["Canary Islands, Spain"] = {the = true},
["Cantabria, Spain"] = {},
["Castile and León, Spain"] = {},
["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash
["Catalonia, Spain"] = {},
["Community of Madrid, Spain"] = {the = true},
["Extremadura, Spain"] = {},
["Galicia, Spain"] = {wp = "%l (Spain)"},
["La Rioja, Spain"] = {},
["Murcia, Spain"] = {wp = "Region of %l"},
["Navarre, Spain"] = {},
["Valencia, Spain"] = {wp = "Valencian Community"},
["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true},
}
-- autonomous communities of Spain
export.spain_group = {
default_container = "Spain",
default_placetype = "autonomous community",
default_divs = {"เทศบาล", "comarcas"},
data = export.spain_autonomous_communities,
}
export.taiwan_counties = {
["จางฮว่า, ไต้หวัน"] = {},
["เจียอี้, ไต้หวัน"] = {},
["ซินจู๋, ไต้หวัน"] = {},
["ฮวาเหลียน, ไต้หวัน"] = {},
["จินเหมิน, ไต้หวัน"] = {wp = "หมู่เกาะจินเหมิน"},
["เหลียนเจียง, ไต้หวัน"] = {wp = "หมู่เกาะหมาจู่"},
["เหมียวลี่, ไต้หวัน"] = {},
["หนานโถว, ไต้หวัน"] = {},
["เผิงหู, ไต้หวัน"] = {wp = "เผิงหู"},
["ผิงตง, ไต้หวัน"] = {},
["ไถตง, ไต้หวัน"] = {},
["อี๋หลาน, ไต้หวัน"] = {wp = "%l, %c"},
["ยฺหวินหลิน, ไต้หวัน"] = {},
}
-- counties of Taiwan
export.taiwan_group = {
key_to_placename = make_key_to_placename(", ไต้หวัน$"),
placename_to_key = make_placename_to_key(", ไต้หวัน"),
default_container = "ไต้หวัน",
default_placetype = "เทศมณฑล",
default_divs = {"อำเภอ", "townships"},
data = export.taiwan_counties,
}
export.thailand_provinces = { -- no need to add "จังหวัด" (province) to the keys
-- Bangkok (กรุงเทพมหานคร) is a special administrative area rather than a province, so it has no entry in this table
["อำนาจเจริญ, ไทย"] = {},
["อ่างทอง, ไทย"] = {},
["บึงกาฬ, ไทย"] = {},
["บุรีรัมย์, ไทย"] = {},
["ฉะเชิงเทรา, ไทย"] = {},
["ชัยนาท, ไทย"] = {},
["ชัยภูมิ, ไทย"] = {},
["จันทบุรี, ไทย"] = {},
["เชียงใหม่, ไทย"] = {},
["เชียงราย, ไทย"] = {},
["ชลบุรี, ไทย"] = {},
["ชุมพร, ไทย"] = {},
["กาฬสินธุ์, ไทย"] = {},
["กำแพงเพชร, ไทย"] = {},
["กาญจนบุรี, ไทย"] = {},
["ขอนแก่น, ไทย"] = {},
["กระบี่, ไทย"] = {},
["ลำปาง, ไทย"] = {},
["ลำพูน, ไทย"] = {},
["เลย, ไทย"] = {},
["ลพบุรี, ไทย"] = {},
["แม่ฮ่องสอน, ไทย"] = {},
["มหาสารคาม, ไทย"] = {},
["มุกดาหาร, ไทย"] = {},
["นครนายก, ไทย"] = {},
["นครปฐม, ไทย"] = {},
["นครพนม, ไทย"] = {},
["นครราชสีมา, ไทย"] = {},
["นครสวรรค์, ไทย"] = {},
["นครศรีธรรมราช, ไทย"] = {},
["น่าน, ไทย"] = {},
["นราธิวาส, ไทย"] = {},
["หนองบัวลำภู, ไทย"] = {},
["หนองคาย, ไทย"] = {},
["นนทบุรี, ไทย"] = {},
["ปทุมธานี, ไทย"] = {},
["ปัตตานี, ไทย"] = {},
["พังงา, ไทย"] = {},
["พัทลุง, ไทย"] = {},
["พะเยา, ไทย"] = {},
["เพชรบูรณ์, ไทย"] = {},
["เพชรบุรี, ไทย"] = {},
["พิจิตร, ไทย"] = {},
["พิษณุโลก, ไทย"] = {},
["พระนครศรีอยุธยา, ไทย"] = {},
["แพร่, ไทย"] = {},
["ภูเก็ต, ไทย"] = {},
["ปราจีนบุรี, ไทย"] = {},
["ประจวบคีรีขันธ์, ไทย"] = {},
["ระนอง, ไทย"] = {},
["ราชบุรี, ไทย"] = {},
["ระยอง, ไทย"] = {},
["ร้อยเอ็ด, ไทย"] = {},
["สระแก้ว, ไทย"] = {},
["สกลนคร, ไทย"] = {},
["สมุทรปราการ, ไทย"] = {},
["สมุทรสาคร, ไทย"] = {},
["สมุทรสงคราม, ไทย"] = {},
["สระบุรี, ไทย"] = {},
["สตูล, ไทย"] = {},
["สิงห์บุรี, ไทย"] = {},
["ศรีสะเกษ, ไทย"] = {},
["สงขลา, ไทย"] = {},
["สุโขทัย, ไทย"] = {},
["สุพรรณบุรี, ไทย"] = {},
["สุราษฎร์ธานี, ไทย"] = {},
["สุรินทร์, ไทย"] = {},
["ตาก, ไทย"] = {},
["ตรัง, ไทย"] = {},
["ตราด, ไทย"] = {},
["อุบลราชธานี, ไทย"] = {},
["อุดรธานี, ไทย"] = {},
["อุทัยธานี, ไทย"] = {},
["อุตรดิตถ์, ไทย"] = {},
["ยะลา, ไทย"] = {},
["ยโสธร, ไทย"] = {},
}
-- provinces of Thailand
export.thailand_group = {
key_to_placename = make_key_to_placename(", ไทย$"), -- no need to add "จังหวัด" (province)
placename_to_key = make_placename_to_key(", ไทย"),
default_container = "ไทย",
default_placetype = "จังหวัด",
default_divs = "อำเภอ",
-- Wikipedia article titles for Thai provinces put "จังหวัด" (province) before the name, e.g. "จังหวัดเชียงใหม่"
default_wp = "จังหวัด%e",
data = export.thailand_provinces,
}
export.turkey_provinces = {
["Adana Province, Turkey"] = {}, -- code 01
["Adıyaman Province, Turkey"] = {}, -- code 02
["Afyonkarahisar Province, Turkey"] = {}, -- code 03
["Ağrı Province, Turkey"] = {}, -- code 04
["Amasya Province, Turkey"] = {}, -- code 05
["Ankara Province, Turkey"] = {}, -- code 06
["Antalya Province, Turkey"] = {}, -- code 07
["Artvin Province, Turkey"] = {}, -- code 08
["Aydın Province, Turkey"] = {}, -- code 09
["Balıkesir Province, Turkey"] = {}, -- code 10
["Bilecik Province, Turkey"] = {}, -- code 11
["Bingöl Province, Turkey"] = {}, -- code 12
["Bitlis Province, Turkey"] = {}, -- code 13
["Bolu Province, Turkey"] = {}, -- code 14
["Burdur Province, Turkey"] = {}, -- code 15
["Bursa Province, Turkey"] = {}, -- code 16
["Çanakkale Province, Turkey"] = {}, -- code 17
["Çankırı Province, Turkey"] = {}, -- code 18
["Çorum Province, Turkey"] = {}, -- code 19
["Denizli Province, Turkey"] = {}, -- code 20
["Diyarbakır Province, Turkey"] = {}, -- code 21
["Edirne Province, Turkey"] = {}, -- code 22
["Elazığ Province, Turkey"] = {}, -- code 23
["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true},
["Erzincan Province, Turkey"] = {}, -- code 24
["Erzurum Province, Turkey"] = {}, -- code 25
["Eskişehir Province, Turkey"] = {}, -- code 26
["Gaziantep Province, Turkey"] = {}, -- code 27
["Giresun Province, Turkey"] = {}, -- code 28
["Gümüşhane Province, Turkey"] = {}, -- code 29
["Hakkâri Province, Turkey"] = {}, -- code 30
["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true},
["Hatay Province, Turkey"] = {}, -- code 31
["Isparta Province, Turkey"] = {}, -- code 32
["Mersin Province, Turkey"] = {}, -- code 33
-- ["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself
["İzmir Province, Turkey"] = {}, -- code 35
["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true},
["Kars Province, Turkey"] = {}, -- code 36
["Kastamonu Province, Turkey"] = {}, -- code 37
["Kayseri Province, Turkey"] = {}, -- code 38
["Kırklareli Province, Turkey"] = {}, -- code 39
["Kırşehir Province, Turkey"] = {}, -- code 40
["Kocaeli Province, Turkey"] = {}, -- code 41
["Konya Province, Turkey"] = {}, -- code 42
["Kütahya Province, Turkey"] = {}, -- code 43
["Malatya Province, Turkey"] = {}, -- code 44
["Manisa Province, Turkey"] = {}, -- code 45
["Kahramanmaraş Province, Turkey"] = {}, -- code 46
["Mardin Province, Turkey"] = {}, -- code 47
["Muğla Province, Turkey"] = {}, -- code 48
["Muş Province, Turkey"] = {}, -- code 49
["Nevşehir Province, Turkey"] = {}, -- code 50
["Niğde Province, Turkey"] = {}, -- code 51
["Ordu Province, Turkey"] = {}, -- code 52
["Rize Province, Turkey"] = {}, -- code 53
["Sakarya Province, Turkey"] = {}, -- code 54
["Samsun Province, Turkey"] = {}, -- code 55
["Siirt Province, Turkey"] = {}, -- code 56
["Sinop Province, Turkey"] = {}, -- code 57
["Sivas Province, Turkey"] = {}, -- code 58
["Tekirdağ Province, Turkey"] = {}, -- code 59
["Tokat Province, Turkey"] = {}, -- code 60
["Trabzon Province, Turkey"] = {}, -- code 61
["Tunceli Province, Turkey"] = {}, -- code 62
["Şanlıurfa Province, Turkey"] = {}, -- code 63
["Uşak Province, Turkey"] = {}, -- code 64
["Van Province, Turkey"] = {}, -- code 65
["Yozgat Province, Turkey"] = {}, -- code 66
["Zonguldak Province, Turkey"] = {}, -- code 67
["Aksaray Province, Turkey"] = {}, -- code 68
["Bayburt Province, Turkey"] = {}, -- code 69
["Karaman Province, Turkey"] = {}, -- code 70
["Kırıkkale Province, Turkey"] = {}, -- code 71
["Batman Province, Turkey"] = {}, -- code 72
["Şırnak Province, Turkey"] = {}, -- code 73
["Bartın Province, Turkey"] = {}, -- code 74
["Ardahan Province, Turkey"] = {}, -- code 75
["Iğdır Province, Turkey"] = {}, -- code 76
["Yalova Province, Turkey"] = {}, -- code 77
["Karabük Province, Turkey"] = {}, -- code 78
["Kilis Province, Turkey"] = {}, -- code 79
["Osmaniye Province, Turkey"] = {}, -- code 80
["Düzce Province, Turkey"] = {}, -- code 81
}
-- provinces of Turkey
export.turkey_group = {
key_to_placename = make_key_to_placename(", Turkey$", " Province$"),
placename_to_key = make_placename_to_key(", Turkey", " Province"),
default_container = "Turkey",
default_placetype = "จังหวัด",
default_divs = "อำเภอ",
data = export.turkey_provinces,
}
export.ukraine_oblasts = {
["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA
["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB
["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE
-- apparently will be renamed to 'Dnipro Oblast'
["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE
["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH
["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT
["Kharkiv Oblast, Ukraine"] = {}, -- capital [[Kharkiv]], license plate prefix AX, KX
["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT''
["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX
-- apparently will be renamed to 'Kropyvnytskyi Oblast'
["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA
["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI
["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true},
["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB
["Lviv Oblast, Ukraine"] = {}, -- capital [[Lviv]], license plate prefix BC, HC
["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE
["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH
["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true},
["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI
["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK
["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM
["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO
["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB
["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC
["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO
["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP
["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true},
["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM
}
-- oblasts of Ukraine
export.ukraine_group = {
key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"),
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"),
default_container = "Ukraine",
default_placetype = "oblast",
default_divs = {"raions", "hromadas"},
data = export.ukraine_oblasts,
}
export.united_kingdom_constituent_countries = {
["England"] = {divs = {
"เทศมณฑล",
"อำเภอ",
{type = "local government districts", cat_as = "อำเภอ"},
{
type = "local government districts with borough status",
cat_as = {"อำเภอ", "boroughs"},
},
{type = "boroughs", cat_as = {"อำเภอ", "boroughs"}},
{type = "civil parishes", container_parent_type = false},
}},
["Northern Ireland"] = {
placetype = {"constituent country", "จังหวัด", "ประเทศ"},
divs = {"เทศมณฑล", "อำเภอ"},
},
["Scotland"] = {divs = {
{type = "council areas", container_parent_type = false},
"อำเภอ",
}},
["Wales"] = {divs = {
"เทศมณฑล",
{type = "county boroughs", container_parent_type = false},
{type = "communities", container_parent_type = false},
{type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}},
}},
}
-- constituent countries and provinces of the United Kingdom
export.united_kingdom_group = {
placename_to_key = false,
default_container = "สหราชอาณาจักร",
default_placetype = {"constituent country", "ประเทศ"},
addl_divs = {
"traditional counties",
{type = "historical counties", cat_as = "traditional counties"},
},
-- Don't create categories like 'Category:en:Towns in the United Kingdom'
-- or 'Category:en:Places in the United Kingdom'.
default_no_container_cat = true,
data = export.united_kingdom_constituent_countries,
}
export.england_counties = {
-- NOTE: We used to have various other "no longer" counties commented out, apparently referring to counties that
-- existed officially at some point between 1889 and 1974; I have removed those. I have kept (commented out) only the
-- three ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those
-- still considered "historic counties" per [[w:Historic counties of England]].
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Bedfordshire, England"] = {},
["Berkshire, England"] = {},
-- ["Brighton and Hove, England"] = {}, -- city
-- ["Bristol, England"] = {}, -- city
["Buckinghamshire, England"] = {},
["Cambridgeshire, England"] = {},
["Cheshire, England"] = {},
-- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Cornwall, England"] = {},
-- ["Cumberland, England"] = {}, -- no longer (historic county)
["Cumbria, England"] = {},
["Derbyshire, England"] = {},
["Devon, England"] = {},
["Dorset, England"] = {},
["County Durham, England"] = {},
["East Sussex, England"] = {},
["Essex, England"] = {},
["Gloucestershire, England"] = {},
["Greater London, England"] = {},
["Greater Manchester, England"] = {},
["Hampshire, England"] = {},
["Herefordshire, England"] = {},
["Hertfordshire, England"] = {},
-- ["Humberside, England"] = {}, -- no longer (1974 to 1996)
-- ["Huntingdonshire, England"] = {}, -- no longer (historic county)
["Isle of Wight, England"] = {the = true},
["Kent, England"] = {},
["Lancashire, England"] = {},
["Leicestershire, England"] = {},
["Lincolnshire, England"] = {},
["Merseyside, England"] = {},
-- ["Middlesex, England"] = {}, -- no longer (historic county)
["Norfolk, England"] = {},
["Northamptonshire, England"] = {},
["Northumberland, England"] = {},
["North Yorkshire, England"] = {},
["Nottinghamshire, England"] = {},
["Oxfordshire, England"] = {},
["Rutland, England"] = {},
["Shropshire, England"] = {},
["Somerset, England"] = {},
["South Humberside, England"] = {},
["South Yorkshire, England"] = {},
["Staffordshire, England"] = {},
["Suffolk, England"] = {},
["Surrey, England"] = {},
-- ["Sussex, England"] = {}, -- no longer (historic county)
["Tyne and Wear, England"] = {},
["Warwickshire, England"] = {},
["West Midlands, England"] = {the = true, wp = "%l (county)"},
-- ["Westmorland, England"] = {}, -- no longer (historic county)
["West Sussex, England"] = {},
["West Yorkshire, England"] = {},
["Wiltshire, England"] = {},
["Worcestershire, England"] = {},
-- ["Yorkshire, England"] = {}, -- no longer (historic county)
["East Riding of Yorkshire, England"] = {the = true},
}
-- counties of England
export.england_group = {
default_container = {key = "England", placetype = "constituent country"},
default_placetype = "เทศมณฑล",
default_divs = {
"อำเภอ",
{type = "local government districts", cat_as = "อำเภอ"},
{
type = "local government districts with borough status",
cat_as = {"อำเภอ", "boroughs"},
},
{type = "boroughs", cat_as = {"อำเภอ", "boroughs"}},
"civil parishes",
},
data = export.england_counties,
}
export.northern_ireland_counties = {
["County Antrim, Northern Ireland"] = {},
["County Armagh, Northern Ireland"] = {},
["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"},
["County Down, Northern Ireland"] = {},
["County Fermanagh, Northern Ireland"] = {},
["County Londonderry, Northern Ireland"] = {},
["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"},
["County Tyrone, Northern Ireland"] = {},
}
-- counties of Northern Ireland
export.northern_ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"),
default_container = {key = "Northern Ireland", placetype = "constituent country"},
default_placetype = "เทศมณฑล",
data = export.northern_ireland_counties,
}
export.scotland_council_areas = {
["Aberdeenshire, Scotland"] = {},
["Angus, Scotland"] = {wp = "%l, %c"},
["Argyll and Bute, Scotland"] = {},
["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"},
["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"},
["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"},
["City of Dundee, Scotland"] = {the = true, wp = "Dundee"},
["Dundee"] = {alias_of = "City of Dundee, Scotland"},
["Dundee City"] = {alias_of = "City of Dundee, Scotland"},
["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"},
["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"},
["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"},
["Glasgow"] = {alias_of = "City of Glasgow, Scotland"},
["Clackmannanshire, Scotland"] = {},
["Dumfries and Galloway, Scotland"] = {},
["East Ayrshire, Scotland"] = {},
["East Dunbartonshire, Scotland"] = {},
["East Lothian, Scotland"] = {},
["East Renfrewshire, Scotland"] = {},
["Falkirk, Scotland"] = {wp = "%l council area"},
["Fife, Scotland"] = {},
["Highland, Scotland"] = {wp = "%l council area"},
["Inverclyde, Scotland"] = {},
["Midlothian, Scotland"] = {},
["Moray, Scotland"] = {},
["North Ayrshire, Scotland"] = {},
["North Lanarkshire, Scotland"] = {},
["Orkney Islands, Scotland"] = {the = true},
["Perth and Kinross, Scotland"] = {},
["Renfrewshire, Scotland"] = {},
["Scottish Borders, Scotland"] = {the = true},
["Shetland Islands, Scotland"] = {the = true},
["South Ayrshire, Scotland"] = {},
["South Lanarkshire, Scotland"] = {},
["Stirling, Scotland"] = {wp = "%l council area"},
["West Dunbartonshire, Scotland"] = {},
["West Lothian, Scotland"] = {},
["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"},
["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"},
}
-- council areas of Scotland
export.scotland_group = {
default_container = {key = "Scotland", placetype = "constituent country"},
default_placetype = "council area",
data = export.scotland_council_areas,
}
export.wales_principal_areas = {
["Blaenau Gwent, Wales"] = {},
["Bridgend, Wales"] = {wp = "%l County Borough"},
["Caerphilly, Wales"] = {wp = "%l County Borough"},
-- ["Cardiff, Wales"] = {placetype = "นคร"},
["Carmarthenshire, Wales"] = {placetype = "เทศมณฑล"},
["Ceredigion, Wales"] = {placetype = "เทศมณฑล"},
["Conwy, Wales"] = {wp = "%l County Borough"},
["Denbighshire, Wales"] = {placetype = "เทศมณฑล"},
["Flintshire, Wales"] = {placetype = "เทศมณฑล"},
["Gwynedd, Wales"] = {placetype = "เทศมณฑล"},
["Isle of Anglesey, Wales"] = {the = true, placetype = "เทศมณฑล"},
["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the"
["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"},
["Monmouthshire, Wales"] = {placetype = "เทศมณฑล"},
["Neath Port Talbot, Wales"] = {},
-- ["Newport, Wales"] = {placetype = "นคร", wp = "%l, %c"},
["Pembrokeshire, Wales"] = {placetype = "เทศมณฑล"},
["Powys, Wales"] = {placetype = "เทศมณฑล"},
["Rhondda Cynon Taf, Wales"] = {},
-- ["Swansea, Wales"] = {placetype = "นคร"},
["Torfaen, Wales"] = {},
["Vale of Glamorgan, Wales"] = {the = true},
["Wrexham, Wales"] = {wp = "%l County Borough"},
}
-- principal areas (cities, counties and county boroughs) of Wales
export.wales_group = {
default_container = {key = "Wales", placetype = "constituent country"},
default_placetype = "county borough",
data = export.wales_principal_areas,
}
export.united_states_states = {
["Alabama, USA"] = {},
["Alaska, USA"] = {divs = {
{type = "boroughs", container_parent_type = "เทศมณฑล"},
{type = "borough seats", container_parent_type = "county seats"},
}},
["Arizona, USA"] = {},
["Arkansas, USA"] = {},
["California, USA"] = {},
["Colorado, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}},
["Connecticut, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}},
["Delaware, USA"] = {},
["Florida, USA"] = {},
["Georgia, USA"] = {wp = "%l (U.S. state)"},
["Hawaii, USA"] = {addl_parents = {"พอลินีเชีย"}},
["Idaho, USA"] = {},
["Illinois, USA"] = {},
["Indiana, USA"] = {},
["Iowa, USA"] = {},
["Kansas, USA"] = {},
["Kentucky, USA"] = {},
["Louisiana, USA"] = {divs = {
{type = "parishes", container_parent_type = "เทศมณฑล"},
{type = "parish seats", container_parent_type = "county seats"},
}},
["Maine, USA"] = {},
["Maryland, USA"] = {},
["Massachusetts, USA"] = {},
["Michigan, USA"] = {},
["Minnesota, USA"] = {},
["Mississippi, USA"] = {},
["Missouri, USA"] = {},
["Montana, USA"] = {},
["Nebraska, USA"] = {},
["Nevada, USA"] = {},
["New Hampshire, USA"] = {},
["New Jersey, USA"] = {divs = {
"เทศมณฑล", "county seats",
{type = "boroughs", prep = "ใน"},
}},
["New Mexico, USA"] = {},
["New York, USA"] = {wp = "%l (รัฐ)"},
["North Carolina, USA"] = {},
["North Dakota, USA"] = {},
["Ohio, USA"] = {},
["Oklahoma, USA"] = {},
["Oregon, USA"] = {},
["Pennsylvania, USA"] = {divs = {
"เทศมณฑล", "county seats",
{type = "boroughs", prep = "ใน"},
}},
["Rhode Island, USA"] = {},
["South Carolina, USA"] = {},
["South Dakota, USA"] = {},
["Tennessee, USA"] = {},
["Texas, USA"] = {},
["Utah, USA"] = {},
["Vermont, USA"] = {},
["Virginia, USA"] = {},
["Washington, USA"] = {wp = "%l (รัฐ)"},
["West Virginia, USA"] = {},
["Wisconsin, USA"] = {},
["Wyoming, USA"] = {},
}
-- states of the United States
export.united_states_group = {
placename_to_key = make_placename_to_key(", USA"),
default_container = "สหรัฐอเมริกา",
default_placetype = "รัฐ",
default_divs = {"เทศมณฑล", "county seats"},
addl_divs = {
{type = "census-designated places", prep = "ใน"},
{type = "unincorporated communities", prep = "ใน"},
},
data = export.united_states_states,
}
export.vietnam_provinces = {
-- [[Northeast (Vietnam)|Northeast]] region
["Bắc Giang, เวียดนาม"] = {}, -- capital [[Bắc Giang]]
["Bắc Kạn, เวียดนาม"] = {}, -- capital [[Bắc Kạn]]
["Cao Bằng, เวียดนาม"] = {}, -- capital [[Cao Bằng]]
["Hà Giang, เวียดนาม"] = {}, -- capital [[Hà Giang]]
["Lạng Sơn, เวียดนาม"] = {}, -- capital [[Lạng Sơn]]
["Phú Thọ, เวียดนาม"] = {}, -- capital [[Việt Trì]]
["Quảng Ninh, เวียดนาม"] = {}, -- capital [[Hạ Long]]
["Thái Nguyên, เวียดนาม"] = {}, -- capital [[Thái Nguyên]]
["Tuyên Quang, เวียดนาม"] = {}, -- capital [[Tuyên Quang]]
-- [[Northwest (Vietnam)|Northwest]] region
["Lào Cai, เวียดนาม"] = {}, -- capital [[Lào Cai]]
["Yên Bái, เวียดนาม"] = {}, -- capital [[Yên Bái]]
["Điện Biên, เวียดนาม"] = {}, -- capital [[Điện Biên Phủ]]
["Hoà Bình, เวียดนาม"] = {}, -- capital [[Hoà Bình City|Hoà Bình]]
["Hòa Bình, เวียดนาม"] = {alias_of = "Hoà Bình, เวียดนาม", display = true},
["Lai Châu, เวียดนาม"] = {}, -- capital [[Lai Châu]]
["Sơn La, เวียดนาม"] = {}, -- capital [[Sơn La]]
-- [[Red River Delta]] region
["Bắc Ninh, เวียดนาม"] = {}, -- capital [[Bắc Ninh]]
["Hà Nam, เวียดนาม"] = {}, -- capital [[Phủ Lý]]
["Hải Dương, เวียดนาม"] = {}, -- capital [[Hải Dương]]
["Hưng Yên, เวียดนาม"] = {}, -- capital [[Hưng Yên]]
["Nam Định, เวียดนาม"] = {}, -- capital [[Nam Định]]
["Ninh Bình, เวียดนาม"] = {}, -- capital [[Ninh Bình|Hoa Lư]]
["Thái Bình, เวียดนาม"] = {}, -- capital [[Thái Bình]]
["Vĩnh Phúc, เวียดนาม"] = {}, -- capital [[Vĩnh Yên]]
-- ["Hanoi"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hoàn Kiếm district]]
-- ["Haiphong"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hồng Bàng district]]
-- [[North Central Coast]] region
["Hà Tĩnh, เวียดนาม"] = {}, -- capital [[Hà Tĩnh]]
["Nghệ An, เวียดนาม"] = {}, -- capital [[Vinh]]
["Quảng Bình, เวียดนาม"] = {}, -- capital [[Đồng Hới]]
["Quảng Trị, เวียดนาม"] = {}, -- capital [[Đông Hà]]
["Thanh Hoá, เวียดนาม"] = {}, -- capital [[Thanh Hoá]]
["Thanh Hóa, เวียดนาม"] = {alias_of = "Thanh Hoá, เวียดนาม", display = true},
-- ["Hue"] = {placetype = {"เทศบาล", "นคร"}, wp = "Huế"}, -- capital [[Thuận Hoá district]]
-- [[Central Highlands (Vietnam)|Central Highlands]] region
["Đắk Lắk, เวียดนาม"] = {}, -- capital [[Buôn Ma Thuột]]
["Đăk Nông, เวียดนาม"] = {}, -- capital [[Gia Nghĩa]]
["Gia Lai, เวียดนาม"] = {}, -- capital [[Pleiku]]
["Kon Tum, เวียดนาม"] = {}, -- capital [[Kon Tum]]
["Lâm Đồng, เวียดนาม"] = {}, -- capital [[Đà Lạt]]
-- [[South Central Coast]] region
["Bình Định, เวียดนาม"] = {}, -- capital [[Quy Nhon]]
["Bình Thuận, เวียดนาม"] = {}, -- capital [[Phan Thiết]]
["Khánh Hoà, เวียดนาม"] = {}, -- capital [[Nha Trang]]
["Khánh Hòa, เวียดนาม"] = {alias_of = "Khánh Hoà, เวียดนาม", display = true},
["Ninh Thuận, เวียดนาม"] = {}, -- capital [[Phan Rang–Tháp Chàm]]
["Phú Yên, เวียดนาม"] = {}, -- capital [[Tuy Hoà]]
["Quảng Nam, เวียดนาม"] = {}, -- capital [[Tam Kỳ]]
["Quảng Ngãi, เวียดนาม"] = {}, -- capital [[Quảng Ngãi]]
-- ["Da Nang"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hải Châu district]]
-- [[Southeast (Vietnam)|Southeast]] region
["Bà Rịa–Vũng Tàu, เวียดนาม"] = {}, -- capital [[Bà Rịa]]
["Bình Dương, เวียดนาม"] = {}, -- capital [[Thủ Dầu Một]]
["Bình Phước, เวียดนาม"] = {}, -- capital [[Đồng Xoài]]
["Đồng Nai, เวียดนาม"] = {}, -- capital [[Biên Hoà]]
["Tây Ninh, เวียดนาม"] = {}, -- capital [[Tây Ninh]]
-- ["Ho Chi Minh City"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']]
-- [[Mekong Delta]] region
["An Giang, เวียดนาม"] = {}, -- capital [[Long Xuyên]]
["Bạc Liêu, เวียดนาม"] = {}, -- capital [[Bạc Liêu]]
["Bến Tre, เวียดนาม"] = {}, -- capital [[Bến Tre]]
["Cà Mau, เวียดนาม"] = {}, -- capital [[Cà Mau]]
["Đồng Tháp, เวียดนาม"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]]
["Hậu Giang, เวียดนาม"] = {}, -- capital [[Vị Thanh]]
["Kiên Giang, เวียดนาม"] = {}, -- capital [[Rạch Giá]]
["Long An, เวียดนาม"] = {}, -- capital [[Tân An]]
["Sóc Trăng, เวียดนาม"] = {}, -- capital [[Sóc Trăng]]
["Tiền Giang, เวียดนาม"] = {}, -- capital [[Mỹ Tho]]
["Trà Vinh, เวียดนาม"] = {}, -- capital [[Trà Vinh]]
["Vĩnh Long, เวียดนาม"] = {}, -- capital [[Vĩnh Long]]
-- ["Can Tho"] = {placetype = {"เทศบาล", "นคร"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]]
}
-- provinces of Vietnam
export.vietnam_group = {
key_to_placename = make_key_to_placename(", เวียดนาม$"),
placename_to_key = make_placename_to_key(", เวียดนาม"),
default_container = "เวียดนาม",
default_placetype = "จังหวัด",
-- There may not be enough districts to subcategorize like this.
-- default_divs = "อำเภอ",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "จังหวัด%e",
data = export.vietnam_provinces,
}
-----------------------------------------------------------------------------------
-- City data --
-----------------------------------------------------------------------------------
export.australia_cities = {
["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration)
["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate])
["Canberra"] = {container = {key = "Australian Capital Territory, ออสเตรเลีย", placetype = "ดินแดน"}}, -- 510,641 (2024 estimate)
["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration)
["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate)
["Newcastle"] = {alias_of = "Newcastle, New South Wales"},
["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration)
["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration)
}
export.australia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", ออสเตรเลีย", "รัฐ"),
default_placetype = "นคร",
data = export.australia_cities,
}
export.brazil_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos)
["Sao Paulo"] = {alias_of = "São Paulo", display = true},
["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area)
["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000
["Recife"] = {container = "Pernambuco"}, -- 4,100,000
["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area)
["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000
["Brasilia"] = {alias_of = "Brasília", display = true},
["Fortaleza"] = {container = "Ceará"}, -- 3,825,000
["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000
["Curitiba"] = {container = "Paraná"}, -- 3,375,000
["Campinas"] = {container = "São Paulo"}, -- 3,250,000
["Goiânia"] = {container = "Goiás"}, -- 2,525,000
["Goiania"] = {alias_of = "Goiânia", display = true},
["Manaus"] = {container = "Amazonas"}, -- 2,275,000
["Belém"] = {container = "Pará"}, -- 2,200,000
["Belem"] = {alias_of = "Belém", display = true},
["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000
["Vitoria"] = {alias_of = "Vitória", display = true},
["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000
["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000
["Sao Luis"] = {alias_of = "São Luís", display = true},
["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000
["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000
["Florianopolis"] = {alias_of = "Florianópolis", display = true},
["Maceió"] = {container = "Alagoas"}, -- 1,220,000
["Maceio"] = {alias_of = "Maceió", display = true},
["João Pessoa"] = {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000
["Joao Pessoa"] = {alias_of = "João Pessoa", display = true},
["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000
["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true},
["Londrina"] = {container = "Paraná"}, -- 1,050,000
["Teresina"] = {container = "Piauí"}, -- 1,040,000
}
export.brazil_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", บราซิล", "รัฐ"),
default_placetype = "นคร",
data = export.brazil_cities,
}
export.canada_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton)
["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area)
["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area)
["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area)
["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area)
["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area)
["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census)
["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census)
["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census)
["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census)
}
export.canada_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด"),
default_placetype = "นคร",
data = export.canada_cities,
}
export.france_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration)
["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration)
["Lyons"] = {alias_of = "Lyon", display = true},
["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration)
["Marseilles"] = {alias_of = "Marseille", display = true},
["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration)
["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration)
["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration)
["Nice"] = {container = "Provence-Alpes-Côte d'Azur"},
["Nantes"] = {container = "Pays de la Loire"},
["Strasbourg"] = {container = "Grand Est"},
["Rennes"] = {container = "Brittany"},
}
export.france_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"),
default_placetype = "นคร",
data = export.france_cities,
}
export.germany_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area)
["Cologne"] = {container = "North Rhine-Westphalia"},
["Köln"] = {alias_of = "Cologne", display = true},
["Düsseldorf"] = {container = "North Rhine-Westphalia"},
["Dusseldorf"] = {alias_of = "Düsseldorf", display = true},
["Dortmund"] = {container = "North Rhine-Westphalia"},
["Essen"] = {container = "North Rhine-Westphalia"},
["Duisburg"] = {container = "North Rhine-Westphalia"},
["Berlin"] = {}, -- 4,700,000
["Frankfurt"] = {container = "Hesse"}, -- 3,225,000
["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer
["Hamburg"] = {}, -- 2,900,000
["Munich"] = {container = "Bavaria"}, -- 2,300,000
["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000
["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000
["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000
["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000
["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000
["Leipzig"] = {container = "Saxony"}, -- 1,080,000
["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000
["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias
["Bremen"] = {},
}
export.germany_cities_group = {
default_container = "เยอรมนี",
canonicalize_key_container = make_canonicalize_key_container(", เยอรมนี", "รัฐ"),
default_placetype = "นคร",
data = export.germany_cities,
}
export.india_cities = {
-- This lists the 65 metro areas per Demographia's 2023 estimates, as found in
-- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was
-- conducted in 2011, and the results are not accurate any more.
["Delhi"] = {container = {key = "Delhi, อินเดีย", placetype = "union territory"}}, -- 31,190,000
["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000
["Kolkata"] = {container = "West Bengal"}, -- 21,747,000
["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000
["Bengaluru"] = {alias_of = "Bangalore"},
["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000
["Hyderabad"] = {container = "Telangana"}, -- 9,797,000
["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000
["Pune"] = {container = "Maharashtra"}, -- 6,819,000
["Surat"] = {container = "Gujarat"}, -- 6,601,000
["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000
["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000
["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000
["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000
["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000
["Patna"] = {container = "Bihar"}, -- 3,331,000
["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000
["Kozhikode"] = {container = "Kerala"}, -- 3,049,000
["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000
["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000
["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000
["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000
["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000
["Prayagraj"] = {alias_of = "Allahabad"},
["Kochi"] = {container = "Kerala"}, -- 2,381,000
["Ludhiana"] = {container = "Punjab"}, -- 2,205,000
["Vadodara"] = {container = "Gujarat"}, -- 2,182,000
["Chandigarh"] = {container = {key = "Chandigarh, อินเดีย", placetype = "union territory"}}, -- 2,168,000
["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000
["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000
["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000
["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000
["Malappuram"] = {container = "Kerala"}, -- 1,868,000
["Nashik"] = {container = "Maharashtra"}, -- 1,810,000
["Asansol"] = {container = "West Bengal"}, -- 1,720,000
["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000
["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000
["Thrissur"] = {container = "Kerala"}, -- 1,578,000
["Kollam"] = {container = "Kerala"}, -- 1,576,000
["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000
["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000
["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000
["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000
["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"},
["Rajkot"] = {container = "Gujarat"}, -- 1,487,000
["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000
["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000
["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000
["Kannur"] = {container = "Kerala"}, -- 1,360,000
["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000
["Guwahati"] = {container = "Assam"}, -- 1,355,000
["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000
["Amritsar"] = {container = "Punjab"}, -- 1,313,000
["Mysore"] = {container = "Karnataka"}, -- 1,296,000
["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000
["Durg-Bhilainagar"] = {alias_of = "Bhilai"},
["Durg-Bhilai"] = {alias_of = "Bhilai"},
["Durg"] = {alias_of = "Bhilai"},
["Bhilainagar"] = {alias_of = "Bhilai"},
["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000
["Srinagar"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, -- 1,212,000
["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000
["Kota"] = {container = "Rajasthan"}, -- 1,172,000
["Jalandhar"] = {container = "Punjab"}, -- 1,165,000
["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000
["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000
["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000
["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000
["Jammu"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, -- 1,103,000
["Solapur"] = {container = "Maharashtra"}, -- 1,082,000
["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash
["Hubli"] = {alias_of = "Hubli-Dharwad"},
["Dharwad"] = {alias_of = "Hubli-Dharwad"},
["Puducherry"] = {container = {key = "Puducherry, อินเดีย", placetype = "union territory"}}, -- 1,024,000
["Pondicherry"] = {alias_of = "Puducherry", display = true},
-- satellite/secondary cities of metro area (none in citypopulation.de)
["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area
["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area
["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true},
["Kalyan"] = {alias_of = "Kalyan-Dombivli"},
["Dombivli"] = {alias_of = "Kalyan-Dombivli"},
["Dombivali"] = {alias_of = "Kalyan-Dombivli"},
["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area
["Vasai"] = {alias_of = "Vasai-Virar"},
["Virar"] = {alias_of = "Vasai-Virar"},
["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area
["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area
["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area
["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true},
}
export.india_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", อินเดีย", "รัฐ"),
default_placetype = "นคร",
data = export.india_cities,
}
export.indonesia_cities = {
-- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate
["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = {
{type = "ตำบล", container_parent_type = false},
}},
["Surabaya"] = {container = "East Java"},
["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area
["Bandung"] = {container = "West Java"},
["Medan"] = {container = "North Sumatra"},
["Depok"] = {container = "West Java"}, -- part of Jakarta metro area
["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Palembang"] = {container = "South Sumatra"},
["Semarang"] = {container = "Central Java"},
["Makassar"] = {container = "South Sulawesi"},
["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Batam"] = {container = "Riau Islands"},
["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area
["Pekanbaru"] = {container = "Riau"},
["Bandar Lampung"] = {container = "Lampung"},
-- other metro areas over 1,000,000 people
["Padang"] = {container = "West Sumatra"},
["Samarinda"] = {container = "East Kalimantan"},
["Malang"] = {container = "East Java"},
["Yogyakarta"] = {container = "Special Region of Yogyakarta"},
["Denpasar"] = {container = "Bali"},
["Cirebon"] = {container = "West Java"},
["Surakarta"] = {container = "Central Java"},
["Banjarmasin"] = {container = "South Kalimantan"},
["Tasikmalaya"] = {container = "West Java"},
}
export.indonesia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", อินโดนีเซีย", "จังหวัด"),
default_placetype = "นคร",
data = export.indonesia_cities,
}
export.italy_cities = {
-- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used
-- here, only gives estimates as of Jan 1, 2014.
["Milan"] = {container = "Lombardy"}, -- 6,623,798
["Naples"] = {container = "Campania"}, -- 5,294,546
["Rome"] = {container = "Lazio"}, -- 4,447,881
["Turin"] = {container = "Piedmont"}, -- 1,865,284
["Venice"] = {container = "Veneto"}, -- 1,645,900
["Florence"] = {container = "Tuscany"}, -- 1,485,030
["Bari"] = {container = "Apulia"}, -- 1,257,459
["Palermo"] = {container = "Sicily"}, -- 1,183,084
-- include a few just below 1,000,000 metro area that may be above it by now (depending on the definition).
["Catania"] = {container = "Sicily"}, -- 988,240
["Brescia"] = {container = "Lombardy"}, -- 924,090
["Genoa"] = {container = "Liguria"}, -- 861,318
}
export.italy_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Italy", "ภูมิภาค"),
default_placetype = "นคร",
data = export.italy_cities,
}
export.japan_cities = {
-- Population figures from [[w:List of cities in Japan]]. Metro areas from
-- [[w:List of metropolitan areas in Japan]].
["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])",
placetype = {"นคร", "จังหวัด"},
divs = {
{type = "special wards", container_parent_type = false},
{type = "นคร", prep = "ใน"},
},
},
["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894
["Osaka"] = {container = "Osaka"}, -- 2,668,586
["Nagoya"] = {container = "Aichi"}, -- 2,283,289
-- FIXME, Hokkaido is handled specially.
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096
["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527
["Kobe"] = {container = "Hyōgo"}, -- 1,530,847
["Kyoto"] = {container = "Kyoto"}, -- 1,474,570
["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630
["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418
["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806
["Sendai"] = {container = "Miyagi"}, -- 1,029,552
-- the remaining cities are considered "central cities" in a 1,000,000+ metro area
-- (sometimes there is more than one central city in the area).
["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998
["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695
["Sakai"] = {container = "Osaka"}, -- 835,333
["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053
["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431
["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944
["Sagamihara"] = {container = "Kanagawa"}, -- 706,342
["Okayama"] = {container = "Okayama"}, -- 701,293
["Kumamoto"] = {container = "Kumamoto"}, -- 670,348
["Kagoshima"] = {container = "Kagoshima"}, -- 605,196
-- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka)
-- with population in the range 509k - 587k because not central cities in any
-- 1,000,000+ metro area.
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833
}
export.japan_cities_group = {
default_container = "ญี่ปุ่น",
canonicalize_key_container = make_canonicalize_key_container(", ญี่ปุ่น", "จังหวัด"),
default_placetype = "นคร",
data = export.japan_cities,
}
export.mexico_cities = {
["Mexico City"] = {}, -- its own state
["Monterrey"] = {container = "Nuevo León"},
["Guadalajara"] = {container = "Jalisco"},
["Puebla"] = {container = "Puebla", wp = "%l (city)"},
["Toluca"] = {container = "State of Mexico"},
["Tijuana"] = {container = "Baja California"},
-- Include the state in the category for León due to possible confusion with León, Spain.
["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"},
["León"] = {alias_of = "León, Guanajuato"},
["Leon"] = {alias_of = "León, Guanajuato", display = true},
["Querétaro"] = {container = "Querétaro", wp = "%l (city)"},
["Queretaro"] = {alias_of = "Querétaro", display = true},
["Ciudad Juárez"] = {container = "Chihuahua"},
["Juárez"] = {alias_of = "Ciudad Juárez"},
["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"},
["Torreón"] = {container = "Coahuila"},
["Torreon"] = {alias_of = "Torreón", display = true},
-- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or
-- Mérida, Venezuela.
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"},
["Mérida"] = {alias_of = "Mérida, Yucatán"},
["Merida"] = {alias_of = "Mérida, Yucatán", display = true},
["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"},
["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true},
["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"},
["Mexicali"] = {container = "Baja California"},
}
export.mexico_cities_group = {
default_container = "Mexico",
canonicalize_key_container = make_canonicalize_key_container(", Mexico", "รัฐ"),
default_placetype = "นคร",
data = export.mexico_cities,
}
export.nigeria_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability)
["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability)
["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability)
["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "federal territory"}}, -- 3,050,000 (unindicated; population of low reliability)
["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability)
["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability)
["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability)
["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability)
["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability)
["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability)
["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability)
["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; population of low reliability)
["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability)
["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability)
["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability)
}
export.nigeria_cities_group = {
default_container = "Nigeria",
canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "รัฐ"),
default_placetype = "นคร",
data = export.nigeria_cities,
}
export.pakistan_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area)
["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area)
["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad)
["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "federal territory"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi)
["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area)
["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area)
-- there is also Hyderabad in India (very confusing)
["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area)
["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"},
["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area)
["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area)
["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area)
["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area)
["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area)
}
export.pakistan_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "จังหวัด"),
default_placetype = "นคร",
data = export.pakistan_cities,
}
export.philippines_cities = {
-- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts.
-- Other cities outside Metro Manila skipped because they are not the central city of their urban area.
["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}},
-- Don't display-canonicalize Foo to Foo City as it may make the display weird.
["Quezon"] = {alias_of = "Quezon City"},
["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}},
["Davao City"] = {container = "Davao del Sur"},
["Davao"] = {alias_of = "Davao City"},
["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}},
["Zamboanga City"] = {container = "Zamboanga del Sur"},
["Zamboanga"] = {alias_of = "Zamboanga City"},
["Cebu City"] = {container = "Cebu"},
["Cebu"] = {alias_of = "Cebu City"},
["Antipolo"] = {container = "Rizal"},
["Cagayan de Oro"] = {container = "Misamis Oriental"},
["Dasmariñas"] = {container = "Cavite"},
["Dasmarinas"] = {alias_of = "Dasmariñas", display = true},
["General Santos"] = {container = "South Cotabato"},
["San Jose del Monte"] = {container = "Bulacan"},
["Bacolod"] = {container = "Negros Occidental"},
["Calamba"] = {container = "Laguna", wp = "%l, %c"},
["Angeles"] = {container = "Pampanga", wp = "Angeles City"},
["Angeles City"] = {alias_of = "Angeles"},
["Iloilo City"] = {container = "Iloilo"},
["Iloilo"] = {alias_of = "Iloilo City"},
}
export.philippines_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Philippines", "จังหวัด"),
default_placetype = "นคร",
data = export.philippines_cities,
}
export.russia_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Moscow"] = {}, -- 18,800,000 (Agglomeration)
["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration)
["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration)
["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration)
["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration)
["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration)
["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration)
["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration)
["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true},
["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration)
["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration)
["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration)
["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration)
["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration)
["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration)
["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration)
["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration)
["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration)
}
export.russia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"),
default_container = "Russia",
default_placetype = "นคร",
data = export.russia_cities,
}
export.saudi_arabia_cities = {
-- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are
-- metro, urban or city proper figures.
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jedda"] = {alias_of = "Jeddah", display = true},
["Jiddah"] = {alias_of = "Jeddah", display = true},
["Jidda"] = {alias_of = "Jeddah", display = true},
["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Makkah"] = {alias_of = "Mecca", display = true},
["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City)
["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true},
}
export.saudi_arabia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "จังหวัด"),
default_placetype = "นคร",
data = export.saudi_arabia_cities,
}
export.south_korea_cities = {
-- All cities listed are not associated with any county.
["Seoul"] = {},
["Busan"] = {},
["Incheon"] = {},
["Daegu"] = {},
["Daejeon"] = {},
["Gwangju"] = {},
["Ulsan"] = {},
}
export.south_korea_cities_group = {
default_container = "South Korea",
canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "จังหวัด"),
default_placetype = "นคร",
data = export.south_korea_cities,
}
export.spain_cities = {
["Madrid"] = {container = "Community of Madrid"},
["Barcelona"] = {container = "Catalonia"},
["Valencia"] = {container = "Valencia"},
["Seville"] = {container = "Andalusia"},
["Bilbao"] = {container = "Basque Country"},
}
export.spain_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"),
default_placetype = "นคร",
data = export.spain_cities,
}
export.taiwan_cities = {
["New Taipei City"] = {},
["New Taipei"] = {alias_of = "New Taipei City", display = true},
["Taichung"] = {},
["Kaohsiung"] = {wp = "%l, ไต้หวัน"},
["Taipei"] = {},
["Taoyuan"] = {},
["Tainan"] = {},
-- these last three are not special municipalities
["Chiayi"] = {placetype = "นคร"},
["Hsinchu"] = {placetype = "นคร"},
["Keelung"] = {placetype = "นคร"},
}
export.taiwan_cities_group = {
placename_to_key = false, -- don't add ", ไต้หวัน" to make the key
canonicalize_key_container = make_canonicalize_key_container(", ไต้หวัน", "เทศมณฑล"),
default_container = "ไต้หวัน",
default_placetype = {"special municipality", "เทศบาล", "นคร"},
default_is_city = true,
default_divs = {"อำเภอ"},
data = export.taiwan_cities,
}
-- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct,
-- everything else will be figured out.
export.united_kingdom_cities = {
["London"] = {container = "Greater London"},
["Manchester"] = {container = "Greater Manchester"},
["Birmingham"] = {container = "West Midlands"},
["Liverpool"] = {container = "Merseyside"},
["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}},
["Leeds"] = {container = "West Yorkshire"},
["Newcastle upon Tyne"] = {container = "Tyne and Wear"},
["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
["Bristol"] = {container = {key = "England", placetype = "constituent country"}},
["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}},
["Portsmouth"] = {container = "Hampshire"},
["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}},
-- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]]
["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}},
["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"},
}
export.united_kingdom_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", England", "เทศมณฑล"),
default_placetype = "นคร",
data = export.united_kingdom_cities,
}
export.united_states_cities = {
-- top 50 CSAs by population, with the top and sometimes 2nd or 3rd city listed
["New York City"] = {container = "New York", wp = "%l", divs = {
{type = "boroughs", container_parent_type = false},
}},
-- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York).
["New York"] = {alias_of = "New York City"},
["Newark"] = {container = "New Jersey"},
["Los Angeles"] = {container = "California", wp = "%l"},
["Long Beach"] = {container = "California"},
["Riverside"] = {container = "California"},
["Chicago"] = {container = "Illinois", wp = "%l"},
["Washington, D.C."] = {wp = "%l"},
["Washington, DC"] = {alias_of = "Washington, D.C.", display = true},
["Washington D.C."] = {alias_of = "Washington, D.C.", display = true},
["Washington DC"] = {alias_of = "Washington, D.C.", display = true},
-- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of
-- Columbia holonym).
["Washington"] = {alias_of = "Washington, D.C."},
["Baltimore"] = {container = "Maryland", wp = "%l"},
-- to avoid conflict with San Jose in Costa Rica
["San Jose, California"] = {container = "California"},
["San Jose"] = {alias_of = "San Jose, California"},
["San Francisco"] = {container = "California", wp = "%l"},
["Oakland"] = {container = "California"},
["Boston"] = {container = "Massachusetts", wp = "%l"},
["Providence"] = {container = "Rhode Island"},
["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Fort Worth"] = {container = "Texas"},
["Philadelphia"] = {container = "Pennsylvania", wp = "%l"},
["Houston"] = {container = "Texas", wp = "%l"},
["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"},
["Atlanta"] = {container = "Georgia", wp = "%l"},
["Detroit"] = {container = "Michigan", wp = "%l"},
["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"},
["Mesa"] = {container = "Arizona"},
["Seattle"] = {container = "Washington", wp = "%l"},
["Orlando"] = {container = "Florida"},
["Minneapolis"] = {container = "Minnesota", wp = "%l"},
["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"},
["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"},
["Portland"] = {container = "Oregon"},
["Tampa"] = {container = "Florida"},
["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"},
["Saint Louis"] = {alias_of = "St. Louis", display = true},
["Charlotte"] = {container = "North Carolina"},
["Sacramento"] = {container = "California"},
["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"},
["Salt Lake City"] = {container = "Utah", wp = "%l"},
["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Columbus"] = {container = "Ohio"},
["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"},
["Indianapolis"] = {container = "Indiana", wp = "%l"},
["Las Vegas"] = {container = "Nevada", wp = "%l"},
["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Austin"] = {container = "Texas"},
["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"},
["Raleigh"] = {container = "North Carolina"},
["Nashville"] = {container = "Tennessee"},
["Virginia Beach"] = {container = "Virginia"},
["Norfolk"] = {container = "Virginia"},
["Greensboro"] = {container = "North Carolina"},
["Winston-Salem"] = {container = "North Carolina"},
["Jacksonville"] = {container = "Florida"},
["New Orleans"] = {container = "Louisiana", wp = "%l"},
["Louisville"] = {container = "Kentucky"},
["Greenville"] = {container = "South Carolina"},
["Hartford"] = {container = "Connecticut"},
["Oklahoma City"] = {container = "Oklahoma", wp = "%l"},
["Grand Rapids"] = {container = "Michigan"},
["Memphis"] = {container = "Tennessee"},
["Birmingham, Alabama"] = {container = "Alabama"},
["Birmingham"] = {alias_of = "Birmingham, Alabama"},
["Fresno"] = {container = "California"},
["Richmond"] = {container = "Virginia"},
["Harrisburg"] = {container = "Pennsylvania"},
-- any major city of the top 50 MSAs that was missed by the list above
["Buffalo"] = {container = "New York"},
-- any of the top 50 cities by city population that was missed by the lists above
["El Paso"] = {container = "Texas"},
["Albuquerque"] = {container = "New Mexico"},
["Tucson"] = {container = "Arizona"},
["Colorado Springs"] = {container = "Colorado"},
["Omaha"] = {container = "Nebraska"},
["Tulsa"] = {container = "Oklahoma"},
-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}
export.united_states_cities_group = {
default_container = "สหรัฐอเมริกา",
canonicalize_key_container = make_canonicalize_key_container(", USA", "รัฐ"),
default_placetype = "นคร",
default_wp = "%l, %c",
data = export.united_states_cities,
}
export.new_york_boroughs = {
["Bronx"] = {the = true, wp = "The Bronx"},
["Brooklyn"] = {},
["Manhattan"] = {},
["Queens"] = {},
["Staten Island"] = {},
}
export.new_york_boroughs_group = {
default_container = {key = "New York City", placetype = "นคร"},
default_placetype = "borough",
default_is_city = true,
data = export.new_york_boroughs,
}
export.vietnam_cities = {
-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
["Saigon"] = {alias_of = "Ho Chi Minh City"},
["Hanoi"] = {}, -- 7,350,000 (Agglomeration)
["Da Nang"] = {}, -- 1,500,000 (Agglomeration)
["Danang"] = {alias_of = "Da Nang", display = true},
["Haiphong"] = {}, -- 1,450,000 (Agglomeration)
["Hai Phong"] = {alias_of = "Haiphong", display = true},
-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
-- meaning it is directly under its province as opposed to being contained in a district.
["Bien Hoa"] = {placetype = "นคร", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia)
["Biên Hòa"] = {alias_of = "Bien Hoa", display = true},
["Biên Hoà"] = {alias_of = "Bien Hoa", display = true},
-- These two are not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are
-- both province-level municipalities and close to the 1,000,000 mark.
["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]]
["Cần Thơ"] = {alias_of = "Can Tho", display = true},
["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); capital [[Thuận Hóa district]]
["Huế"] = {alias_of = "Hue", display = true},
}
export.vietnam_cities_group = {
placename_to_key = false, -- don't add ", เวียดนาม" to make the key
default_container = "เวียดนาม",
canonicalize_key_container = make_canonicalize_key_container(", เวียดนาม", "จังหวัด"),
-- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of
-- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct
-- known locations.
default_placetype = {"เทศบาล", "นคร"},
default_is_city = true,
-- There may not be enough districts to subcategorize like this.
-- default_divs = "อำเภอ",
data = export.vietnam_cities,
}
export.misc_cities = {
------------------ Africa -------------------
-- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from
-- [[w:List of urban areas in Africa by population]].
["Algiers"] = {container = "แอลจีเรีย"}, -- 4,325,000 (Consolidated Urban Area)
["Oran"] = {container = "แอลจีเรีย"}, -- 1,640,000 (Consolidated Urban Area)
["Luanda"] = {container = "แองโกลา"}, -- 9,650,000 (Urban Area)
["Benguela"] = {container = "แองโกลา"}, -- 1,420,000 (Urban Area)
["Cotonou"] = {container = "เบนิน"}, -- 2,150,000 (Agglomeration)
["Ouagadougou"] = {container = "บูร์กินาฟาโซ"}, -- 3,425,000 (Agglomeration)
["Bobo-Dioulasso"] = {container = "บูร์กินาฟาโซ"}, -- 1,100,000 (Agglomeration)
["Bujumbura"] = {container = "บุรุนดี"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia)
["Yaoundé"] = {container = "แคเมอรูน"}, -- 3,975,000 (City)
["Yaounde"] = {alias_of = "Yaoundé", display = true},
["Douala"] = {container = "แคเมอรูน"}, -- 3,900,000 (City)
["Bangui"] = {container = "สาธารณรัฐแอฟริกากลาง"}, -- 1,680,000 (Agglomeration)
["N'Djamena"] = {container = "ชาด"}, -- 1,950,000 (City)
["Ndjamena"] = {alias_of = "N'Djamena", display = true},
["Kinshasa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 16,300,000 (City; population of low reliability)
["Lubumbashi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,875,000 (City; population of low reliability)
["Mbuji-Mayi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,500,000 (City; population of low reliability)
["Kananga"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,370,000 (City; population of low reliability)
["Kisangani"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,300,000 (City; population of low reliability)
["Bukavu"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,100,000 (City; population of low reliability)
["Goma"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,010,000 (City; population of low reliability)
["Tshikapa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de)
["Cairo"] = {container = "อียิปต์"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima)
["Alexandria"] = {container = "อียิปต์"}, -- 6,250,000 (Agglomeration)
["Giza"] = {container = "อียิปต์"}, -- 4,458,135 (2023 from citypopulation.de)
["Shubra El Kheima"] = {container = "อียิปต์"}, -- 1,240,239 (2021 from citypopulation.de)
["Asmara"] = {container = "เอริเทรีย"}, -- 1,090,000 (City; population of low reliability)
["Asmera"] = {alias_of = "Asmara", display = true},
["Addis Ababa"] = {container = "เอธิโอเปีย"}, -- 4,825,000 (Agglomeration)
["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration)
["Accra"] = {container = "กานา"}, -- 6,800,000 (Agglomeration)
["Kumasi"] = {container = "กานา"}, -- 2,900,000 (Agglomeration)
["Conakry"] = {container = "กินี"}, -- 2,975,000 (Consolidated Urban Area)
["Abidjan"] = {container = "โกตดิวัวร์"}, -- 7,050,000 (Agglomeration)
["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated)
["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City)
["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area)
["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City)
["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration)
["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City)
["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "ภูมิภาค"}}, -- 4,450,000 (Municipality (urban population))
["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "ภูมิภาค"}}, -- 2,125,000 (Municipality (urban population))
["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "ภูมิภาค"}}, -- 1,410,000 (Municipality (urban population))
["Tanger"] = {alias_of = "Tangier", display = true},
["Tangiers"] = {alias_of = "Tangier", display = true},
["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "ภูมิภาค"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population))
["Fes"] = {alias_of = "Fez", display = true},
["Fès"] = {alias_of = "Fez", display = true},
["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "ภูมิภาค"}}, -- 1,270,000 (Municipality (urban population))
["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "ภูมิภาค"}}, -- 1,140,000 (Municipality (urban population))
["Marrakech"] = {alias_of = "Marrakesh", display = true},
["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration)
["Niamey"] = {container = "Niger"}, -- 1,530,000 (City)
["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration)
["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City)
["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population))
["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration)
["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration)
["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration)
["Mogadishu"] = {container = "โซมาเลีย"}, -- 2,250,000 (unindicated; population of low reliability)
["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.)
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "จังหวัด"}}, -- 5,100,000 (Consolidated Urban Area)
["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "จังหวัด"}}, -- 3,900,000 (Consolidated Urban Area)
["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 2,921,488 (2011 census)
["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "จังหวัด"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area)
["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias
["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability)
["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration)
["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration)
["Mwanza City"] = {alias_of = "Mwanza", display = true},
["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration)
["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration)
["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated)
["Lome"] = {alias_of = "Lomé", display = true},
["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population))
["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population))
["Soussa"] = {alias_of = "Sousse", display = true},
["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated)
["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area)
["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration)
------------------ Asia -------------------
-- sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Kabul"] = {container = "อัฟกานิสถาน"}, -- 5,250,000 (Agglomeration)
["Baku"] = {container = "อาเซอร์ไบจาน"}, -- 3,725,000 (Administrative Area (urban population))
["Manama"] = {container = "บาห์เรน"}, -- 1,560,000 (unindicated)
["Dhaka"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 23,100,000 (Agglomeration)
["Dacca"] = {alias_of = "Dhaka", display = true},
["Chittagong"] = {container = {key = "Chittagong Division, บังกลาเทศ", placetype = "division"}}, -- 5,050,000 (Agglomeration)
["Gazipur"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
["Khulna"] = {container = {key = "Khulna Division, บังกลาเทศ", placetype = "division"}}, -- 1,210,000 (Agglomeration)
["Phnom Penh"] = {container = "กัมพูชา"}, -- 2,925,000 (Agglomeration)
["Tehran"] = {container = {key = "Tehran, อิหร่าน", placetype = "จังหวัด"}}, -- 16,800,000 (Agglomeration)
["Teheran"] = {alias_of = "Tehran", display = true},
["Mashhad"] = {container = {key = "Razavi Khorasan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,475,000 (Agglomeration)
["Mashad"] = {alias_of = "Mashhad", display = true},
["Meshhed"] = {alias_of = "Mashhad", display = true},
["Meshed"] = {alias_of = "Mashhad", display = true},
["Isfahan"] = {container = {key = "Isfahan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,425,000 (Agglomeration)
["Esfahan"] = {alias_of = "Isfahan", display = true},
["Tabriz"] = {container = {key = "East Azerbaijan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,970,000 (Agglomeration)
["Shiraz"] = {container = {key = "Fars, อิหร่าน", placetype = "จังหวัด"}}, -- 1,950,000 (Agglomeration)
["Ahvaz"] = {container = {key = "Khuzestan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,550,000 (Agglomeration)
["Qom"] = {container = {key = "Qom, อิหร่าน", placetype = "จังหวัด"}}, -- 1,450,000 (City)
["Kermanshah"] = {container = {key = "Kermanshah, อิหร่าน", placetype = "จังหวัด"}}, -- 1,130,000 (City)
["Baghdad"] = {container = "อิรัก"}, -- 7,800,000 (Administrative Area (urban population))
["Basra"] = {container = "อิรัก"}, -- 1,710,000 (Administrative Area (urban population))
["Mosul"] = {container = "อิรัก"}, -- 1,550,000 (Administrative Area (urban population))
["Erbil"] = {container = "อิรัก"}, -- 1,220,000 (Administrative Area (urban population))
["Kirkuk"] = {container = "อิรัก"}, -- 1,160,000 (Administrative Area (urban population))
["Najaf"] = {container = "อิรัก"}, -- 1,050,000 (Administrative Area (urban population))
["Tel Aviv"] = {container = "อิสราเอล"}, -- 3,000,000 (Agglomeration)
-- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a
-- [[w:corpus separatum]], so put the container as "เอเชีย" and list Israel and Palestine as additional parents for
-- categorization purposes.
["Jerusalem"] = {container = {key = "เอเชีย", placetype = "ทวีป"},
addl_parents = {"อิสราเอล", "Palestine"}}, -- 1,080,000 (Agglomeration)
["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated)
["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated)
["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration)
["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize
["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration)
["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration)
["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration)
["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration)
["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability)
-- Kuala Lumpur is a federal capital city, not in any state
["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration)
-- there are various George Towns and Georgetowns
["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = "รัฐ"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration)
["George Town"] = {alias_of = "George Town, Malaysia"},
["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City)
["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true},
["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population))
["Rangoon"] = {alias_of = "Yangon", display = true},
["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population))
["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration)
-- Pyongyang is a directly governed city, not in any province
["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population))
["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration)
["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated)
["Gaza City"] = {alias_of = "Gaza"},
["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration)
["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated)
["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability)
["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability)
["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City)
["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration)
-- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia
-- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]]
["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "จังหวัด"}},
["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "จังหวัด"}}, -- 1,570,000 (Agglomeration; including Pattaya)
-- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021;
-- second source is citypopulation.de reference date 2025-01-01.
["Istanbul"] = {placetype = {"นคร", "จังหวัด"}, divs = {"อำเภอ"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration)
["İstanbul"] = {alias_of = "Istanbul", display = true},
["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "จังหวัด"}}, -- 5.15 million; 5,200,000 (Agglomeration)
["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "จังหวัด"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration)
["İzmir"] = {alias_of = "Izmir", display = true},
["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "จังหวัด"}}, -- 2.02 million; 2,200,000 (Agglomeration)
["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "จังหวัด"}}, -- 1.77 million; 1,780,000 (Agglomeration)
["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "จังหวัด"}}, -- 1.71 million; 1,750,000 (Agglomeration)
["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "จังหวัด"}}, -- 1.3 million; 1,400,000 (Agglomeration)
["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "จังหวัด"}}, -- 1.35 million; 1,390,000 (Agglomeration)
["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "จังหวัด"}}, -- 1.07 million; 1,100,000 (Agglomeration)
-- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not
-- display-canonicalize to the Turkish form Diyarbakır.
["Diyarbakir"] = {alias_of = "Diyarbakır"},
["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "จังหวัด"}}, -- 1.03 million; 1,060,000 (Agglomeration)
["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration)
["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah)
["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City)
["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai)
["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated)
["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability)
["Sana'a"] = {alias_of = "Sanaa", display = true},
["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia)
------------------ Europe or Europe-like (Caucasus etc.) ---------------------
["Yerevan"] = {container = "อาร์มีเนีย"}, -- 1,520,000 (Agglomeration)
["Vienna"] = {container = "ออสเตรีย"}, -- 2,375,000 (Agglomeration)
["Minsk"] = {container = "เบลารุส"}, -- 2,100,000 (unindicated)
["Brussels"] = {container = "เบลเยียม"}, -- 2,800,000 (Consolidated Urban Area)
["Antwerp"] = {container = "เบลเยียม"}, -- 1,270,000 (Consolidated Urban Area)
["Sofia"] = {container = "บัลแกเรีย"}, -- 1,260,000 (Agglomeration)
["Zagreb"] = {container = "โครเอเชีย"},
["Prague"] = {container = "สาธารณรัฐเช็ก"}, -- 1,470,000 (Agglomeration)
["Brno"] = {container = "สาธารณรัฐเช็ก"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office)
["Olomouc"] = {container = "สาธารณรัฐเช็ก"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms)
["Copenhagen"] = {container = "เดนมาร์ก"}, -- 1,800,000 (Consolidated Urban Area)
["Helsinki"] = {container = {key = "Uusimaa, ฟินแลนด์", placetype = "ภูมิภาค"}}, -- 1,560,000 (Consolidated Urban Area)
["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 (Agglomeration)
["Athens"] = {container = "กรีซ"},
["Thessaloniki"] = {container = "กรีซ"},
["Budapest"] = {container = "ฮังการี"},
-- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region"
["Dublin"] = {container = {key = "County Dublin, ไอร์แลนด์", placetype = "เทศมณฑล"}},
["Riga"] = {container = "Latvia"},
["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "จังหวัด"}},
["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}},
["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}},
-- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it.
["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "ภูมิภาค"}},
["Oslo"] = {container = {key = "Oslo, Norway", placetype = "เทศมณฑล"}},
["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}},
["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Krakow" without accent.
["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"},
["Kraków"] = {alias_of = "Krakow", display = true},
["Cracow"] = {alias_of = "Krakow", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent.
["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}},
["Gdansk"] = {alias_of = "Gdańsk", display = true},
["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}},
["Poznan"] = {alias_of = "Poznań", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Lodz" without accents.
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"},
["Łódź"] = {alias_of = "Lodz", display = true},
["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}},
["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}},
["Oporto"] = {alias_of = "Porto", display = true},
["Bucharest"] = {container = "Romania"},
["Belgrade"] = {container = "Serbia"},
["Stockholm"] = {container = "Sweden"},
["Zurich"] = {container = "Switzerland"},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Zurich" without umlaut.
--- Even Wikipedia uses the form without umlaut.
["Zürich"] = {alias_of = "Zurich", display = true},
["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast
-- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common.
["Kiev"] = {alias_of = "Kyiv"},
["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}},
["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"},
-- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement.
["Odesa"] = {alias_of = "Odessa"},
------------------ North America, South America ---------------------
-- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01);
-- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data;
-- Wikipedia city limits figures from [[w:List of largest cities in the Americas]].
["Buenos Aires"] = {container = "อาร์เจนตินา"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia)
["Córdoba, Argentina"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia)
-- to avoid confusion with Córdoba in Spain
["Córdoba"] = {alias_of = "Córdoba, Argentina"},
["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"},
["Rosario"] = {container = "อาร์เจนตินา", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia)
["Mendoza"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area)
["San Miguel de Tucumán"] = {container = "อาร์เจนตินา"}, -- 1,110,000 (Consolidated Urban Area)
["Tucumán"] = {alias_of = "San Miguel de Tucumán"},
["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"},
["Santa Cruz de la Sierra"] = {container = "โบลิเวีย"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia)
["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"},
["La Paz"] = {container = "โบลิเวีย"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz)
["El Alto"] = {container = "โบลิเวีย"},
["Cochabamba"] = {container = "โบลิเวีย"}, -- 1,280,000 (Consolidated Urban Area)
["Santiago"] = {container = "ชิลี"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? per Wikipedia)
["Valparaíso"] = {container = "ชิลี"}, -- 1,060,000 (Consolidated Urban Area)
["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area)
["Bogotá"] = {container = "โคลอมเบีย"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia)
["Bogota"] = {alias_of = "Bogotá", display = true},
["Medellín"] = {container = "โคลอมเบีย"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia)
["Medellin"] = {alias_of = "Medellín", display = true},
["Cali"] = {container = "โคลอมเบีย"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia)
["Barranquilla"] = {container = "โคลอมเบีย"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia)
["Bucaramanga"] = {container = "โคลอมเบีย"}, -- 1,380,000 (Agglomeration)
["Cartagena, Colombia"] = {container = "โคลอมเบีย", wp = "%l, %c"}, -- 1,250,000 (Agglomeration)
-- to avoid confusion with Cartagena, Spain
["Cartagena"] = {alias_of = "Cartagena, Colombia"},
["Cúcuta"] = {container = "โคลอมเบีย"}, -- 1,130,000 (Agglomeration)
["Cucuta"] = {alias_of = "Cúcuta", display = true},
-- to avoid conflict with San Jose, California
["San José, Costa Rica"] = {container = "คอสตาริกา", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia)
["San José"] = {alias_of = "San José, Costa Rica"},
["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME
["Havana"] = {container = "คิวบา"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia)
["Santo Domingo"] = {container = "สาธารณรัฐโดมินิกัน"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia)
["Guayaquil"] = {container = "เอกวาดอร์"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia)
["Quito"] = {container = "เอกวาดอร์"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? per Wikipedia)
["San Salvador"] = {container = "เอลซัลวาดอร์"}, -- 1,580,000 (Municipality (urban population))
["Guatemala City"] = {container = "กัวเตมาลา"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia)
["Port-au-Prince"] = {container = "เฮติ"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia)
["San Pedro Sula"] = {container = "ฮอนดูรัส"}, -- 1,330,000 (Consolidated Urban Area)
["Tegucigalpa"] = {container = "ฮอนดูรัส"}, -- 1,220,000 (Urban Area)
["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area)
["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area)
["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population))
["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia)
["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration)
["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area)
["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia)
["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia)
["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? per Wikipedia)
-- to avoid confusion with Valencia (city and autonomous community of Spain)
["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area)
["Valencia"] = {alias_of = "Valencia, Venezuela"},
["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area)
["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area)
}
export.misc_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"),
default_placetype = "นคร",
data = export.misc_cities,
}
--[==[ var:
List of all known locations, in groups. The first group lists continents and continental regions, followed by three
groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and
dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities
(administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United
Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the
hundreds).
]==]
export.locations = {
export.continents_group,
export.countries_group,
export.country_like_entities_group,
export.former_countries_group,
export.australia_group,
export.austria_group,
export.bangladesh_group,
export.brazil_group,
export.canada_group,
export.china_group,
export.china_prefecture_level_cities_group,
export.china_prefecture_level_cities_group_2,
export.egypt_group,
export.finland_group,
export.france_group,
export.france_departments_group,
export.germany_group,
export.greece_group,
export.india_group,
export.indonesia_group,
export.iran_group,
export.ireland_group,
export.italy_group,
export.japan_group,
export.laos_group,
export.lebanon_group,
export.malaysia_group,
export.malta_group,
export.mexico_group,
export.moldova_group,
export.morocco_group,
export.netherlands_group,
export.new_zealand_group,
export.nigeria_group,
export.north_korea_group,
export.norway_group,
export.pakistan_group,
export.philippines_group,
export.poland_group,
export.portugal_group,
export.romania_group,
export.russia_group,
export.saudi_arabia_group,
export.south_africa_group,
export.south_korea_group,
export.spain_group,
export.taiwan_group,
export.thailand_group,
export.turkey_group,
export.ukraine_group,
export.united_kingdom_group,
export.united_states_group,
export.england_group,
export.northern_ireland_group,
export.scotland_group,
export.wales_group,
export.vietnam_group,
export.australia_cities_group,
export.brazil_cities_group,
export.canada_cities_group,
export.france_cities_group,
export.germany_cities_group,
export.india_cities_group,
export.indonesia_cities_group,
export.italy_cities_group,
export.japan_cities_group,
export.mexico_cities_group,
export.nigeria_cities_group,
export.pakistan_cities_group,
export.philippines_cities_group,
export.russia_cities_group,
export.saudi_arabia_cities_group,
export.south_korea_cities_group,
export.spain_cities_group,
export.taiwan_cities_group,
export.united_kingdom_cities_group,
export.united_states_cities_group,
export.new_york_boroughs_group,
export.vietnam_cities_group,
export.misc_cities_group,
}
return export
pcb3s52s1ts3o7kiu4uflgy2c7ek2pw
มอดูล:place/placetypes
828
2297280
5720686
5715287
2026-04-21T01:22:10Z
OctraBot
3198
5720686
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true for testing
local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split
--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must
currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
the process correctly handles placetypes with qualifiers (including qualifiers that act similarly to "type-raising"
operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
containers of the known location being considered. This is done to prevent overcategorizing when either there are two
known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).
Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a
result are candidates for memoization to speed up the operation of {{tl|place}}.
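A minimal sketch of the memoization idea mentioned above, assuming a function of a single string argument. The names
`memoize`, `slow_lookup` and `fast_lookup` are hypothetical and do not exist in this module; the two functions
described above take several arguments, so a real cache key would have to combine all of them.

```lua
-- Hypothetical helper; not part of this module.
-- Wraps a one-argument function so that repeated calls with the same argument reuse the first result.
local function memoize(fn)
	local cache = {}
	return function(key)
		local cached = cache[key]
		if cached == nil then
			cached = fn(key)
			cache[key] = cached -- a nil result is not stored, so it is recomputed on the next call
		end
		return cached
	end
end

-- Stand-in for an expensive lookup; `calls` counts how often it really runs.
local calls = 0
local function slow_lookup(placetype)
	calls = calls + 1
	return {placetype = placetype}
end

local fast_lookup = memoize(slow_lookup)
local first = fast_lookup("tributary")
local second = fast_lookup("tributary") -- served from the cache; `slow_lookup` is not called again
```

Because the cached value is returned by reference, callers of such a wrapper would have to treat the result as
read-only.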
]==]
------------------------------------------------------------------------------------------
-- Basic utilities --
------------------------------------------------------------------------------------------
--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
return export.force_cat or m_locations.force_cat
end
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
function export.remove_links_and_html(text)
text = m_links.remove_links(text)
return (text:gsub("<.->", "")) -- parenthesized to drop gsub's second return value (the substitution count)
end
--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in
[[Module:en-utilities]] is meant to be invoked, which reverses the y -> ies change after consonants and the 'es'
addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural
'passes'; FIXME: consider changing this for words ending in '-sses'). If the generated singular is the same as the
passed-in value, nil is returned. NOTE: In this copy of the module the call to [[Module:en-utilities]] is commented
out, so only plurals listed in `plural_placetype_to_singular` are singularized and every other input returns nil.
]==]
function export.maybe_singularize_placetype(placetype)
if not placetype then
return nil
end
if export.plural_placetype_to_singular[placetype] then
return export.plural_placetype_to_singular[placetype]
end
local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
if retval == placetype then
return nil
end
return retval
end
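-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`. The intended fallback is pluralize() in [[Module:en-utilities]], which is
-- almost always correct, but that call is commented out in this copy of the module, so a placetype without an
-- explicit `plural` value is returned unchanged.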
function export.pluralize_placetype(placetype, do_ucfirst)
local ptdata = export.placetype_data[placetype]
if ptdata and ptdata.plural then
placetype = ptdata.plural
else
placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
end
if do_ucfirst then
return ucfirst(placetype)
else
return placetype
end
end
--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype
under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not
match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match
that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same
as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed
from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version
of the plural passed-in `placetype`.
]==]
function export.get_placetype_data(placetype, from_category)
local ptdata = export.placetype_data[placetype]
if ptdata then
return placetype, ptdata, "direct"
end
if from_category then
ptdata = export.placetype_data[placetype .. "!"]
if ptdata then
return placetype .. "!", ptdata, "direct-category"
end
end
local sg_placetype = export.maybe_singularize_placetype(placetype)
if sg_placetype then
ptdata = export.placetype_data[sg_placetype]
if ptdata then
return sg_placetype, ptdata, "plural"
end
end
return nil
end
--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end
function export.resolve_placetype_aliases(placetype)
return export.placetype_aliases[placetype] or placetype
end
--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
-- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in
-- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice.
placetype = export.resolve_placetype_aliases(placetype)
if export.placetype_data[placetype] then
return export.placetype_data[placetype][key]
else
return nil
end
end
--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from
left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization
of qualifiers does not happen if `no_canon_qualifiers` is specified.
For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
{nil, nil, "small beachside unincorporated community"},
{nil, "small", "beachside unincorporated community"},
{"small", "[[beachfront]]", "unincorporated community"},
{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
On the other hand, if given `"small former haunted community"`, the return value will be
{ {
{nil, nil, "small former haunted community"},
{nil, "small", "former haunted community"},
{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.
Finally, if given `"former adr"`, the return value will be
{ {
{nil, nil, "former adr"},
{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
local prev_qualifier = nil
while true do
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if canon == nil then
break
end
local new_qualifier = qualifier
if type(canon) == "table" then
canon = canon.link
end
if not no_canon_qualifiers and canon ~= false then
if canon == true then
new_qualifier = "[[" .. qualifier .. "]]"
else
new_qualifier = canon
end
end
insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
placetype = reduced_placetype
else
break
end
end
return splits
end
--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the
placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list
of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e. we successively chop off
qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words
not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used
to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is
an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first
entry if it exists in `placetype_data`.
'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c)
"type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of
directives, which act somewhat similarly to `former`, and allows interaction between more than one of these
simultaneously (e.g. official names of former places, which have their own categorization).
If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be
getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of
`iterate_matching_holonym_location()`.
For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="แม่น้ำ"}
```
and the return value will be
{ {
{qualifier = "left", placetype="tributary"},
{qualifier = "left", placetype="แม่น้ำ"},
}}
The algorithm first checks for the full placetype `left tributary` as a recognized placetype in `placetype_data` and
doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it, followed directly by
any fallbacks). It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which
is entered into the list because it is found in `placetype_data`. Then, because `tributary` has a fallback `แม่น้ำ`
("river"), which exists in `placetype_data`, the fallback is entered next.
Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or
municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
{qualifier = "small [[rural]]", placetype="frazione"},
{qualifier = "small [[rural]]", placetype="hamlet"},
{qualifier = "small [[rural]]", placetype="village"},
}}
Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers,
they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around
`rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both
fallbacks end up being returned.
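The fallback chain in this example can be sketched in isolation as follows. This is only an illustration under stated
assumptions: the table `toy_placetype_data` and the function `collect_fallbacks` are hypothetical stand-ins that are
not part of this module, and the real code additionally handles plurals, aliases, qualifiers and form-of prefixes.

```lua
-- Hypothetical miniature of `placetype_data`; only the `fallback` field is modeled.
local toy_placetype_data = {
	frazione = {fallback = "hamlet"},
	hamlet = {fallback = "village"},
	village = {},
}

-- Return `placetype` followed by its chain of fallbacks, each paired with `qualifier`.
-- Unrecognized placetypes yield an empty list.
local function collect_fallbacks(qualifier, placetype)
	local equivs = {}
	local steps = 0
	while placetype and toy_placetype_data[placetype] do
		table.insert(equivs, {qualifier = qualifier, placetype = placetype})
		steps = steps + 1
		if steps > 10 then
			-- Same idea as the loop guard in the real function.
			error("Apparent loop in fallback chain")
		end
		placetype = toy_placetype_data[placetype].fallback
	end
	return equivs
end

local equivs = collect_fallbacks("small [[rural]]", "frazione")
```

Here `equivs` holds three entries, for `frazione`, `hamlet` and `village` in that order, all with the same qualifier.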
`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the
placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are
returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example,
`"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When
`no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"`
with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under
[[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].)
As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`,
because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes
rarely occur with exact match category specs anyway.
`no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an
equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in
[[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't
have qualifiers and so it doesn't make sense to try and look for them.
`from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked.
`form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked
placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the
appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a
placetype, e.g. `FORMER_NAME_OF place`, if nothing else matches.
`no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`.
`register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g.
known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It
causes the non-former version of the specified placetype to be included in the returned equivalents along with the
former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now;
fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.]
]==]
function export.get_placetype_equivs(placetype, props)
local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former
local form_of_directive
if props then
no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former =
props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category,
props.register_former_as_non_former
form_of_directive = props.form_of_directive
end
local equivs = {}
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is
-- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If
-- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by
-- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version
-- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as
-- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into
-- `equivs` the prefixed placetype. This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}},
-- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up
-- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]].
local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix)
local function insert_equiv(pt)
if form_of_prefix then
-- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have
-- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end
-- up processing because `island country` falls back to `country`), and that entry in turn is defined
-- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of
-- handling this is by calling ourselves recursively.
insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt)
else
insert(equivs, {qualifier=qualifier, placetype=pt})
end
end
-- Insert the placetype, along with any fallbacks.
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if ptdata then
insert_equiv(canon_placetype)
if no_fallback then
return
end
local first_placetype = #equivs + 1
local prev_placetype = nil
while true do
local pt_value = export.placetype_data[canon_placetype]
if not pt_value then
internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
canon_placetype, prev_placetype)
end
if pt_value.fallback then
insert_equiv(pt_value.fallback)
local last_placetype = #equivs
if last_placetype - first_placetype >= 10 then
local fallback_loop = {}
for i = first_placetype, last_placetype do
insert(fallback_loop, equivs[i].placetype)
end
internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
end
prev_placetype = canon_placetype
canon_placetype = pt_value.fallback
else
break
end
end
end
end
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
-- than in each invocation of `process_and_insert_placetype()`.
local function process_and_insert_placetype(qualifier, reduced_placetype)
if form_of_directive then
-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. All of this is handled by
-- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of
-- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for
-- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.)
insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive)
if not no_fallback then
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype)
local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or
export.get_placetype_prop(pt, "class") end
)
if not directive_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " ..
'located but directive-specific type property %s missing, and so is "class"; ' ..
"placetypes searched are %s", reduced_placetype, form_of_directive,
form_of_directive .. "_type", reduced_placetype_equivs)
else
-- This should be allowed, as we allow unrecognized placetypes in general.
end
elseif directive_type ~= "!" then
insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive)
end
end
else
insert_placetype_and_fallbacks(qualifier, reduced_placetype)
end
end
-- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left
-- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers).
local splits
if no_split_qualifiers then
splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
else
splits = export.split_qualifiers_from_placetype(placetype)
end
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3)
-- If a special "former" qualifier like `former` or `historical` isn't present, and
-- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for
-- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing
-- placetypes, and handle accordingly.
local unlinked_this_qualifier
if this_qualifier and this_qualifier:find("%[") then
unlinked_this_qualifier = export.remove_links_and_html(this_qualifier)
else
unlinked_this_qualifier = this_qualifier
end
local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil
if not former_qualifiers and not no_check_for_inherently_former then
former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype,
function(pt) return export.get_placetype_prop(pt, "inherently_former") end,
{no_check_for_inherently_former = true})
end
-- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal
-- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
-- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped
-- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval`
-- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes
-- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. If the placetype following the
-- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a
-- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like
-- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and
-- don't enter anything into `equivs`.
if former_qualifiers then
-- FIXME: Should we respect `no_fallback` here? My instinct says no.
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, {
no_check_for_inherently_former = true
})
local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, "former_type") or
export.get_placetype_prop(pt, "class") end
)
if not former_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s, placetype data located but `former_type` missing; " ..
"placetypes searched are %s", reduced_placetype, reduced_placetype_equivs)
else
-- Enable error when we've verified there aren't any examples.
track("bad-former-placetype")
track("bad-former-placetype/" .. reduced_placetype)
--process_error("For placetype '%s', unrecognized placetype following 'former'-type " ..
-- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs))
end
elseif former_type ~= "!" then
-- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible
-- for (e.g.) former provinces of the Roman empire to be categorized specially.
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype)
end
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. former_type)
end
-- HACK! See explanation above for `register_former_as_non_former`.
if register_former_as_non_former then
process_and_insert_placetype(prev_qualifier, reduced_placetype)
end
-- If we're processing a form-of directive, after doing everything else we do
-- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup.
if form_of_directive and not no_fallback then
for _, former_qualifier in ipairs(former_qualifiers) do
insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier ..
" place")
end
end
-- Don't continue processing equivs. The reason is probably the same as the `break` below for
-- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and
-- non-former equivs will otherwise take precedence.
break
end
end
-- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs
-- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping.
if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then
insert(equivs, {
qualifier=prev_qualifier,
placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier]
})
-- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the
-- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the
-- latter ends up generating the category because the category for 'mythological location' is set as
-- the default value, which is used only when no non-default category can be found.
break
end
-- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined
-- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype.
-- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts
-- the full placetype into `equivs`.
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
process_and_insert_placetype(qualifier, reduced_placetype)
-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
if no_fallback then
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
if canon_placetype then
break
end
end
end
-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
-- [[Category:en:Former names of places]] in an invocation like
-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
end
return equivs
end
function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
for _, equiv in ipairs(equivs) do
local retval = fun(equiv.placetype)
if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
return retval, equiv
end
end
return nil, nil
end
--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a
non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the
equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or
non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil},
the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value.
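The difference between the two stopping rules can be seen in a small standalone sketch. The function `first_match`
below is a copy of the loop in `get_equiv_placetype_prop_from_equivs` made for illustration; the property table
`toy_prop` and the list of equivalents are hypothetical.

```lua
-- Standalone copy of the search loop, for illustration only.
local function first_match(equivs, fun, continue_on_nil_only)
	for _, equiv in ipairs(equivs) do
		local retval = fun(equiv.placetype)
		if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
			return retval, equiv
		end
	end
	return nil, nil
end

-- Hypothetical data: `city` explicitly sets the property to false, `settlement` sets it to true.
local toy_prop = {city = false, settlement = true}
local equivs = {{placetype = "city"}, {placetype = "settlement"}}
local function lookup(pt)
	return toy_prop[pt]
end

-- Default rule: `false` is skipped, so the search continues to `settlement`.
local default_val, default_equiv = first_match(equivs, lookup)
-- With `continue_on_nil_only`: the explicit `false` on `city` ends the search.
local strict_val, strict_equiv = first_match(equivs, lookup, true)
```

In other words, `continue_on_nil_only` lets an explicit `false` on a more specific placetype override a truthy value
on one of its fallbacks.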
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
if not placetype then
return fun(nil), nil
end
return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
props and props.continue_on_nil_only)
end
--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
`"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
the beginning) has an entry in `placetype_data` that specifies the indefinite article using
`entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, the standard algorithm in [[Module:en-utilities]] would generate `"an"` for words beginning with a vowel
and `"a"` otherwise; in this version of the module that call is commented out and the empty string is used instead.
If `ucfirst` is true, the first letter of the article is made upper-case.
]==]
function export.get_placetype_article(placetype, ucfirst)
local art
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if type(canon) == "table" then
art = canon.article
end
end
if art == false then
return art
end
if art == nil then
local placetype_use_the = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end)
if placetype_use_the then
art = "the"
else
art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article")
if not art then
art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] ""
end
end
end
if ucfirst then
art = m_strutils.ucfirst(art)
end
return art
end
--[==[
Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories
(e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified.
]==]
function export.get_placetype_entry_preposition(placetype)
local pt_prep = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "preposition") end
)
return pt_prep or "ใน"
end
--[==[
Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's
`holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding
to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's
`holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the
end of the value's list.
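The keying step can be sketched as follows. This is a simplified illustration: `key_placename` is a hypothetical
helper that handles one already-resolved placetype, whereas the real function first expands the holonym's placetype
into its equivalents (without fallbacks). The English key `country` is used here only for readability.

```lua
-- Hypothetical helper mirroring the keying step for a single placetype.
local function key_placename(place_desc, placetype, placename)
	place_desc.holonyms_by_placetype = place_desc.holonyms_by_placetype or {}
	local names = place_desc.holonyms_by_placetype[placetype]
	if names then
		-- A key for this placetype already exists: append to its list.
		table.insert(names, placename)
	else
		place_desc.holonyms_by_placetype[placetype] = {placename}
	end
end

local place_desc = {}
key_placename(place_desc, "country", "Italy")
key_placename(place_desc, "country", "France")
key_placename(place_desc, "city", "Rome")
```

After these calls, `place_desc.holonyms_by_placetype.country` is the list {"Italy", "France"} and
`place_desc.holonyms_by_placetype.city` is {"Rome"}.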
]==]
function export.key_holonym_into_place_desc(place_desc, holonym)
if not holonym.placetype then
return
end
-- Key in equivalent placetypes, so that e.g. `cities/San Francisco` gets keyed under `city`; but don't do
-- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms
-- of different types just because they have the same fallback.
local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true})
local unlinked_placename = holonym.unlinked_placename
for _, equiv in ipairs(equiv_placetypes) do
local placetype = equiv.placetype
if not place_desc.holonyms_by_placetype then
place_desc.holonyms_by_placetype = {}
end
if not place_desc.holonyms_by_placetype[placetype] then
place_desc.holonyms_by_placetype[placetype] = {unlinked_placename}
else
insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename)
end
end
end
--[=[
Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the
placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This
will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to
whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype
data structure for the placetype, and `from_category` indicates that we are generating the description of a category
(otherwise we are generating the display form of an entry placetype).
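As an illustration of the `"separately"` link spec, where each word is linked on its own, here is a standalone sketch.
The function `link_words_separately` is hypothetical and uses `gmatch` in place of the module's `split` helper; the
placetype `census town` is just sample input.

```lua
-- Hypothetical standalone version of the `separately` case.
local function link_words_separately(sg_placetype, orig_placetype)
	local function words(str)
		local result = {}
		for word in str:gmatch("[^ ]+") do
			table.insert(result, word)
		end
		return result
	end
	local sg_words = words(sg_placetype)
	local orig_words = words(orig_placetype or sg_placetype)
	if #sg_words ~= #orig_words then
		error("singular and original placetypes have different numbers of words")
	end
	for i = 1, #sg_words do
		if sg_words[i] == orig_words[i] then
			sg_words[i] = ("[[%s]]"):format(sg_words[i])
		else
			-- Link to the singular word but display the original (plural) word.
			sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
		end
	end
	return table.concat(sg_words, " ")
end

local singular = link_words_separately("census town")
local plural = link_words_separately("census town", "census towns")
```

Only the word that differs between the singular and the original form gets a piped link; unchanged words are linked
directly.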
]=]
local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror)
if not from_category and ptdata.disallow_in_entries then
if noerror then
return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]"
else
process_error("Placetype %s is not meant to be specified directly: " .. ptdata.disallow_in_entries, sg_placetype)
end
end
if link == nil then
internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype)
elseif link == true then
if orig_placetype then
return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
else
return ("[[%s]]"):format(sg_placetype)
end
elseif link == false then
process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype)
elseif link == "w" then
return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
elseif link == "separately" then
if orig_placetype then
local sg_words = split(sg_placetype, " ")
local orig_words = split(orig_placetype, " ")
if #sg_words ~= #orig_words then
internal_error("Can't construct 'separately' link for plural placetype %s as singular placetype %s " ..
"has different number of words", orig_placetype, sg_placetype)
else
for i = 1, #sg_words do
if sg_words[i] == orig_words[i] then
sg_words[i] = ("[[%s]]"):format(sg_words[i])
else
sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
end
end
return concat(sg_words, " ")
end
else
return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
end
elseif link:find("^%+") then
link = link:sub(2) -- discard initial +
return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype)
elseif not orig_placetype then
return link
else
return --[[require(en_utilities_module).pluralize(link)]] link
end
end
--[==[
Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the
plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying
as plural but linked to the singular form of the placetype). Otherwise, return nil. If we're generating the description
of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like
[[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or
`"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description
for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with
special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the
"full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is
prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be
disallowed.
]==]
function export.get_placetype_display_form(placetype, category_type, return_full, noerror)
local from_category = not not category_type
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if canon_placetype then
local raw_link
local function is_linked_string(str)
return type(str) == "string" and str:find("%[%[")
end
if category_type then
local fetched_full
local function fetch_maybe_full(prop)
local retval = ptdata["full_" .. prop]
if retval ~= nil then
if return_full then
return retval, true
else
internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval)
end
end
return ptdata[prop], false
end
local function maybe_prefix(str)
if return_full and not fetched_full then
return "names of " .. str
else
return str
end
end
-- Careful with `false` as possible value.
if category_type == "top-level" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
elseif category_type == "noncity" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
elseif category_type == "city" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
else
internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
category_type)
end
if type(raw_link) == "string" then
return maybe_prefix(raw_link), ptdata
elseif raw_link ~= nil then
return raw_link, ptdata
end
raw_link, fetched_full = fetch_maybe_full("category_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
if ptmatch == "plural" then
raw_link, fetched_full = fetch_maybe_full("plural_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
end
if raw_link == nil then
raw_link, fetched_full = fetch_maybe_full("link")
end
if raw_link == false then
return raw_link, ptdata
end
return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
else
if ptmatch == "plural" then
raw_link = ptdata.plural_link
if raw_link == false then
process_error("Placetype %s cannot appear plural", placetype)
end
if is_linked_string(raw_link) then
return raw_link, ptdata
end
end
if raw_link == nil then
raw_link = ptdata.link
end
return make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata
end
end
return nil
end
local function resolve_unlinked_placename_display_aliases(placetype, placename)
local equiv_placetypes = export.get_placetype_equivs(placetype)
for i, equiv in ipairs(equiv_placetypes) do
equiv_placetypes[i] = equiv.placetype
end
local all_display_aliases_found = {}
local all_others_found = {}
for group, key, spec in m_locations.iterate_matching_location {
placetypes = equiv_placetypes,
placename = placename,
alias_resolution = "display",
} do
if spec.alias_of and spec.display then
insert(all_display_aliases_found, {group, key, spec, spec.display_as_full})
else
insert(all_others_found, {group, key, spec})
end
end
if not all_display_aliases_found[1] then
return placename
elseif all_display_aliases_found[2] then
internal_error("Found multiple matching display aliases for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
elseif all_others_found[1] then
internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
else
local group, key, spec, as_full = unpack(all_display_aliases_found[1])
local full, elliptical = m_locations.key_to_placename(group, key)
return as_full and full or elliptical
end
end
--[==[
If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged.
Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`,
`country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as
`United States`.
'''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they
should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed.
For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to
`Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political
connotations. Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!)
to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two
terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to
`North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly
display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve
alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and
consistency.
]==]
function export.resolve_placename_display_aliases(placetype, placename)
-- If the placename is a link, apply the alias inside the link.
-- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will
-- be empty.
local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
if link then
if linktext ~= "" then
local alias = resolve_unlinked_placename_display_aliases(placetype, linktext)
return "[[" .. link .. "|" .. alias .. "]]"
else
local alias = resolve_unlinked_placename_display_aliases(placetype, link)
return "[[" .. alias .. "]]"
end
else
return resolve_unlinked_placename_display_aliases(placetype, placename)
end
end
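-- Illustrative sketch only (assumptions: `rmatch` behaves like `string.match` for these ASCII inputs, and `US`/`USA`
-- are display aliases of `United States` as described in the comment above; the placetype is written `country` as in
-- that comment, though the localized data may expect a different placetype string):
--   "[[US]]"          --> link = "US", linktext = ""; the alias is applied to the link itself,
--                         which would give "[[United States]]"
--   "[[America|USA]]" --> link = "America", linktext = "USA"; the alias is applied only to the displayed text,
--                         which would give "[[America|United States]]"
--   "USA"             --> no link matched; the whole string is resolved, which would give "United States"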
--[==[
Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key.
]==]
function export.get_prefixed_key(key, spec)
if spec.the then
return "the " .. key
else
return key
end
end
-- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary.
export.iterate_matching_location = m_locations.iterate_matching_location
--[=[
Iterator that iterates over holonyms in `place_desc`. If `first_holonym_index` is given, start iterating at the
specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If
`first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is
specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified
by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the
holonym index and holonym structure, similar to `ipairs()`.
]=]
function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
local stop_at_also = not not first_holonym_index
return function(place_desc, index)
while true do
index = index + 1
local this_holonym = place_desc.holonyms[index]
-- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also`
-- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym
-- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with
-- `:also`.
if not this_holonym or (stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop) then
return nil
end
-- If not placetype, we're processing raw text, which we normally want to skip.
if include_raw_text_holonyms or this_holonym.placetype then
return index, this_holonym
end
end
end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end
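-- Usage sketch, assuming a hypothetical `place_desc` whose `holonyms` list has four entries, all with `placetype` set,
-- and whose third entry has `continue_cat_loop` set (i.e. it was marked with `:also`):
--   for index, holonym in export.get_holonyms_to_check(place_desc, 1) do ... end  -- visits holonyms 1 and 2
--   for index, holonym in export.get_holonyms_to_check(place_desc, 3) do ... end  -- visits holonyms 3 and 4
--   for index, holonym in export.get_holonyms_to_check(place_desc) do ... end     -- visits holonyms 1 through 4
-- The second call does not stop immediately because the `:also` check applies only to indices strictly greater than
-- `first_holonym_index`.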
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all
such known locations, returning for each location the corresponding group, key and spec as well as the trail of
ancestral containers. Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between
the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data`
are:
* `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with
`iterate_matching_location()`.
* `holonym_placename`: The placename of the holonym.
* `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the
holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms
following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none
exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.)
* `place_desc`: Description of the place; used for the holonyms, to check for container mismatches.
Returns four values: the location group, the canonical key by which the location is known, the spec object describing
the location and the trail of ancestral containers for the location. The first three values are the same as for
`iterate_matching_location`.
]==]
function export.iterate_matching_holonym_location(data)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
local matching_location_iterator = m_locations.iterate_matching_location {
placetypes = holonym_placetype,
placename = holonym_placename,
}
return function()
while true do
local group, key, spec = matching_location_iterator()
if not group then
return nil
end
local container_trail = {}
-- For each level of container, check that there are no mismatches (i.e. another location of the same
-- placetype being mentioned). We allow a mismatch at a given level if there's also a match with the container
-- at that level. For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city
-- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of
-- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark,
-- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New
-- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough,
-- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If
-- there are no mismatches at any level we assume we're dealing with the right known location.
--
-- If at a given level there are multiple containing locations, we count a match if any holonym matches any
-- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any
-- containing location.
local containers_mismatch = false
for containers in m_locations.iterate_containers(group, key, spec) do
insert(container_trail, containers)
local match_at_level = false
local mismatch_at_level = false
for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc,
holonym_index and holonym_index + 1 or nil) do
local other_source_holonym = other_holonym.augmented_from_holonym
if other_source_holonym and other_source_holonym.placetype == holonym_placetype and
other_source_holonym.unlinked_placename ~= holonym_placename then
-- Ignore holonyms added during the augmentation process for other holonyms of the same
-- placetype as the placetype of the holonym we're considering. See comment in
-- augment_holonyms_with_container() for why we do this.
-- continue; grrr, no 'continue' in Lua
else
local holonym_matches_at_level = false
local holonym_exists_with_same_placetype = false
for _, container in ipairs(containers) do
if not container.spec.no_check_holonym_mismatch then
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
local placetypes = container.spec.placetype
if type(placetypes) ~= "table" then
placetypes = {placetypes}
end
local placetype_equivs = {}
for _, pt in ipairs(placetypes) do
m_table.extend(placetype_equivs, export.get_placetype_equivs(pt))
end
local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype and
(other_holonym.unlinked_placename == full_container_placename or
other_holonym.unlinked_placename == elliptical_container_placename)
end
)
if this_holonym_matches then
holonym_matches_at_level = true
break
end
local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype
end
)
if this_holonym_exists_with_same_placetype then
-- We seem to have a mismatch at this level. But before we decide conclusively that this
-- is the case, check to see whether the putative mismatch is an alias and matches when
-- we resolve the alias.
for oh_group, oh_key, oh_spec, oh_container_trail in
export.iterate_matching_holonym_location {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = place_desc,
} do
local oh_full_placename, oh_elliptical_placename =
m_locations.key_to_placename(oh_group, oh_key)
if oh_full_placename == full_container_placename or
oh_elliptical_placename == elliptical_container_placename then
-- Alias matched when resolved.
this_holonym_matches = true
break
end
end
if this_holonym_matches then
-- Alias matched above when resolved.
holonym_matches_at_level = true
break
else
-- Not an alias, or doesn't match when resolved. We have a true mismatch.
holonym_exists_with_same_placetype = true
end
end
end
end
if holonym_matches_at_level then
match_at_level = true
break
end
if holonym_exists_with_same_placetype then
mismatch_at_level = true
end
end
end
if not match_at_level and mismatch_at_level then
containers_mismatch = true
break
end
end
if not containers_mismatch then
return group, key, spec, container_trail
end
end
end
end
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the
corresponding group, key and spec as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
local all_found = {}
for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
insert(all_found, {group, key, spec, container_trail})
end
if not all_found[1] then
return nil
elseif all_found[2] then
local holonym_placetype = data.holonym_placetype
if type(holonym_placetype) == "table" then
holonym_placetype = concat(holonym_placetype, ",")
end
local found_keys = {}
for _, found in ipairs(all_found) do
local _, key, _, _ = unpack(found)
insert(found_keys, key)
end
error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys)))
else
return unpack(all_found[1])
end
end
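-- Usage sketch with hypothetical field values (whether a given name is a known location depends on the data in
-- [[Module:place/locations]], and the placetype string may differ in localized data):
--   local group, key, spec, container_trail = export.find_matching_holonym_location {
--     holonym_placetype = "city",
--     holonym_placename = "Newcastle",
--     holonym_index = 1,        -- position of this holonym in place_desc.holonyms
--     place_desc = place_desc,  -- following holonyms (e.g. s/New South Wales) are used to disambiguate
--   }
--   if group then ... end       -- `group` is nil when no known location matches
-- An error is thrown only when two or more known locations survive the container-mismatch check.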
------------------------------------------------------------------------------------------
-- Placename and placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their
canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which
applies to categorization and other processes but not to display.
The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ",
"จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g.
"census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype.
Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be
interpreted as "department", "อำเภอ" or "division").
]==]
export.placetype_aliases = {
["acomm"] = "autonomous community",
["adr"] = "administrative region",
["adterr"] = "administrative territory", -- Pakistan
["aobl"] = "autonomous oblast",
["aokr"] = "autonomous okrug",
["ap"] = "autonomous province",
["apref"] = "autonomous prefecture",
["aprov"] = "autonomous province",
["ar"] = "autonomous region",
["arch"] = "archipelago",
["arep"] = "autonomous republic",
["aterr"] = "autonomous territory",
["atu"] = "autonomous territorial unit",
["bor"] = "borough",
["c"] = "ประเทศ",
["can"] = "canton",
["carea"] = "council area",
["cc"] = "constituent country",
["cdblock"] = "community development block",
["cdep"] = "Crown dependency",
["CDP"] = "census-designated place",
["cdp"] = "census-designated place",
["clcity"] = "county-level city",
["co"] = "เทศมณฑล",
["cobor"] = "county borough",
["colcity"] = "county-level city",
["coll"] = "collectivity",
["comm"] = "community",
["cont"] = "ทวีป",
["contr"] = "continental region",
["contregion"] = "continental region",
["cpar"] = "civil parish",
["damun"] = "direct-administered municipality",
["dep"] = "dependency",
["department capital"] = "departmental capital",
["dept"] = "department",
["depterr"] = "dependent territory",
["dist"] = "อำเภอ",
["distmun"] = "district municipality",
["div"] = "division",
["emp"] = "จักรวรรดิ",
["fpref"] = "French prefecture",
["gov"] = "governorate",
["govnat"] = "governorate",
["home-rule city"] = "home rule city",
["home-rule municipality"] = "home rule municipality",
["inner-city area"] = "inner city area",
["ires"] = "Indian reservation",
["isl"] = "เกาะ",
["lbor"] = "London borough",
["lga"] = "local government area",
["lgarea"] = "local government area",
["lgd"] = "local government district",
["lgdist"] = "local government district",
["metbor"] = "metropolitan borough",
["metcity"] = "metropolitan city",
["metmun"] = "metropolitan municipality",
["mtn"] = "ภูเขา",
["mun"] = "เทศบาล",
["mundist"] = "municipal district",
["nonmetropolitan county"] = "non-metropolitan county",
["obl"] = "oblast",
["okr"] = "okrug",
["p"] = "จังหวัด",
["par"] = "parish",
["parmun"] = "parish municipality",
["pen"] = "peninsula",
["plcity"] = "prefecture-level city",
["plcolony"] = "Polish colony",
["pref"] = "prefecture",
["prefcity"] = "prefecture-level city",
["preflcity"] = "prefecture-level city",
["prov"] = "จังหวัด",
["r"] = "ภูมิภาค",
["range"] = "เทือกเขา",
["rcm"] = "regional county municipality",
["rcomun"] = "regional county municipality",
["rdist"] = "regional district",
["rep"] = "republic",
["rhrom"] = "rural hromada",
["riv"] = "แม่น้ำ",
["rmun"] = "regional municipality",
["robor"] = "royal borough",
["romp"] = "Roman province",
["runit"] = "regional unit",
["rurmun"] = "rural municipality",
["s"] = "รัฐ",
["sar"] = "special administrative region",
["shrom"] = "settlement hromada",
["spref"] = "subprefecture",
["sprefcity"] = "sub-prefectural city",
["sprovcity"] = "subprovincial city",
["submet city"] = "sub-metropolitan city",
["submetropolitan city"] = "sub-metropolitan city",
["sub-prefecture-level city"] = "sub-prefectural city",
["sub-provincial city"] = "subprovincial city",
["sub-provincial district"] = "subprovincial district",
["terr"] = "ดินแดน",
["terrauth"] = "territorial authority",
["twp"] = "township",
["twpmun"] = "township municipality",
["uauth"] = "unitary authority",
["ucomm"] = "unincorporated community",
["udist"] = "unitary district",
["uhrom"] = "urban hromada",
["uterr"] = "union territory",
["utwpmun"] = "united township municipality",
["val"] = "valley",
["vdc"] = "village development committee",
["vil"] = "village",
["voi"] = "voivodeship",
["wcomm"] = "Welsh community",
}
local no_link_def_article = {link = false, article = "the"}
local no_link_no_article = {link = false, article = false}
--[==[ var:
These qualifiers can be prepended onto any placetype and will be handled correctly. For example, the placetype
`large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the
value in the following table is a string, the qualifier will display according to the string. If the value is `true`,
the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be
linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain
those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating
`inland sea` as equivalent to `sea`.
]==]
export.placetype_qualifiers = {
-- generic qualifiers
["huge"] = false,
["tiny"] = false,
["large"] = false,
["big"] = false,
["mid-size"] = false,
["mid-sized"] = false,
["small"] = false,
["sizable"] = false,
["important"] = false,
["long"] = false,
["short"] = false,
["major"] = false,
["minor"] = false,
["high"] = false,
["tall"] = false,
["low"] = false,
["left"] = false, -- left tributary
["right"] = false, -- right tributary
["modern"] = false, -- for use in opposition to "ancient" in another definition
-- "former" qualifiers
["abandoned"] = true,
["ancient"] = true,
["deserted"] = true,
["extinct"] = true,
["former"] = false,
["historic"] = "historical",
["historical"] = true,
["medieval"] = true,
["mediaeval"] = true,
["ruined"] = true,
["traditional"] = true,
-- sea qualifiers
["coastal"] = true,
["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]]
["maritime"] = true,
["overseas"] = true,
["seaside"] = true,
["beachfront"] = true,
["beachside"] = true,
["riverside"] = true,
-- lake qualifiers
["freshwater"] = true,
["saltwater"] = true,
["endorheic"] = true,
["oxbow"] = true,
["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link
["tidal"] = true,
-- land qualifiers
["hilltop"] = true,
["hilly"] = true,
["insular"] = true,
["peninsular"] = true,
["chalk"] = true,
["karst"] = true,
["limestone"] = true,
["mountainous"] = true,
["mountaintop"] = true,
["alpine"] = true,
["volcanic"] = true, -- for an island
-- political status qualifiers
["autonomous"] = true,
["incorporated"] = true,
["special"] = true,
["unincorporated"] = true,
["coterminous"] = true,
-- monetary status/etc. qualifiers
["fashionable"] = true,
["wealthy"] = true,
["affluent"] = true,
["declining"] = true,
-- city vs. rural qualifiers
["urban"] = true,
["suburban"] = true,
["exurban"] = true,
["outlying"] = true,
["remote"] = true,
["rural"] = true,
["outback"] = true,
["inner"] = false,
["inner-city"] = true,
["central"] = false,
["outer"] = false,
-- land use qualifiers
["residential"] = true,
["agricultural"] = true,
["business"] = true,
["commercial"] = true,
["industrial"] = true,
-- business use qualifiers
["railroad"] = true,
["railway"] = true,
["farming"] = true,
["fishing"] = true,
["mining"] = true,
["logging"] = true,
["cattle"] = true,
-- tourism use qualifiers
["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
-- religious qualifiers
["holy"] = true,
["sacred"] = true,
["religious"] = true,
["secular"] = true,
-- qualifiers for nonexistent places
["claimed"] = false,
["fictional"] = true,
["legendary"] = true,
["mythical"] = true,
["mythological"] = true,
-- directional qualifiers
["northern"] = false,
["southern"] = false,
["eastern"] = false,
["western"] = false,
["north"] = false,
["south"] = false,
["east"] = false,
["west"] = false,
["northeastern"] = false,
["southeastern"] = false,
["northwestern"] = false,
["southwestern"] = false,
["northeast"] = false,
["southeast"] = false,
["northwest"] = false,
["southwest"] = false,
-- seasonal qualifiers
["summer"] = true, -- e.g. for 'summer capital'
["winter"] = true,
-- legal status qualifiers
-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
["official"] = true,
["unofficial"] = true,
["de facto"] = true, -- 'de facto capital'
["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
["de jure"] = true, -- 'de jure capital'
["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
-- misc. qualifiers
["planned"] = true,
["chartered"] = true,
["landlocked"] = true,
["uninhabited"] = true,
-- superlative qualifiers
["first"] = no_link_def_article,
["second"] = no_link_def_article, -- for "second largest" etc.
["third"] = no_link_def_article,
["fourth"] = no_link_def_article,
["last"] = no_link_def_article,
["only"] = no_link_def_article,
["sole"] = no_link_def_article,
["main"] = no_link_def_article,
["largest"] = no_link_def_article,
["biggest"] = no_link_def_article,
["smallest"] = no_link_def_article,
["shortest"] = no_link_def_article,
["longest"] = no_link_def_article,
["tallest"] = no_link_def_article,
["highest"] = no_link_def_article,
["lowest"] = no_link_def_article,
["leftmost"] = no_link_def_article,
["rightmost"] = no_link_def_article,
["innermost"] = no_link_def_article,
["outermost"] = no_link_def_article,
["northernmost"] = no_link_def_article,
["southernmost"] = no_link_def_article,
["westernmost"] = no_link_def_article,
["easternmost"] = no_link_def_article,
["northwesternmost"] = no_link_def_article,
["southwesternmost"] = no_link_def_article,
["northeasternmost"] = no_link_def_article,
["southeasternmost"] = no_link_def_article,
-- several/various
["several"] = no_link_no_article,
["various"] = no_link_no_article,
["numerous"] = no_link_no_article,
["multiple"] = no_link_no_article,
["many"] = no_link_no_article,
["other"] = no_link_no_article,
}
--[==[ var:
In this table, the key qualifiers should be treated the same as the value qualifiers for categorization purposes. This
is overridden by `placetype_data` and `qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
["abandoned"] = {"FORMER"},
["ancient"] = {"ANCIENT", "FORMER"},
["former"] = {"FORMER"},
["extinct"] = {"FORMER"},
["historic"] = {"FORMER"},
["historical"] = {"FORMER"},
["medieval"] = {"ANCIENT", "FORMER"},
["mediaeval"] = {"ANCIENT", "FORMER"},
["ruined"] = {"ANCIENT", "FORMER"},
["traditional"] = {"FORMER"},
}
--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
["fictional"] = "fictional location",
["legendary"] = "mythological location",
["mythical"] = "mythological location",
["mythological"] = "mythological location",
-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
["claimed"] = "claimed political division",
}
--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
["autonomous community"] = "autonomous community capitals",
["canton"] = "cantonal capitals",
["comarca"] = "comarca capitals",
["ประเทศ"] = "national capitals",
-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "county capitals",
["department"] = "departmental capitals",
["อำเภอ"] = "district capitals",
["division"] = "division capitals",
["emirate"] = "emirate capitals",
["governorate"] = "governorate capitals",
["hromada"] = "hromada capitals",
["krai"] = "krai capitals",
["metropolitan city"] = "metropolitan city capitals",
["เทศบาล"] = "municipal capitals",
["oblast"] = "oblast capitals",
["okrug"] = "okrug capitals",
["prefecture"] = "prefectural capitals",
["จังหวัด"] = "provincial capitals",
["raion"] = "raion capitals",
["regency"] = "regency capitals",
["ภูมิภาค"] = "regional capitals",
["regional unit"] = "regional unit capitals",
["republic"] = "republic capitals",
["รัฐ"] = "state capitals",
["ดินแดน"] = "territorial capitals",
["voivodeship"] = "voivodeship capitals",
}
--[==[ var:
This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple
ways that placenames can come to be preceded by "the":
# Listed here.
# Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code
just below the map.
# The placetype of the placename has `holonym_use_the = true` in its placetype_data.
# A regex in placename_the_re matches the placename.
Note that "the" is added only before the first holonym in a place description.
]==]
export.placename_article = {
-- This should only contain info that can't be inferred from [[Module:place/locations]].
["archipelago"] = {
["Cyclades"] = "the",
["Dodecanese"] = "the",
},
["ประเทศ"] = {
["Holy Roman Empire"] = "the",
},
["จักรวรรดิ"] = {
["Holy Roman Empire"] = "the",
},
["เกาะ"] = {
["North Island"] = "the",
["South Island"] = "the",
},
["ภูมิภาค"] = {
["Balkans"] = "the",
["Russian Far East"] = "the",
["Caribbean"] = "the",
["Caucasus"] = "the",
["Middle East"] = "the",
["New Territories"] = "the",
["North Caucasus"] = "the",
["South Caucasus"] = "the",
["West Bank"] = "the",
["Gaza Strip"] = "the",
},
["valley"] = {
["San Fernando Valley"] = "the",
},
}
--[==[ var:
Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all
holonyms, otherwise only the regexes for the holonym's placetype apply.
]==]
export.placename_the_re = {
-- We don't need entries for peninsulas, seas, oceans, gulfs or rivers
-- because they have holonym_use_the = true.
["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "},
["bay"] = {"^Bay of "},
["ทะเลสาบ"] = {"^Lake of "},
["ประเทศ"] = {"^Republic of ", " Republic$"},
["republic"] = {"^Republic of ", " Republic$"},
["ภูมิภาค"] = {" [Rr]egion$"},
["แม่น้ำ"] = {" River$"},
["local government area"] = {"^Shire of "},
["เทศมณฑล"] = {"^Shire of "},
["Indian reservation"] = {" Reservation", " Nation"},
["tribal jurisdictional area"] = {" Reservation", " Nation"},
}
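-- Minimal sketch of how the table above could be consulted; this is an assumption for illustration, not the code that
-- actually applies it. `placename_needs_the` is a hypothetical helper name, and plain `string.find` is used here in
-- place of whatever Unicode-aware matcher the module uses.
--   local function placename_needs_the(placetype, placename)
--     for _, key in ipairs { "*", placetype } do
--       for _, re in ipairs(export.placename_the_re[key] or {}) do
--         if placename:find(re) then
--           return true
--         end
--       end
--     end
--     return false
--   end
-- Under these assumptions, ("bay", "Bay of Bengal") matches "^Bay of " and ("แม่น้ำ", "Hudson River") matches
-- " River$", whereas ("ประเทศ", "France") matches neither the "*" patterns nor the placetype-specific ones.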
--[==[ var:
If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of
holonyms for categorization (but not display) purposes.
]==]
export.cat_implications = {
["ภูมิภาค"] = {
["Eastern Europe"] = {"continent/Europe"},
["Central Europe"] = {"continent/Europe"},
["Western Europe"] = {"continent/Europe"},
["South Europe"] = {"continent/Europe"},
["Southern Europe"] = {"continent/Europe"},
["Northern Europe"] = {"continent/Europe"},
["Northeast Europe"] = {"continent/Europe"},
["Northeastern Europe"] = {"continent/Europe"},
["Southeast Europe"] = {"continent/Europe"},
["Southeastern Europe"] = {"continent/Europe"},
["North Caucasus"] = {"continent/Europe"},
["South Caucasus"] = {"continent/Asia"},
["South Asia"] = {"continent/Asia"},
["Southern Asia"] = {"continent/Asia"},
["East Asia"] = {"continent/Asia"},
["Eastern Asia"] = {"continent/Asia"},
["Central Asia"] = {"continent/Asia"},
["West Asia"] = {"continent/Asia"},
["Western Asia"] = {"continent/Asia"},
["Southeast Asia"] = {"continent/Asia"},
["North Asia"] = {"continent/Asia"},
["Northern Asia"] = {"continent/Asia"},
["Anatolia"] = {"continent/Asia"},
["Asia Minor"] = {"continent/Asia"},
["Mesopotamia"] = {"continent/Asia"},
["North Africa"] = {"continent/Africa"},
["Central Africa"] = {"continent/Africa"},
["West Africa"] = {"continent/Africa"},
["East Africa"] = {"continent/Africa"},
["Southern Africa"] = {"continent/Africa"},
["Central America"] = {"continent/Central America"},
["Caribbean"] = {"continent/North America"},
["Polynesia"] = {"continent/Oceania"},
["Micronesia"] = {"continent/Oceania"},
["Melanesia"] = {"continent/Oceania"},
["Siberia"] = {"country/Russia", "continent/Asia"},
["Russian Far East"] = {"country/Russia", "continent/Asia"},
["South Wales"] = {"constituent country/Wales", "continent/Europe"},
["Balkans"] = {"continent/Europe"},
["West Bank"] = {"country/Palestine", "continent/Asia"},
["Gaza"] = {"country/Palestine", "continent/Asia"},
["Gaza Strip"] = {"country/Palestine", "continent/Asia"},
}
}
------------------------------------------------------------------------------------------
-- Category and display handlers --
------------------------------------------------------------------------------------------
local function city_type_cat_handler(data)
local entry_placetype = data.entry_placetype
local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities")
if not generic_before_non_cities then
internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" ..
" setting", entry_placetype)
end
local plural_entry_placetype = export.pluralize_placetype(entry_placetype)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and not spec.is_city then
-- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both
-- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.)
local cap_plural_entry_placetype = ucfirst(plural_entry_placetype)
local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th
if container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th
end
end
return retcats
end
end
local function capital_city_cat_handler(data, non_city)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
-- The first time we're called we want to return something; otherwise we will be called for later-mentioned
-- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in
-- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital
-- category/categories we add below.
local retcats
if not non_city and place_desc.holonyms then
for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do
local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename
retcats = city_type_cat_handler {
entry_placetype = "นคร",
holonym_placetype = h_placetype,
holonym_placename = h_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if retcats then
break
end
end
end
if not retcats then
retcats = {}
end
-- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we
-- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State
-- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory'
-- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's
-- an entry for 'autonomous community').
local capital_cat = export.placetype_to_capital_cat[holonym_placetype]
if not capital_cat then
capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")]
end
if capital_cat then
capital_cat = ucfirst(capital_cat)
local inserted_specific_variant_cat = false
if holonym_index then
-- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern
-- where we use :also to specify that a given city is the capital at multiple surrounding levels.
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index
for h_index = holonym_index, #place_desc.holonyms do
if place_desc.holonyms[h_index].placetype then
matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location {
holonym_placetype = place_desc.holonyms[h_index].placetype,
holonym_placename = place_desc.holonyms[h_index].unlinked_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if matching_group then
matching_holonym_index = h_index
break
end
end
end
if matching_holonym_index == holonym_index then
if matching_container_trail[1] and not matching_spec.no_container_cat then
for _, container in ipairs(matching_container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
elseif matching_holonym_index then
-- Check to make sure that the holonym placetype we were called on is listed among the
-- divtypes of the location we found.
local function insert_specific_variant_if_possible(key, spec)
return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
local plural_holonym_placetype = export.pluralize_placetype(pt)
local saw_matching_div
if spec.divs then
local divs = spec.divs
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) ~= "table" then
div = {type = div}
end
if plural_holonym_placetype == div.type then
saw_matching_div = true
break
end
end
end
if saw_matching_div then
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
return true
end
return false
end)
end
if insert_specific_variant_if_possible(matching_key, matching_spec) then
inserted_specific_variant_cat = true
elseif not matching_spec.no_container_cat then
for _, containers in ipairs(matching_container_trail) do
local saw_no_container_cat = false
for _, container in ipairs(containers) do
if insert_specific_variant_if_possible(container.key, container.spec) then
inserted_specific_variant_cat = true
break
end
saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
end
if inserted_specific_variant_cat or saw_no_container_cat then
break
end
end
end
end
else
-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
end
if not inserted_specific_variant_cat then
insert(retcats, capital_cat)
end
else
-- We didn't recognize the holonym placetype; just put in 'Capital cities'.
insert(retcats, "Capital cities")
end
return retcats
end
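--[=[
Illustrative sketch only, not used by the module: the two-step capital-category lookup described above, reduced to a
standalone function. The table below is a hypothetical stand-in for `export.placetype_to_capital_cat`; its entries are
invented for the example and are not the module's real data.

```lua
-- Hypothetical stand-in for `export.placetype_to_capital_cat` (entries invented for illustration).
local placetype_to_capital_cat = {
	["region"] = "regional capitals",
	["autonomous community"] = "autonomous community capitals",
}

local function lookup_capital_cat(holonym_placetype)
	-- First try the placetype exactly as given.
	local cat = placetype_to_capital_cat[holonym_placetype]
	if not cat then
		-- "^.* " is greedy, so it removes everything up to and including the last space.
		-- The extra parentheses keep only the first return value of gsub().
		local last_word = (holonym_placetype:gsub("^.* ", ""))
		cat = placetype_to_capital_cat[last_word]
	end
	return cat
end

print(lookup_capital_cat("autonomous region"))     -- regional capitals
print(lookup_capital_cat("autonomous community"))  -- autonomous community capitals
print(lookup_capital_cat("union territory"))       -- nil
```

Trying the full placetype first is what lets a multiword placetype with its own entry (here `autonomous community`)
win over the truncated form.
]=]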
--[=[
This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used
in two ways:
# To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and
[[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym.
# To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym
description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this
case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments,
administrative regions, and for the entire country, and for example we only want to categorize a demonym into
[[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym
invocation only adds the most specific holonym category and not the category of any containing polity (hence if we
add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]).
This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston`
as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and
[[:Category:en:สถานที่ในthe United States]]. The city handler tries to deal with the possibility of multiple cities
having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]],
[[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing
polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`).
If instead some other state or country is mentioned, the handler returns nothing; if no state or country is mentioned at
all, it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions
Ohio and a holonym for a Columbus in a different country is encountered, because of the function
`augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered.
The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding
language code).
]=]
local function generic_place_cat_handler(data)
local from_demonym = data.from_demonym
local retcats = {}
local function insert_retkey(key, spec)
if from_demonym then
insert(retcats, key)
else
insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec)))
end
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
if not spec.no_generic_place_cat then
-- This applies to continents and continental regions.
insert_retkey(key, spec)
end
-- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in
-- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when
-- no_container_cat is set (e.g. for 'United Kingdom').
if not spec.no_container_cat then
for _, container_set in ipairs(container_trail) do
local stop_adding_containers = false
for _, container in ipairs(container_set) do
if not container.spec.no_generic_place_cat then
insert_retkey(container.key, container.spec)
end
if container.spec.no_container_cat then
stop_adding_containers = true
end
end
if stop_adding_containers then
break
end
end
end
return retcats
end
end
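--[=[
Illustrative sketch only, not used by the module: how the handler above walks the container trail and stops once a
container with `no_container_cat` has been processed. The arguments mimic the values returned by
`find_matching_holonym_location()`, but the sample data and the placement of the flags are assumptions made for the
example. The call to `export.get_prefixed_key()` is left out, so keys are used as-is.

```lua
local function collect_generic_cats(key, spec, container_trail)
	local cats = {}
	if not spec.no_generic_place_cat then
		table.insert(cats, "สถานที่ใน" .. key)
	end
	if spec.no_container_cat then
		return cats
	end
	for _, container_set in ipairs(container_trail) do
		local stop = false
		for _, container in ipairs(container_set) do
			if not container.spec.no_generic_place_cat then
				table.insert(cats, "สถานที่ใน" .. container.key)
			end
			-- Finish the current level, then stop climbing.
			stop = stop or container.spec.no_container_cat
		end
		if stop then
			break
		end
	end
	return cats
end

-- Invented sample data: one container per level, innermost level first.
local trail = {
	{ {key = "Japan", spec = {no_container_cat = true}} },
	{ {key = "Asia", spec = {no_generic_place_cat = true}} },
}
local cats = collect_generic_cats("Osaka Prefecture, Japan", {}, trail)
print(table.concat(cats, "; "))
```

With this data the result holds two categories, one for the prefecture and one for Japan; the `Asia` level is never
reached because the `Japan` entry sets `no_container_cat`.
]=]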
--[==[
Special category handler run for all placetypes that checks for specified division placetypes of known locations and
categorizes appropriately.
]==]
function export.political_division_cat_handler(data)
if data.from_demonym then
return
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
local sgdiv = export.maybe_singularize_placetype(div.type) or div.type
local prep = div.prep or "ของ"
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
if not export.placetype_data[sgdiv] then
internal_error("Placetype %s associated with known location key %s and data %s not found in " ..
"`placetype_data`", sgdiv, key, spec)
end
if sgdiv == data.entry_placetype then
local retcats = {}
for _, pt_cat in ipairs(cat_as) do
if type(pt_cat) == "string" then
pt_cat = {type = pt_cat}
end
local pt_prep = pt_cat.prep or prep
insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th
end
return retcats
end
end
end
end
end
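--[=[
Illustrative sketch only, not used by the module: the normalization that the handler above applies to `divs` and
`addl_divs`, where a division list may be a single string or a list whose items are strings or tables. The field names
`type`, `prep` and `cat_as` come from the handler; the sample values are invented, and `cat_as` is kept as a single
value here although the handler also accepts a list.

```lua
local function normalize_divs(divs)
	local result = {}
	if divs == nil then
		return result
	end
	if type(divs) ~= "table" then
		-- A bare string stands for a one-element list.
		divs = {divs}
	end
	for _, div in ipairs(divs) do
		if type(div) == "string" then
			div = {type = div}
		end
		table.insert(result, {
			type = div.type,
			prep = div.prep or "ของ",
			cat_as = div.cat_as or div.type,
		})
	end
	return result
end

local divs = normalize_divs({"states", {type = "union territories", cat_as = "territories"}})
for _, div in ipairs(divs) do
	print(div.type, div.prep, div.cat_as)
end
```

Normalizing up front means the comparison against the entry placetype only ever has to deal with one shape of item.
]=]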
--[==[
This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any
foreign-language terms that are translations of the state of Georgia. We look at the page title (or its overridden value
in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the
modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the
entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the
country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized
into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to
make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is
intended for Newark, New Jersey).
]==]
function export.get_bare_categories(args, overall_place_spec)
local bare_cats = {}
local place_descs = overall_place_spec.descs
local possible_placetypes_by_place_desc = {}
for i, place_desc in ipairs(place_descs) do
possible_placetypes_by_place_desc[i] = {}
for _, placetype in ipairs(place_desc.placetypes) do
if not export.placetype_is_ignorable(placetype) then
local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true})
for _, equiv in ipairs(equivs) do
insert(possible_placetypes_by_place_desc[i], equiv.placetype)
end
end
end
end
local function check_term(term)
-- Treat Wikipedia links like local ones.
term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
term = export.remove_links_and_html(term)
term = term:gsub("^the ", "")
for i, place_desc in ipairs(place_descs) do
-- Iterate over all matching locations in case there are multiple, as with Delhi defined as
-- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}.
for group, key, spec, container_trail in export.iterate_matching_holonym_location {
holonym_placetype = possible_placetypes_by_place_desc[i],
holonym_placename = term,
place_desc = place_desc,
} do
insert(bare_cats, key)
end
end
end
-- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)?
-- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There
-- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the
-- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The
-- advantage of checking when the language isn't English is we catch those places that fail to give an English
-- translation but where the translation happens to be the same as the other-language spelling. However, I don't
-- know how often this situation occurs.
check_term(args.pagename or mw.title.getCurrentTitle().subpageText)
for _, t in ipairs(args.t) do
check_term(t)
end
local function check_termobj_list(terms)
for _, term in ipairs(terms) do
if term.eq then
check_term(term.eq)
end
if term.alt or term.term then
check_term(term.alt or term.term)
end
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
local arg = extra_info_terms.arg
if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then
check_termobj_list(extra_info_terms.terms)
end
end
for _, directive in ipairs(overall_place_spec.directives) do
check_termobj_list(directive.terms)
end
return bare_cats
end
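--[=[
Illustrative sketch only, not used by the module: the term clean-up done by `check_term()` above before a term is
matched against known locations. The link stripping here is a deliberately crude stand-in for
`export.remove_links_and_html()`; it handles only simple and piped wikilinks and no HTML.

```lua
local function normalize_term(term)
	-- Treat Wikipedia links like local ones.
	term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
	-- Crude stand-in for remove_links_and_html(): piped links first, then simple links.
	term = term:gsub("%[%[[^%[%]|]-|([^%[%]]-)%]%]", "%1")
	term = term:gsub("%[%[([^%[%]]-)%]%]", "%1")
	-- Drop a leading lowercase article.
	term = term:gsub("^the ", "")
	return term
end

print(normalize_term("[[w:the Bahamas|the Bahamas]]"))  -- Bahamas
print(normalize_term("[[Georgia]]"))                    -- Georgia
print(normalize_term("the Netherlands"))                -- Netherlands
```

Assigning each `gsub()` result back to `term` keeps only the first return value, so the substitution count never leaks
into the returned value.
]=]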
--[==[
This is used to augment the holonyms associated with a place description with the containing polities. For example,
given the following:
`# {{tl|place|en|subprefecture|pref/Hokkaido}}.`
We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]].
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
for _, place_desc in ipairs(place_descs) do
if place_desc.holonyms then
-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
-- modifier, which causes category processing to restart at that point after generating categories for a
-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do
local holonym = place_desc.holonyms[i]
if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then
local group, key, spec, container_trail = export.find_matching_holonym_location {
holonym_placetype = holonym.placetype,
holonym_placename = holonym.unlinked_placename,
holonym_index = i,
place_desc = place_desc,
}
if group and container_trail[1] and not spec.no_auto_augment_container then
local augmented_holonyms = {}
for j = 1, i do
insert(augmented_holonyms, place_desc.holonyms[j])
end
for _, containers in ipairs(container_trail) do
local any_no_auto_augment_container = false
for _, container in ipairs(containers) do
any_no_auto_augment_container = any_no_auto_augment_container or
container.spec.no_auto_augment_container
local containing_type = container.spec.placetype
if type(containing_type) == "table" then
-- If the containing type is a list, use the first element as the canonical variant.
containing_type = containing_type[1]
end
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
-- Don't side-effect holonyms while processing them.
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to
-- set display_placename.
placetype = containing_type,
-- placename_to_key() for the group should correctly handle both full and elliptical
-- placenames, but the full placename seems less likely to be ambiguous. FIXME: We
-- should just store the key directly and use it when available to avoid having to
-- convert key to placename and back to key.
unlinked_placename = full_container_placename,
-- Indicate that this is an augmented holonym, and was derived from the specified
-- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms
-- derived from holonyms that are different from the holonym we're searching for but
-- of the same placetype. This is to correctly handle a situation like
-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
-- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match
-- in find_matching_holonym_location() because of the mismatch between augmented
-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
-- `Ardèche`) against any known location. To deal with this, we mark augmented
-- holonyms as being augmented due to a source holonym, and when processing a given
-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
-- The restriction to the same placetype is so that `Birmingham` still gets
-- correctly disambiguated to Birmingham, England in the example given above near
-- the top of this function, using the augmented holonym `c/United Kingdom` added by
-- the specified `cc/England` (whose placetype `constituent country` differs from
-- the placetype `city` of Birmingham).
augmented_from_holonym = holonym,
}
insert(augmented_holonyms, new_holonym)
-- But it is safe to modify other parts of the place_desc.
export.key_holonym_into_place_desc(place_desc, new_holonym)
end
if any_no_auto_augment_container then
break
end
end
for j = i + 1, #place_desc.holonyms do
insert(augmented_holonyms, place_desc.holonyms[j])
end
place_desc.holonyms = augmented_holonyms
end
end
end
end
end
end
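--[=[
Illustrative sketch only, not used by the module: the right-to-left augmentation loop above, reduced to its list
handling. The `container_of` table is a hypothetical lookup standing in for `find_matching_holonym_location()` and
[[Module:place/locations]]; it does none of the disambiguation or mismatch checking that the real code performs.

```lua
-- Hypothetical container lookup (invented for the example).
local container_of = {
	["Hokkaido"] = {placetype = "country", unlinked_placename = "Japan"},
	["England"] = {placetype = "country", unlinked_placename = "United Kingdom"},
}

local function augment(holonyms)
	-- Walk from right to left so containers added for later holonyms are visible to earlier ones.
	for i = #holonyms, 1, -1 do
		local source = holonyms[i]
		local container = container_of[source.unlinked_placename]
		if container then
			local augmented = {}
			for j = 1, i do
				table.insert(augmented, holonyms[j])
			end
			-- Insert the container directly after its source and remember where it came from.
			table.insert(augmented, {
				placetype = container.placetype,
				unlinked_placename = container.unlinked_placename,
				augmented_from_holonym = source,
			})
			for j = i + 1, #holonyms do
				table.insert(augmented, holonyms[j])
			end
			-- Replace the list rather than modifying it in place.
			holonyms = augmented
		end
	end
	return holonyms
end

local result = augment {
	{placetype = "city", unlinked_placename = "Birmingham"},
	{placetype = "constituent country", unlinked_placename = "England"},
}
for _, h in ipairs(result) do
	print(h.placetype .. "/" .. h.unlinked_placename)
end
```

In this run `country/United Kingdom` is inserted right after `constituent country/England` before `city/Birmingham` is
examined, which is the ordering the comment above relies on for disambiguating Birmingham.
]=]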
-- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions. We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
-- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
-- categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
-- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
-- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
-- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
-- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
-- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and
-- note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
-- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
-- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
-- categorization.
local function district_neighborhood_cat_handler(data)
local function get_plural_entry_placetype(location_spec, container_trail)
if data.entry_placetype == "suburb" then
return "Suburbs"
else
-- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling
if uses_british_spelling == nil and container_trail then
for _, container_set in ipairs(container_trail) do
local must_outer_break = false
for _, container in ipairs(container_set) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
end
end
-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
-- etc.)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and spec.is_city then
return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
end
-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
local has_neighborhoods
local entry_placetype = data.entry_placetype
if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
has_neighborhoods = true
else
-- Otherwise, make sure the current holonym is city-like.
has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
return export.get_placetype_prop(pt, "has_neighborhoods")
end, {continue_on_nil_only = true})
end
if has_neighborhoods then
-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". We need
-- to start with the current holonym, which is especially important for neighborhoods and suburbs that
-- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously
-- we skipped the first/current holonym.)
for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc,
data.holonym_index) do
local other_holonym_data = {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = data.place_desc,
}
local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data)
if group and not spec.is_former_place then
return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") ..
export.get_prefixed_key(key, spec)}
end
end
end
end
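--[=[
Illustrative sketch only, not used by the module: how the spelling of "neighbo(u)rhoods" is resolved above. A setting
on the location's own spec wins, including an explicit `false`; otherwise the first container that sets
`british_spelling` decides. The sample trail is invented for the example.

```lua
local function neighborhood_word(spec, container_trail)
	local british = spec.british_spelling
	if british == nil and container_trail then
		for _, container_set in ipairs(container_trail) do
			for _, container in ipairs(container_set) do
				if container.spec.british_spelling ~= nil then
					british = container.spec.british_spelling
					break
				end
			end
			if british ~= nil then
				break
			end
		end
	end
	return british and "Neighbourhoods" or "Neighborhoods"
end

-- Invented trail: the first level is silent, the second level sets the flag.
local sample_trail = {
	{ {spec = {}} },
	{ {spec = {british_spelling = true}} },
}
print(neighborhood_word({}, sample_trail))                          -- Neighbourhoods
print(neighborhood_word({british_spelling = false}, sample_trail))  -- Neighborhoods
print(neighborhood_word({}, nil))                                   -- Neighborhoods
```

Comparing against `nil` rather than testing truthiness is what allows a location to opt out with
`british_spelling = false` even when a container opts in.
]=]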
function export.check_already_seen_string(holonym_placename, already_seen_strings)
local canon_placename = ulower(m_links.remove_links(holonym_placename))
if type(already_seen_strings) ~= "table" then
already_seen_strings = {already_seen_strings}
end
for _, already_seen_string in ipairs(already_seen_strings) do
if canon_placename:find(already_seen_string) then
return true
end
end
return false
end
-- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display
-- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already.
-- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or
-- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym
-- placename, ignoring case and links. If the prefix isn't already present, we prepend it to a link
-- to the raw placename (e.g. "County [[Cork]]"), unless the holonym already has a link in it, in
-- which case we just add the prefix in front of the existing link.
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return prefix .. " " .. holonym_placename
end
return prefix .. " [[" .. holonym_placename .. "]]"
end
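--[=[
Illustrative sketch only, not used by the module: the three outcomes of the prefix handler above. This version makes
two simplifying assumptions that differ from the real code: it strips only the double brackets instead of calling
`m_links.remove_links()`, and it uses ASCII `lower()` with a plain-text `find()` instead of `ulower()` and a Lua
pattern search.

```lua
local function add_prefix(prefix, placename)
	-- Crude canonical form: no double brackets, lowercase.
	local canon = placename:gsub("%[%[", ""):gsub("%]%]", ""):lower()
	if canon:find(prefix:lower(), 1, true) then
		-- The prefix is already there; leave the placename alone.
		return placename
	end
	if placename:find("%[%[") then
		-- Already linked: put the prefix in front of the existing link.
		return prefix .. " " .. placename
	end
	-- Bare placename: link it and put the prefix in front.
	return prefix .. " [[" .. placename .. "]]"
end

print(add_prefix("County", "Cork"))         -- County [[Cork]]
print(add_prefix("County", "[[Cork]]"))     -- County [[Cork]]
print(add_prefix("County", "County Cork"))  -- County Cork
```

The already-seen check runs first so that a holonym written out in full is never given a doubled prefix.
]=]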
-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return holonym_placename .. " " .. suffix
end
if include_suffix_in_link then
return "[[" .. holonym_placename .. " " .. suffix .. "]]"
else
return "[[" .. holonym_placename .. "]] " .. suffix
end
end
-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.new_york_boroughs[unlinked_placename] then
-- Hack: don't display "borough" after the names of NYC boroughs
return holonym_placename
end
return suffix_display_handler("borough", holonym_placename)
end
local function county_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
m_locations.northern_ireland_counties["County " .. unlinked_placename .. ", Northern Ireland"] then
return prefix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County".
if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County".
if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- FIXME, we need the same for US counties but need to key off the country, not the specific county.
-- Others are displayed as-is.
return holonym_placename
end
-- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture".
-- Others are displayed as e.g. "[[Fthiotida]] prefecture".
local function prefecture_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture"
return suffix_display_handler(suffix, holonym_placename)
end
-- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized
-- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is.
local function province_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if
m_locations.iran_provinces[unlinked_placename .. ", Iran"] or
m_locations.laos_provinces[unlinked_placename .. ", Laos"] or
m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or
m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or
m_locations.thailand_provinces[unlinked_placename .. ", ไทย"] or
m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or
m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then
return suffix_display_handler("จังหวัด", holonym_placename)
end
return holonym_placename
end
-- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is.
local function state_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then
return suffix_display_handler("รัฐ", holonym_placename)
end
return holonym_placename
end
-- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]].
local function voivodesip_display_handler(holonym_placetype, holonym_placename)
return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link")
end
------------------------------------------------------------------------------------------
-- Placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are
placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value
is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form
`สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are
used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the
specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories
like [[:Category:States and territories of Australia]]).
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the
placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are
wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed
directly in the placetype data; everything else is handled through category handlers, either per-placetype or special
(such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate
categories is described at the top of [[Module:place]].
There are several recognized property keys, of various types:
1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
placetypes (which may occur in entry placetypes, esp. new-format ones, such as `two <<islands>>`, and may occur in
categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
`<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign. For example, if `cercle` specifies
`"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified.
*# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it
will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as
`<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given.
*# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is
called on the string, which will correctly pluralize most strings, including those with links in them. (If there
are multiple links, the display form of the last link is pluralized.)
*# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as
an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with
the qualifiers `former`, `ancient`, `historical` and such.
* `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of
the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the
value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays
as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if
this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs
especially with multiword placetypes where something other than the last word is pluralized. An example is
`town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. This example uses
`link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian
Bokmål word, and template calls aren't currently permitted in link strings), along with
`plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to
the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only
placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of
`category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it,
spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the
value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which
just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a
separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which
clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories
where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides
`category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides
`category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. If given, this overrides `category_link` for
this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol>
<li>`link = true`</li>
<li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li>
<li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li>
</ol> This has the effect of making the entry placetype `neighborhood` display as just
`<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like
`Neighborhoods in Illinois, USA` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`.
* `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
* `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
2. There is currently one fallback-related property key recognized:
* `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories
get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets
`preposition = "ของ"` (Thai for "of"), meaning that {{tl|place|en|branch|riv/Mississippi}} displays as
`a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the
same as `river`. A more complex
example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that
checks for cities or city-like entities (e.g. boroughs) occurring as holonyms and categorizes the toponym under
[[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for
the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if
`c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these
categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be
categorized as e.g. [[:Category:Geographic and cultural areas of England]].
3. There is currently one property to control irregular plurals of placetypes:
* `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in
[[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`,
`-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent;
for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized
as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value
even when the default pluralization algorithm works correctly, if the default singularization algorithm won't
correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`).
4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those
categories:
* `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by
a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated
alike); (b) to determine the parent category of bare placetype categories (e.g. [[:Category:Villages]] for placetype
`village`); (c) to determine whether to add a parent category `political divisions of specific countries` to
qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are:
*# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire.
*# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement.
*# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a
settlement, such as wards and barangays.
*# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an
unincorporated community, farm or neighborhood.
*# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital
any more.
*# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc.
*# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university,
metro station, park or the like.
*# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary
greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`,
etc. qualifier has no effect on the category of these placetypes.
*# `generic place`: a place that isn't further qualified into any specific subtype.
* `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`,
`ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of
`dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc.
qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class`
is used. When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those
in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers
(one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are
looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map
`medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes
`ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by
`get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default
category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where
`kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data`
for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is
used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data`
but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal
error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have
an entry for `greenhouse`), we just track the occurrence and end up not categorizing.
* `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the
placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for
placetype `named buildings!`). If not specified, the first parent category is determined by the value of `class`,
using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]].
* `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent`
just above).
* `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of
`bare_category_parent` if it is a string.
* `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or
`ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that
always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is
a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the
implementation is the same.
* `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the
placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which
category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`,
`neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like
`Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like
`Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized
city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers
iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or
more categories are returned; see below for more specifics. (Note that countries for which e.g. a `district` is a
political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but
by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the
resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The
return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the
holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype
with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields:
** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an
entry in `placetype_data` but may not be the original placetype given by the user);
** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed;
** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME:
we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms);
** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]];
** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or
{{tl|demonym-noun}}, instead of being triggered by {{tl|place}}.
* `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the
`district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `;
see the section just above on `cat_handler`.
5. The following preposition-related property keys are recognized:
* `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`
(Thai for "in").
* `generic_before_non_cities`: If specified, the appropriate category description handler in
[[Module:category tree/topic cat/data/Places]] will recognize categories of the form
` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This
is used to generate descriptions for categories added by category handlers and by explicit category specs in the
placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify
a value for `class` so that the category tree code can determine whether it's a political or non-political division.
* `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities.
6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype:
* `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values
are:
*# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly
follows an entry placetype);
*# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple);
*# `"suf"` (the holonym will display as `Holonym placetype`);
*# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized).
* `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym.
Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly
using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype
`administrative region` specifies `suffix = "ภูมิภาค"` (Thai for "region").
* `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym.
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the
holonym. If `affix` is given together with `suffix` or `prefix` for a single placetype, `suffix` or `prefix` takes
precedence.
* `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix
requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies
`affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies
`no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified,
without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word.
* `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym).
Its return value is a string specifying the display form of the holonym.
7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms
of the specified placetype.
* `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype.
* `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry
placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article
`"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins
with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent
placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype
specified.
* `holonym_use_the`: Use `"the"` before holonyms of this placetype.
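As a rough illustration of the `link` specs described above, the following minimal sketch expands a spec into wikitext.
The function `format_link_spec` and its `plural` argument are hypothetical names used only for this example and are not
part of the module; pluralization of free-form strings (done by `pluralize()` in the real code) is deliberately left out.

```lua
-- Hypothetical helper: expand a `link` spec for `placetype` into wikitext.
-- `plural` is the plural form to display, or nil when the placetype is singular.
local function format_link_spec(placetype, spec, plural)
	local display = plural or placetype
	if spec == true then
		-- Raw link to the same-named entry; two-part link when plural.
		if plural then
			return "[[" .. placetype .. "|" .. plural .. "]]"
		end
		return "[[" .. placetype .. "]]"
	elseif spec == false then
		error("placetype '" .. placetype .. "' is not allowed as an entry placetype")
	elseif type(spec) ~= "string" then
		error("unrecognized link spec for placetype '" .. placetype .. "'")
	elseif spec == "w" then
		-- Two-part link to the same-named Wikipedia article.
		return "[[w:" .. placetype .. "|" .. display .. "]]"
	elseif spec == "separately" then
		-- Link each word on its own; a word whose plural differs gets a two-part link.
		local sing_words, pl_words = {}, {}
		for word in placetype:gmatch("%S+") do
			table.insert(sing_words, word)
		end
		for word in display:gmatch("%S+") do
			table.insert(pl_words, word)
		end
		local parts = {}
		for i, word in ipairs(sing_words) do
			local shown = pl_words[i] or word
			if shown == word then
				parts[i] = "[[" .. word .. "]]"
			else
				parts[i] = "[[" .. word .. "|" .. shown .. "]]"
			end
		end
		return table.concat(parts, " ")
	elseif spec:sub(1, 1) == "+" then
		-- Two-part link to the page named after the `+` sign.
		return "[[" .. spec:sub(2) .. "|" .. display .. "]]"
	else
		-- Any other string is used directly (pluralization is omitted in this sketch).
		return spec
	end
end
```

Under these assumptions, `format_link_spec("cercle", "+w:cercles of Mali", "cercles")` yields
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>`, matching the example given for the `"+..."` spec.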
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which
must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a
fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the
fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either
directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this
placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back
to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in
`placetype_data` or an internal error occurs.
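
The fallback rules in the last note can be pictured with the following minimal sketch. The table `toy_data` and the
function `resolve_property` are hypothetical and exist only for this illustration; the module's real lookup handles more
cases (such as qualifiers and unrecognized placetypes), so treat this as an approximation of the idea rather than the
actual implementation.

```lua
-- Toy data mirroring the multi-level fallback example (not the real table).
local toy_data = {
	["frazione"] = {fallback = "hamlet"},
	["hamlet"] = {fallback = "village"},
	["village"] = {class = "settlement"},
}

-- Hypothetical helper: find `key` on `placetype` or on the nearest fallback that defines it.
-- Returns the value and the placetype it was found on, or nil if no placetype in the chain has it.
local function resolve_property(data, placetype, key)
	local seen = {}
	local current = placetype
	while current do
		if seen[current] then
			-- A fallback loop is treated as an internal error.
			error("fallback loop involving placetype '" .. current .. "'")
		end
		seen[current] = true
		local props = data[current]
		if not props then
			-- Every placetype named as a fallback must exist in the data.
			error("placetype '" .. current .. "' not found in data")
		end
		if props[key] ~= nil then
			return props[key], current
		end
		current = props.fallback
	end
	return nil
end

local class, found_on = resolve_property(toy_data, "frazione", "class")
-- class is "settlement" and found_on is "village"
```

Tracking visited placetypes in `seen` is one simple way to turn a fallback loop into an immediate error instead of an
endless lookup.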
]==]
export.placetype_data = {
--[=[
If you need to sort the following, do this (using Vim):
1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line
entries.
2. Make sure the table uses tabs everywhere for indent, and not spaces.
3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
:'a,.s/\n/\\n/g
:s/\\n\(\t\[\)/\r\1/g
The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while
the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to
a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one
command.)
4. Execute the following to sort:
:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station"
before "hill" and "county borough" before "county" because the space after e.g. "hill station" sorts before the
quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by
` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern).
5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
:'a,.s/\\n/\r/g
Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but
to insert a newline on the right side of a replacement you must use \r.
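
For readers unfamiliar with the decorate-sort-undecorate pattern used in step 4, here is a minimal Lua sketch of the
same idea. The function name `sort_entries` and the sample lines are made up for illustration, and the comparison
assumes plain byte order after ASCII lowercasing; the Perl pipeline above remains the actual procedure.

```lua
-- Hypothetical helper: sort single-line table entries by the key inside `["..."]`.
local function sort_entries(lines)
	-- Decorate: pair each line with its extracted, lowercased key.
	local decorated = {}
	for i, line in ipairs(lines) do
		local key = line:match('^\t%["(.-)"') or line
		decorated[i] = {key = key:lower(), line = line}
	end
	-- Sort on the key alone, so the closing quotation mark never takes part in the comparison.
	table.sort(decorated, function(a, b) return a.key < b.key end)
	-- Undecorate: keep only the original lines, now in key order.
	local sorted = {}
	for i, item in ipairs(decorated) do
		sorted[i] = item.line
	end
	return sorted
end

local sample = sort_entries({
	'\t["hill station"] = {link = true},',
	'\t["hill"] = {link = true},',
})
-- sample[1] is the "hill" entry and sample[2] is the "hill station" entry
```

Sorting the raw lines instead would compare the space after `hill` in `hill station` against the quotation mark after
`hill`, which is exactly the ordering problem the note above describes.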
]=]
["*"] = {
link = false,
cat_handler = generic_place_cat_handler,
},
["administrative atoll"] = {
-- Maldives
link = "+w:administrative divisions of the Maldives",
preposition = "ของ",
class = "subpolity",
},
["administrative capital"] = {
link = "w",
fallback = "capital city",
},
["administrative center"] = {
link = "w",
fallback = "non-city capital",
},
["administrative centre"] = {
link = "w",
fallback = "administrative center",
},
["administrative county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["administrative district"] = {
link = "w",
fallback = "อำเภอ",
},
["administrative headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["administrative region"] = {
link = true,
preposition = "ของ",
suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
fallback = "ภูมิภาค",
class = "subpolity",
},
["administrative seat"] = {
link = "w",
fallback = "administrative centre",
},
["administrative territory"] = {
link = "separately",
preposition = "ของ",
suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
fallback = "ดินแดน",
class = "subpolity",
},
["administrative unit"] = {
-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
link = "w",
class = "subpolity",
},
["administrative village"] = {
link = "w",
preposition = "ของ",
has_neighborhoods = true,
class = "settlement",
},
["aimag"] = {
-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
link = "w",
fallback = "prefecture",
},
["airport"] = {
link = true,
class = "man-made structure",
default = {true},
},
["alliance"] = {
link = true,
fallback = "confederation",
},
["archipelago"] = {
link = true,
fallback = "เกาะ",
},
["area"] = {
link = true,
preposition = "ของ",
fallback = "geographic and cultural area",
-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
class = "subpolity",
former_type = "geographic region",
cat_handler = district_neighborhood_cat_handler,
},
["arm"] = {
link = true,
preposition = "ของ",
class = "natural feature",
default = {"ทะเล"},
},
["arrondissement"] = {
link = true,
preposition = "ของ",
-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity",
has_neighborhoods = true,
},
["associated province"] = {
link = "separately",
fallback = "จังหวัด",
},
["atoll"] = {
-- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to
-- conditionalize `class` on the country. See also `administrative atoll`.
link = true,
class = "natural feature",
bare_category_parent = "เกาะ",
default = {true},
},
["autonomous city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
has_neighborhoods = true,
},
["autonomous community"] = {
-- Spain; refers to regional entities, not village-like entities, as might be expected from "community"
link = true,
preposition = "ของ",
class = "subpolity",
},
["autonomous island"] = {
-- Comoros; seems like an administrative atoll of the Maldives.
link = "+w:autonomous islands of Comoros",
preposition = "ของ",
class = "subpolity",
},
["autonomous oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "oblast",
class = "subpolity",
},
["autonomous okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "okrug",
class = "subpolity",
},
["autonomous prefecture"] = {
link = true,
fallback = "prefecture",
},
["autonomous province"] = {
link = "w",
fallback = "จังหวัด",
},
["autonomous region"] = {
link = "w",
preposition = "ของ",
fallback = "administrative region",
-- "administrative region" sets a suffix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region"
-- if the user writes 'ar:Suf/Tibet'.
affix = "autonomous region",
},
["autonomous republic"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territorial unit"] = {
-- Moldova; only two of them, one for Gagauzia and one for Transnistria.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territory"] = {
link = "w",
fallback = "dependent territory",
},
["bailiwick"] = {
-- Jersey, etc.
link = true,
fallback = "องค์การทางการเมือง",
},
["barangay"] = {
-- Philippines
link = true,
class = "settlement",
-- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use
-- some of the properties of a neighborhood.
fallback = "neighborhood",
},
["barrio"] = {
-- Spanish-speaking countries; Philippines
link = true,
-- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city.
-- `class` will need to conditionalize on the country to be completely correct.
fallback = "neighborhood",
},
["basin"] = {
link = true,
fallback = "ทะเลสาบ",
},
["bay"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["beach"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"water"},
default = {true},
},
["beach resort"] = {
link = "w",
fallback = "resort town",
},
["bishopric"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["bodies of water!"] = {
-- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to
-- straighten out the type vs. name vs. related-to issue.
category_link = "[[body of water|bodies of water]]",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"},
},
["borough"] = {
link = true,
preposition = "ของ",
display_handler = borough_display_handler,
has_neighborhoods = true,
-- "former borough" could be a former settlement or a former part of a city but seems more likely to
-- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this
-- properly.
class = "subpolity",
-- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger.
},
["borough seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["branch"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["bridge"] = {
link = true,
class = "man-made structure",
default = {"Named bridges"},
},
["building"] = {
link = true,
class = "man-made structure",
default = {"Named buildings"},
},
["built-up area"] = {
link = "w",
fallback = "area",
},
["burgh"] = {
link = true,
fallback = "borough",
},
["business park"] = {
link = true,
fallback = "park",
},
["caliphate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["canton"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["cape"] = {
link = true,
fallback = "headland",
},
["capital"] = {
link = true,
fallback = "capital city",
},
["capital city"] = {
link = true,
category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
bare_category_parent = "นคร",
cat_handler = capital_city_cat_handler,
default = {true},
-- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}}
-- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't
-- match against the placetype 'city' of Melbourne.
fallback = "นคร",
},
["caplc"] = {
link = "[[capital]] and [[large]]st [[city]]",
plural_link = false,
fallback = "capital city",
},
["captaincy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["caravan city"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"ANCIENT", "FORMER"},
},
["castle"] = {
link = true,
fallback = "building",
},
["cathedral city"] = {
link = true,
fallback = "นคร",
},
["cattle station"] = {
-- Australia
link = true,
fallback = "farm",
},
["census area"] = {
link = true,
affix_type = "Suf",
has_neighborhoods = true,
class = "non-admin settlement",
},
["census-designated place"] = {
-- United States
link = true,
class = "non-admin settlement",
},
["census division"] = {
-- Canada
link = "w",
preposition = "ของ",
class = "subpolity",
},
["census town"] = {
link = "w",
fallback = "เมือง",
},
["central business district"] = {
link = true,
fallback = "neighborhood",
},
["cercle"] = {
-- Mali
link = "+w:cercles of Mali",
preposition = "ของ",
class = "subpolity",
},
["ceremonial county"] = {
link = true,
fallback = "เทศมณฑล",
},
["chain of islands"] = {
link = "[[chain]] of [[island]]s",
plural = "chains of islands",
plural_link = "[[chain]]s of [[island]]s",
fallback = "เกาะ",
},
["channel"] = {
link = true,
fallback = "strait",
},
["charter community"] = {
-- Northwest Territories, Canada
link = "w",
fallback = "village",
},
["นคร"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "Cities in +++", "Countries in +++", "National capitals"},
default = {"City-states", "นคร", "ประเทศ", "National capitals"},
},
["civil parish"] = {
-- Mostly England; similar to municipalities
link = true,
preposition = "ของ",
affix_type = "suf",
has_neighborhoods = true,
class = "subpolity",
},
["claimed political division"] = {
link = "[[claim]]ed [[political]] [[division]]",
class = "subpolity",
default = {true},
},
["co-capital"] = {
link = "[[co-]][[capital]]",
fallback = "capital city",
},
["coal city"] = {
link = "+w:coal town",
fallback = "นคร",
},
["coal town"] = {
link = "w",
fallback = "เมือง",
},
["collectivity"] = {
link = "w",
preposition = "ของ",
-- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities)
class = "subpolity",
},
["colony"] = {
link = true,
fallback = "dependent territory",
},
["comarca"] = {
-- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of
-- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it
-- sits between municipalities and provinces, something like a county or district.
link = true,
preposition = "ของ",
class = "subpolity",
},
["commandery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["commonwealth"] = {
link = true,
preposition = "ของ",
-- No default; applies specifically to Puerto Rico
class = "subpolity",
},
["commune"] = {
link = true,
fallback = "เทศบาล",
},
["community"] = {
link = true,
category_link = "[[community|communities]] of all sizes",
fallback = "village",
},
["community development block"] = {
-- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be
-- neighborhoods so I'm not setting `has_neighborhoods` for now
link = "w",
affix_type = "suf",
no_affix_strings = "block",
class = "subpolity",
},
["comune"] = {
-- Italy, Switzerland
link = true,
fallback = "เทศบาล",
},
["condominium"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["confederacy"] = {
link = true,
fallback = "confederation",
},
["confederation"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["constituency"] = {
-- currently we have them as political divisions of Namibia but many countries have them
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent country"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent part"] = {
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["constituent republic"] = {
-- Of Russia, Yugoslavia, etc.
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["counties and county-level cities!"] = {
-- This is used when grouping counties and county-level cities under prefecture-level cities in China.
category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
class = "subpolity",
},
["continent"] = {
link = true,
category_link = false, -- can't occur as a bare category
class = "natural feature",
default = {"Continents and continental regions"},
},
["continental region"] = {
link = "separately",
category_link = false, -- can't occur as a bare category
class = "geographic region",
fallback = "continent",
},
["continents and continental regions!"] = {
category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
class = "geographic region",
},
["council area"] = {
link = true,
-- in Scotland; similar to a county
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["ประเทศ"] = {
link = true,
class = "polity", -- do not translate class
["continent/*"] = {true, "ประเทศ"},
default = {true},
},
["country-like entities!"] = {
category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
class = "polity", -- do not translate class
},
["เทศมณฑล"] = {
link = true,
preposition = "ของ",
display_handler = county_display_handler,
class = "subpolity",
},
["county borough"] = {
link = true,
-- in Wales; similar to a county
preposition = "ของ",
affix_type = "suf",
fallback = "borough",
class = "subpolity",
},
["county seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["county town"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
fallback = "เมือง",
has_neighborhoods = true,
class = "capital",
},
["county-administered city"] = {
-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w",
fallback = "นคร",
has_neighborhoods = true,
class = "settlement",
},
["county-controlled city"] = {
-- Taiwan
link = "w",
fallback = "county-administered city",
},
["county-level city"] = {
-- PR China
link = "w",
fallback = "prefecture-level city",
},
["crater lake"] = {
link = true,
fallback = "ทะเลสาบ",
},
["creek"] = {
link = true,
fallback = "stream",
},
["Crown colony"] = {
link = "+crown colony",
fallback = "crown colony",
},
["crown colony"] = {
link = true,
fallback = "colony",
},
["Crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["cultural area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["cultural region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["delegation"] = {
-- Tunisia
link = "+w:delegations of Tunisia",
preposition = "ของ",
class = "subpolity",
},
["department"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["departmental capital"] = {
link = "separately",
fallback = "capital city",
},
["dependency"] = {
link = true,
fallback = "dependent territory",
},
["dependent territory"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
former_type = "dependent territory",
bare_category_parent = "political divisions",
["country/*"] = {true},
default = {true},
},
["desert"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems"},
default = {true},
},
["deserted mediaeval village"] = {
link = "w",
fallback = "deserted medieval village",
},
["deserted medieval village"] = {
link = "w",
fallback = "ANCIENT settlement",
},
["direct-administered municipality"] = {
-- China
link = "+w:direct-administered municipalities of China",
fallback = "เทศบาล",
},
["direct-controlled municipality"] = {
-- several countries
link = "w",
fallback = "เทศบาล",
},
["distributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["อำเภอ"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to
-- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class
-- is "settlement" or "subpolity".
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
-- No default. Countries for which districts are political divisions will get entries.
},
["districts and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Portugal.
category_link = "[[district]]s and [[autonomous region]]s",
class = "subpolity",
},
["districts and autonomous territorial units!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Moldova.
category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s",
class = "subpolity",
},
["district capital"] = {
link = "separately",
fallback = "capital city",
},
["district headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["district municipality"] = {
-- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in
-- South Africa, district municipalities group local municipalities and hence won't have neighborhoods.
link = "w",
preposition = "ของ",
affix_type = "suf",
no_affix_strings = {"อำเภอ", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["division"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["division capital"] = {
link = "separately",
fallback = "capital city",
},
["dome"] = {
link = true,
fallback = "ภูเขา",
},
["dormant volcano"] = {
link = true,
fallback = "volcano",
},
["duchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["emirate"] = {
link = true,
preposition = "ของ",
-- FIXME: Can be subpolities (of the United Arab Emirates).
fallback = "องค์การทางการเมือง",
},
["จักรวรรดิ"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["enclave"] = {
link = true,
preposition = "ของ",
-- Enclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["entity"] = {
-- Bosnia and Herzegovina
link = "+w:entities of Bosnia and Herzegovina",
preposition = "ของ",
class = "subpolity",
},
["escarpment"] = {
link = true,
fallback = "ภูเขา",
},
["ethnographic region"] = {
-- used in Lithuania
link = "+w:ethnographic regions of Lithuania",
fallback = "geographic and cultural area",
},
["exclave"] = {
link = true,
preposition = "ของ",
-- Exclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["external territory"] = {
link = "separately",
fallback = "dependent territory",
},
["farm"] = {
link = true,
class = "non-admin settlement",
default = {"Farms and ranches"},
},
["farms and ranches!"] = {
category_link = "[[farm]]s and [[ranch]]es",
class = "non-admin settlement",
},
["federal city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
},
["federal district"] = {
link = true,
preposition = "ของ",
-- Might have neighborhoods as federal districts are often cities (e.g. Mexico City)
has_neighborhoods = true,
class = "settlement",
},
["federal subject"] = {
-- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais,
-- autonomous okrugs and autonomous oblasts).
link = "w",
preposition = "ของ",
class = "subpolity",
},
["federal territory"] = {
link = "w",
fallback = "ดินแดน",
},
["fictional location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["First Nations reserve"] = {
-- Canada
link = "[[First Nations]] [[w:Indian reserve|reserve]]",
-- Wikipedia uses "Indian reserve"; presumably that is the legal term
fallback = "Indian reserve",
class = "subpolity",
},
["fjord"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["footpath"] = {
link = true,
fallback = "road",
},
["forest"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems", "forestry"},
default = {true},
},
["fort"] = {
link = true,
fallback = "building",
},
["fortress"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- fortresses -> fortresse, so put an entry here to ensure we singularize correctly.
plural = "fortresses",
fallback = "building",
},
["frazione"] = {
link = "w",
fallback = "hamlet",
},
["freeway"] = {
link = true,
fallback = "road",
},
["French prefecture"] = {
link = "[[w:prefectures in France|prefecture]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["geographic and cultural area"] = {
link = "+w:cultural area",
-- `generic_before_non_cities` is used when generating the category description of categories of the format
-- `Geographic and cultural areas of PLACE`. `preposition` is used when generating {{place}} description and
-- categories for any placetype that falls back to `geographic and cultural area`.
generic_before_non_cities = "ของ",
preposition = "ของ",
class = "geographic region",
bare_category_parent = "สถานที่",
["country/*"] = {true},
["constituent country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["geographic area"] = {
link = "+w:geographic region",
fallback = "geographic and cultural area",
},
["geographic region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geopolitical zone"] = {
-- Nigeria
link = true,
preposition = "ของ",
class = "subpolity",
},
["gewog"] = {
-- Bhutan
link = true,
preposition = "ของ",
class = "subpolity",
},
["ghost town"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
bare_category_parent = "former settlements",
cat_handler = city_type_cat_handler,
default = {true},
},
["glen"] = {
link = true,
fallback = "valley",
},
["governorate"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["greater administrative region"] = {
-- China (former division)
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["gromada"] = {
-- Poland (former division)
link = "w",
preposition = "ของ",
affix_type = "Pref",
class = "subpolity",
inherently_former = {"FORMER"},
},
["group of islands"] = {
link = "[[group]] of [[island]]s",
plural = "groups of islands",
plural_link = "[[group]]s of [[island]]s",
fallback = "island group",
},
["gulf"] = {
link = true,
preposition = "ของ",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["hamlet"] = {
link = true,
fallback = "village",
},
["harbor city"] = {
link = "separately",
fallback = "นคร",
},
["harbor town"] = {
link = "separately",
fallback = "เมือง",
},
["harbour city"] = {
link = "separately",
fallback = "นคร",
},
["harbour town"] = {
link = "separately",
fallback = "เมือง",
},
["headland"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["headquarters"] = {
link = "w",
fallback = "administrative centre",
},
["heath"] = {
link = true,
fallback = "moor",
},
["hemisphere"] = {
link = true,
entry_placetype_use_the = true,
fallback = "continental region",
},
["highway"] = {
link = true,
fallback = "road",
},
["hill"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["hill station"] = {
link = "w",
fallback = "เมือง",
},
["hill town"] = {
link = "w",
fallback = "เมือง",
},
["historic region"] = {
-- provided only for the link
link = "+w:historical region",
fallback = "FORMER geographic region",
},
["historical county"] = {
-- needed for historical counties of England/etc.
link = "+w:historic county",
fallback = "FORMER subpolity",
},
["historical region"] = {
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["home rule city"] = {
link = "w",
fallback = "นคร",
},
["home rule municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["hot spring"] = {
link = true,
fallback = "spring",
},
["house"] = {
link = true,
fallback = "building",
},
["housing estate"] = {
-- not the same as a housing project (i.e. public housing)
link = true,
-- not exactly the case but approximately
fallback = "neighborhood",
},
["hromada"] = {
-- Ukraine
link = "w",
disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'",
disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["inactive volcano"] = {
link = "w",
fallback = "dormant volcano",
},
["independent city"] = {
link = true,
fallback = "นคร",
},
["independent town"] = {
link = "+independent city",
fallback = "เมือง",
},
["Indian reservation"] = {
link = "w",
-- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations
-- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts
-- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is,
-- so this must still be the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["Indian reserve"] = {
link = "w",
-- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that
-- is still the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["inland sea"] = {
-- note, we also have 'inland' as a qualifier
link = true,
fallback = "ทะเล",
},
["inner city area"] = {
link = "[[inner city]] [[area]]",
fallback = "neighborhood",
},
["เกาะ"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["island country"] = {
-- FIXME: The following should map to both 'island' and 'country'.
link = "w",
fallback = "ประเทศ",
},
["island group"] = {
link = "separately",
fallback = "เกาะ",
},
["island municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["islet"] = {
link = "w",
fallback = "เกาะ",
},
["Israeli settlement"] = {
link = "w",
class = "settlement",
default = {true},
},
["judicial capital"] = {
link = "w",
fallback = "capital city",
},
["khanate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["kibbutz"] = {
link = true,
plural = "kibbutzim",
class = "non-admin settlement",
default = {true},
},
["kingdom"] = {
link = true,
fallback = "monarchy",
},
["krai"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ทะเลสาบ"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["ธรณีสัณฐาน!"] = {
category_link = "[[ธรณีสัณฐาน]]",
bare_category_parent = "สถานที่",
addl_bare_category_parents = {"โลก"},
},
["largest city"] = {
link = "[[large]]st [[city]]",
entry_placetype_use_the = true,
fallback = "นคร",
has_neighborhoods = true,
},
["league"] = {
link = true,
fallback = "confederation",
},
["legislative capital"] = {
link = "separately",
fallback = "capital city",
},
["library"] = {
link = true,
fallback = "building",
},
["lieutenancy area"] = {
-- used in the United Kingdom; per Wikipedia:
-- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does
-- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of
-- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate
-- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on
-- those used for lieutenancy and local government between 1974 and 1996. The lieutenancy areas of Northern
-- Ireland correspond to the six counties and two former county boroughs.[3]
link = "w",
fallback = "ceremonial county",
},
["local authority district"] = {
link = "w",
fallback = "local government district",
},
["local government area"] = {
-- Australia
link = "w",
preposition = "ของ",
class = "subpolity",
},
["local council"] = {
-- Malta; similar to municipalities
link = "+w:local councils of Malta",
preposition = "ของ",
fallback = "เทศบาล",
},
["local government district"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local government district with borough status"] = {
link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]",
plural = "local government districts with borough status",
plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local urban district"] = {
link = "w",
fallback = "unincorporated community",
},
["locality"] = {
link = "+w:locality (settlement)",
-- not necessarily true, but usually is the case
fallback = "village",
},
["London borough"] = {
link = "w",
preposition = "ของ",
affix_type = "pref",
affix = "borough",
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["macroregion"] = {
link = true,
fallback = "ภูมิภาค",
},
["man-made structures!"] = {
category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s",
bare_category_parent = "สถานที่",
},
["manor"] = {
-- FIXME: or is this more like a farm?
link = true,
fallback = "building",
},
["marginal sea"] = {
link = true,
preposition = "ของ",
fallback = "ทะเล",
},
["market city"] = {
link = "+market town",
fallback = "นคร",
},
["market town"] = {
link = true,
fallback = "เมือง",
},
["massif"] = {
link = true,
fallback = "ภูเขา",
},
["megacity"] = {
link = true,
fallback = "นคร",
},
["metro station"] = {
link = true,
class = "man-made structure",
},
["metropolitan borough"] = {
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"borough", "นคร"},
fallback = "local government district",
has_neighborhoods = true,
},
["metropolitan city"] = {
-- These exist e.g. in Italy and are more like municipalities or even provinces than cities.
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"metropolitan", "นคร"},
class = "subpolity",
},
["metropolitan county"] = {
link = true,
fallback = "เทศมณฑล",
},
["metropolitan municipality"] = {
-- In South Africa, metropolitan municipalities group local municipalities and are like districts, between
-- provinces and municipalities.
-- In Turkey, metropolitan municipalities are province-level.
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"metropolitan", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["microdistrict"] = {
-- residential complex in post-Soviet states
link = true,
fallback = "neighborhood",
},
["micronations!"] = {
-- FIXME, merge with microstate
category_link = "[[micronation]]s",
bare_category_parent = "ประเทศ",
},
["microstate"] = {
link = true,
fallback = "ประเทศ",
},
["military base"] = {
link = "w",
class = "settlement", -- or "man-made structure"?
default = {true},
},
["minster town"] = {
-- England
link = "separately",
fallback = "เมือง",
},
["monarchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["moor"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"},
default = {true},
},
["moorland"] = {
link = true,
fallback = "moor",
},
["motorway"] = {
link = true,
fallback = "road",
},
["ภูเขา"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["mountain indigenous district"] = {
-- Taiwan
link = "+w:district (Taiwan)",
fallback = "อำเภอ",
},
["mountain indigenous township"] = {
-- Taiwan
link = "+w:township (Taiwan)",
fallback = "township",
},
["mountain pass"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "mountain passes",
class = "natural feature",
addl_bare_category_parents = {"ภูเขา"},
default = {true},
},
["เทือกเขา"] = {
link = true,
fallback = "ภูเขา",
},
["mountainous region"] = {
link = "separately",
fallback = "ภูมิภาค",
},
["mukim"] = {
-- Malaysia, Brunei, Indonesia, Singapore
link = true,
preposition = "ของ",
class = "subpolity",
},
["municipal district"] = {
link = "w",
-- meaning varies depending on the country; for now, assume no neighborhoods.
-- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms.
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "เทศบาล",
},
["เทศบาล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true,
class = "subpolity",
},
["municipality with city status"] = {
link = "[[municipality]] with [[w:city status|city status]]",
plural = "municipalities with city status",
plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]",
fallback = "เทศบาล",
},
["museum"] = {
link = true,
fallback = "building",
},
["mythological location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["named bridges!"] = {
category_link = "notable [[bridge]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"bridges"},
},
["named buildings!"] = {
category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"buildings"},
},
["named roads!"] = {
category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"roads"},
},
["national capital"] = {
link = "w",
fallback = "capital city",
},
["national park"] = {
link = true,
fallback = "park",
},
["natural features!"] = {
category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s",
bare_category_parent = "สถานที่",
},
["neighborhood"] = {
-- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which
-- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the
-- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the
-- category name. It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods".
-- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also
-- categorize as neighbo(u)rhoods.)
link = true,
-- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]].
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
-- The following text is suitable for the top-level description of a neighborhood as well as categories of the
-- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form
-- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]".
category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions",
-- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`,
-- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings
-- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in
-- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY`
-- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but
-- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.)
preposition = "ของ",
class = "non-admin settlement",
cat_handler = district_neighborhood_cat_handler,
},
["neighbourhood"] = {
link = true,
category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions",
fallback = "neighborhood",
},
["new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
preposition = "ใน",
class = "subpolity", --?
},
["new town"] = {
link = true,
fallback = "เมือง",
},
["non-city capital"] = {
link = "[[capital]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
cat_handler = function(data)
return capital_city_cat_handler(data, "non-city")
end,
-- FIXME, do we need the following?
default = {true},
},
["non-metropolitan county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["non-metropolitan district"] = {
link = "w",
fallback = "local government district",
},
["non-sovereign kingdom"] = {
-- especially in Africa and Asia
link = "+w:non-sovereign monarchy",
generic_before_non_cities = "ใน",
class = "subpolity",
["country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["non-sovereign monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["oblasts and autonomous republics!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Ukraine.
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s",
class = "subpolity",
},
["มหาสมุทร"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"ทะเล", "bodies of water"},
default = {true},
},
["okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["overseas collectivity"] = {
link = "w",
fallback = "collectivity",
},
["overseas department"] = {
link = "w",
fallback = "department",
},
["overseas territory"] = {
link = "w",
fallback = "dependent territory",
},
["parish"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["parish municipality"] = {
-- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them.
link = "+w:parish municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true,
},
["parish seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
class = "capital",
has_neighborhoods = true,
},
["park"] = {
link = true,
class = "man-made structure",
default = {true},
},
["pass"] = {
link = "+mountain pass",
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "passes",
fallback = "mountain pass",
},
["path"] = {
link = true,
fallback = "road",
},
["peak"] = {
link = true,
fallback = "ภูเขา",
},
["peninsula"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["periphery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["สถานที่!"] = {
generic_before_non_cities = "ใน",
generic_before_cities = "ใน",
class = "generic place",
category_link = "[[place]]s of all sorts",
-- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and
-- language-specific variants such as [[Category:en:Places]]. The actual text for a language-specific variant is
-- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of"
-- portion is automatically generated by the appropriate handler in
-- [[Module:category tree/topic cat/data/Places]].
category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s",
bare_category_parent = "ชื่อ (หัวข้อ)",
},
["planned community"] = {
-- Include this so we don't categorize 'planned community' into villages, as 'community' does.
link = true,
class = "settlement",
has_neighborhoods = true,
},
["plateau"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
-- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category
},
["Polish colony"] = {
link = "[[w:colony (Poland)|colony]]",
affix_type = "suf",
affix = "colony",
fallback = "village",
has_neighborhoods = true,
},
["political divisions!"] = {
category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s",
bare_category_parent = "สถานที่",
},
["องค์การทางการเมือง"] = {
link = true,
category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]",
class = "polity", -- do not translate class
bare_category_parent = "สถานที่",
default = {true},
},
["populated place"] = {
link = "+w:populated place",
-- not necessarily true, but usually is the case
fallback = "village",
},
["port"] = {
link = true,
class = "man-made structure",
default = {true},
},
["port city"] = {
-- FIXME: should categorize into "Ports" as well as "นคร"
link = true,
fallback = "นคร",
},
["port town"] = {
-- FIXME: should categorize into "Ports" as well as "เมือง"
link = "w",
fallback = "เมือง",
},
["prefecture"] = {
-- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France.
-- May need `has_neighborhoods` to be a function.
link = true,
preposition = "ของ",
display_handler = prefecture_display_handler,
class = "subpolity",
},
["prefecture-level city"] = {
-- China; they are huge entities with a central city; not cities themselves.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["preserved county"] = {
-- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more
-- "principal areas" (styled as "counties" or "county boroughs"), of which there are 22.
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["primary area"] = {
-- a grouping of "districts" (neighborhoods) in Gothenburg, Sweden
link = "+w:sv:primärområde",
fallback = "neighborhood",
},
["principality"] = {
link = true,
fallback = "monarchy",
},
["promontory"] = {
link = true,
fallback = "headland",
},
["protectorate"] = {
link = true,
fallback = "dependent territory",
},
["จังหวัด"] = {
link = true,
preposition = "ของ",
display_handler = province_display_handler,
class = "subpolity",
},
["provinces and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case China.
category_link = "[[province]]s and [[autonomous region]]s",
class = "subpolity",
},
["provinces and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Canada and Pakistan.
category_link = "[[province]]s and [[territory|territories]]",
class = "subpolity",
},
["provincial capital"] = {
link = true,
fallback = "capital city",
},
["raion"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ranch"] = {
link = true,
fallback = "farm",
},
["range"] = {
-- FIXME: Where is this used? Is it a mountain range?
link = true,
holonym_use_the = true,
class = "natural feature",
},
["regency"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["ภูมิภาค"] = {
link = true,
preposition = "ของ",
-- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area'
fallback = "geographic and cultural area",
-- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region
class = "geographic region",
},
["regional capital"] = {
link = "separately",
fallback = "capital city",
},
["regional county municipality"] = {
-- Quebec
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"เทศบาล", "เทศมณฑล"},
fallback = "เทศบาล",
},
["regional district"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "อำเภอ",
},
["regional municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
},
["regional unit"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["registration county"] = {
-- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical
-- purposes (registration of births, deaths and marriages, and for the output of census information).
link = "w",
fallback = "เทศมณฑล",
},
["republic"] = {
-- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case.
link = true,
fallback = "constituent republic",
},
["research base"] = {
link = "+w:research station",
fallback = "research station",
},
["research station"] = {
link = "w",
class = "non-admin settlement", -- or "man-made structure"?
default = {true},
},
["reservoir"] = {
link = true,
fallback = "ทะเลสาบ",
},
["residential area"] = {
link = "separately",
fallback = "neighborhood",
},
["resort city"] = {
link = "w",
fallback = "นคร",
},
["resort town"] = {
link = "w",
fallback = "เมือง",
},
["แม่น้ำ"] = {
link = true,
generic_before_non_cities = "ใน",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
cat_handler = city_type_cat_handler,
["continent/*"] = {true},
default = {true},
},
["river island"] = {
link = "w",
fallback = "เกาะ",
},
["road"] = {
link = true,
class = "man-made structure",
default = {"Named roads"},
},
["Roman province"] = {
-- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire'
link = "w",
default = {"Provinces of the Roman Empire"},
class = "subpolity",
},
["royal borough"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"royal", "borough"},
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["royal burgh"] = {
link = true,
fallback = "borough",
},
["royal capital"] = {
link = "w",
fallback = "capital city",
},
["rural committee"] = {
-- Hong Kong; a group of villages
link = "w",
affix_type = "Suf",
has_neighborhoods = true,
class = "settlement",
},
["rural community"] = {
-- New Brunswick
link = "+w:list of municipalities in New_Brunswick#Rural communities",
fallback = "เทศบาล",
},
["rural hromada"] = {
link = "[[rural]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["rural municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["rural township"] = {
-- Taiwan
link = "+w:rural township (Taiwan)",
fallback = "township",
},
["sanctuary"] = {
link = true,
fallback = "temple",
},
["satrapy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["ทะเล"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["seaport"] = {
link = true,
fallback = "port",
},
["seat"] = {
link = true,
fallback = "administrative centre",
},
["self-administered area"] = {
-- Myanmar (groups self-administered divisions and zones)
link = "+w:self-administered zone",
preposition = "ของ",
class = "subpolity",
},
["self-administered division"] = {
-- Myanmar (only one of them: Wa Self-Administered Division)
link = "w",
fallback = "self-administered area",
},
["self-administered zone"] = {
-- Myanmar (five of them)
link = "w",
fallback = "self-administered area",
},
["separatist state"] = {
link = "separately",
fallback = "unrecognized country",
},
["การตั้งถิ่นฐาน"] = {
link = true,
category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s",
bare_category_parent = "สถานที่",
-- not necessarily true, but usually is the case
fallback = "village",
},
["settlement hromada"] = {
link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["sheading"] = {
-- Isle of Man
link = true,
fallback = "อำเภอ",
},
["sheep station"] = {
-- Australia
link = true,
fallback = "farm",
},
["shire"] = {
link = true,
fallback = "เทศมณฑล",
},
["shire county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["shire town"] = {
link = true,
fallback = "county seat",
},
["ski resort city"] = {
link = "[[ski resort]] [[city]]",
fallback = "นคร",
},
["ski resort town"] = {
link = "[[ski resort]] [[town]]",
fallback = "เมือง",
},
["spa city"] = {
link = "+w:spa town",
fallback = "นคร",
},
["spa town"] = {
link = "w",
fallback = "เมือง",
},
["space station"] = {
link = true,
fallback = "research station",
},
["special administrative region"] = {
-- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a
-- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia
-- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special
-- administrative areas"
link = "+w:special administrative regions of China",
preposition = "ของ",
class = "subpolity",
has_neighborhoods = true, --?
-- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves
-- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing
suffix = "",
},
["special collectivity"] = {
link = "w",
fallback = "collectivity",
},
["special municipality"] = {
-- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands
link = "w",
fallback = "เทศบาล",
},
["special ward"] = {
-- Tokyo
link = true,
fallback = "เทศบาล",
},
["spit"] = {
link = true,
fallback = "peninsula",
},
["spring"] = {
link = true,
class = "natural feature",
default = {true},
},
["star"] = {
link = true,
class = "natural feature",
default = {true},
},
["รัฐ"] = {
link = true,
preposition = "ของ",
class = "subpolity",
-- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign
-- entity. The latter appears more common (e.g. in various "ancient states" of East Asia).
former_type = "องค์การทางการเมือง",
},
["states and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Australia.
category_link = "[[state]]s and [[territory|territories]]",
class = "subpolity",
},
["states and union territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case India.
category_link = "[[state]]s and [[union territory|union territories]]",
class = "subpolity",
},
["state capital"] = {
link = true,
fallback = "capital city",
},
["state park"] = {
link = true,
fallback = "park",
},
["state-level new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
fallback = "new area",
},
["statistical region"] = {
-- Slovenia
link = true,
fallback = "administrative region",
},
["statutory city"] = {
link = "w",
fallback = "นคร",
},
["statutory town"] = {
link = "w",
fallback = "เมือง",
},
["strait"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["stream"] = {
link = true,
fallback = "แม่น้ำ",
},
["street"] = {
link = true,
fallback = "road",
},
["strip"] = {
link = true,
fallback = "geographic region",
},
["strip of land"] = {
link = "[[strip]] of [[land]]",
plural = "strips of land",
plural_link = "[[strip]]s of [[land]]",
fallback = "geographic region",
},
["sub-metropolitan city"] = {
link = "+w:List of cities in Nepal#Sub-metropolitan cities",
fallback = "นคร",
},
["sub-prefectural city"] = {
link = "w",
fallback = "subprovincial city",
},
["ตำบล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true, --?
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler
class = "subpolity",
default = {true},
},
["subdivision"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- FIXME: subdivisions can be neighborhood-like or larger; need a handler
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
},
["submerged ghost town"] = {
-- FIXME: Consider just having "submerged" as a qualifier.
link = "[[submerged]] [[ghost town]]",
fallback = "ghost town",
},
["subnational kingdom"] = {
link = "+w:subnational monarchy",
fallback = "non-sovereign kingdom",
},
["subnational monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["subprefecture"] = {
link = true,
affix_type = "suf",
preposition = "ของ",
class = "subpolity",
},
["subprovince"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["subprovincial city"] = {
link = "w",
-- China; special status given to certain prefecture-level cities
fallback = "prefecture-level city",
},
["subprovincial district"] = {
link = "w",
-- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts
preposition = "ของ",
class = "subpolity",
},
["subregion"] = {
link = true,
fallback = "geographic region",
},
["suburb"] = {
link = true,
-- The following text is suitable for the top-level description of a suburb as well as categories of the form
-- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago',
-- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]".
category_link = "[[suburb]]s of [[city|cities]]",
category_link_before_city = "[[suburb]]s",
-- See comments under "neighborhood" for the following three settings. They are used by
-- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories
-- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a
-- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.)
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
preposition = "ของ",
has_neighborhoods = true, --?
class = "non-admin settlement", --?
cat_handler = district_neighborhood_cat_handler,
},
["suburban area"] = {
link = "w",
fallback = "suburb",
},
["subway station"] = {
link = "w",
fallback = "metro station",
},
["sum"] = {
-- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia),
-- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion).
link = "+w:sum (administrative division)",
-- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler
-- which we don't want to be active (FIXME: If the display handler would be active, that's a bug).
fallback = "division",
},
["supercontinent"] = {
link = true,
fallback = "continent",
},
["tehsil"] = {
link = true,
affix_type = "suf",
no_affix_strings = {"tehsil", "tahsil"},
class = "subpolity",
},
["temple"] = {
link = true,
fallback = "building",
},
["territorial authority"] = {
link = "w",
fallback = "อำเภอ",
},
["ดินแดน"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["theme"] = {
link = "+w:theme (Byzantine district)",
preposition = "ของ",
class = "subpolity",
},
["เมือง"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["town with bystatus"] = {
-- can't use templates in links currently
link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]",
plural = "towns with bystatus",
plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]",
fallback = "เมือง",
},
["township"] = {
link = true,
has_neighborhoods = true,
class = "settlement", --?
default = {true},
},
["township municipality"] = {
-- Quebec
link = "+w:township municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["traditional county"] = {
link = true,
fallback = "เทศมณฑล",
},
["traditional region"] = {
-- FIXME: Verify this works. Same for 'historic(al) region'.
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["trail"] = {
link = true,
fallback = "road",
},
["treaty port"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"FORMER"},
},
["tributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["underground station"] = {
link = "w",
fallback = "metro station",
},
["unincorporated area"] = {
link = "w",
-- I don't know if this fallback makes sense everywhere.
fallback = "unincorporated community",
},
["unincorporated community"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
},
["unincorporated territory"] = {
link = "w",
fallback = "ดินแดน",
},
["union territory"] = {
-- India
link = true,
preposition = "ของ",
entry_placetype_indefinite_article = "a",
class = "subpolity",
},
["unitary authority"] = {
-- UK, New Zealand
link = true,
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["unitary district"] = {
link = "w",
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["united township municipality"] = {
-- Quebec
link = "+w:united township municipality (Quebec)",
entry_placetype_indefinite_article = "a",
fallback = "township municipality",
has_neighborhoods = true, --?
},
["university"] = {
link = true,
entry_placetype_indefinite_article = "a",
class = "man-made structure",
default = {true},
},
["unrecognised country"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized and nearly unrecognized countries!"] = {
category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}",
bare_category_parent = "country-like entities",
},
["unrecognized country"] = {
link = "w",
class = "polity", -- do not translate class
default = {"Unrecognized and nearly unrecognized countries"},
},
["unrecognised state"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized state"] = {
link = "w",
fallback = "unrecognized country",
},
["urban area"] = {
link = "separately",
fallback = "neighborhood",
},
["urban hromada"] = {
link = "[[urban]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["urban service area"] = {
-- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger
-- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]).
link = "w",
fallback = "นคร",
},
["urban township"] = {
link = "w",
fallback = "township",
},
["urban-type settlement"] = {
-- Appears to be a particular type of small urban settlement in post-Soviet states that
-- had an administrative function.
link = "w",
fallback = "เมือง",
},
["valley"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "water"},
default = {true},
},
["viceroyalty"] = {
-- in essence, a type of colony
link = true,
fallback = "dependent territory",
},
["village"] = {
link = true,
generic_before_non_cities = "ใน",
category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s",
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["village development committee"] = {
-- former administrative structure in Nepal; also exists in India but not as a formal unit
link = "+w:village development committee (Nepal)",
inherently_former = {"FORMER"},
fallback = "village",
},
["village municipality"] = {
-- Quebec
link = "+w:village municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["voivodeship"] = {
-- Poland
link = true,
display_handler = voivodeship_display_handler,
preposition = "ของ",
class = "subpolity",
},
["volcano"] = {
link = true,
plural = "volcanoes",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true, "ภูเขา"},
},
["ward"] = {
link = true,
class = "settlement",
-- Wards are formal administrative divisions of a city but have some properties of neighborhoods.
fallback = "neighborhood",
},
["watercourse"] = {
link = true,
fallback = "channel",
},
["Welsh community"] = {
-- Wales
link = "[[w:community (Wales)|community]]",
preposition = "ของ",
affix_type = "suf",
affix = "community",
has_neighborhoods = true,
class = "settlement",
},
["zone"] = {
-- administrative division of Ethiopia, Qatar, Nepal, India
link = "+w:zone#Place names",
preposition = "ของ",
class = "subpolity",
},
----------------------------------------------------------------------------------------------
-- Categories for former places --
----------------------------------------------------------------------------------------------
["ANCIENT capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
-- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still
-- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category.
default = {"Ancient settlements", "Former capitals"},
},
["ANCIENT non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "ANCIENT settlement",
},
["ANCIENT settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Ancient settlements"},
},
["ancient settlements!"] = {
category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]",
bare_category_parent = "former settlements",
},
["FORMER capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
default = {"Former capitals"},
},
["former capitals!"] = {
category_link = "former [[capital]] [[city|cities]] and [[town]]s",
bare_category_parent = "การตั้งถิ่นฐาน",
},
["former counties and county-level cities!"] = {
-- For categorizing former counties and county-level cities of China
category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level cities]]",
bare_category_breadcrumb = "counties and county-level cities",
bare_category_parent = "former political divisions",
},
["FORMER county"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER county-level city"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["former countries and country-like entities!"] = {
category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist",
bare_category_breadcrumb = "countries and country-like entities",
bare_category_parent = "former polities",
},
["FORMER country"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former countries and country-like entities"},
},
["former dependent territories!"] = {
category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist",
bare_category_breadcrumb = "dependent territories",
bare_category_parent = "former political divisions",
},
["FORMER dependent territory"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former dependent territories"},
},
["former districts!"] = {
-- For categorizing former districts of China
category_link = "no-longer-existing [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "former political divisions",
},
["FORMER district"] = {
-- For categorizing former districts of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER geographic region"] = {
link = false,
fallback = "geographic and cultural area",
},
["FORMER man-made structure"] = {
link = false,
class = "man-made structure",
default = {"Former man-made structures"},
},
["former man-made structures!"] = {
category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist",
bare_category_breadcrumb = "man-made structures",
bare_category_parent = "former places",
},
["former municipalities!"] = {
-- For categorizing former municipalities of the Netherlands
category_link = "no-longer-existing [[municipality|municipalities]]",
bare_category_breadcrumb = "เทศบาล",
bare_category_parent = "former political divisions",
},
["FORMER municipality"] = {
-- For categorizing former municipalities of the Netherlands
link = false,
fallback = "FORMER subpolity",
},
["FORMER natural feature"] = {
link = false,
class = "natural feature",
default = {"Former natural features"},
},
["former natural features!"] = {
category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist",
bare_category_breadcrumb = "natural features",
bare_category_parent = "former places",
},
["FORMER non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "FORMER settlement",
},
["former places!"] = {
category_link = "[[place]]s of all sorts that no longer exist",
bare_category_breadcrumb = "former",
bare_category_parent = "สถานที่",
},
["former political divisions!"] = {
category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former places",
},
["former polities!"] = {
category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former places",
},
["FORMER polity"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former polities"},
},
["former prefectures!"] = {
-- For categorizing former prefectures of China
category_link = "no-longer-existing [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "former political divisions",
},
["FORMER prefecture"] = {
-- For categorizing former prefectures of China
link = false,
fallback = "FORMER subpolity",
},
["former provinces!"] = {
-- For categorizing former provinces of China, etc.
category_link = "no-longer-existing [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "former political divisions",
},
["FORMER province"] = {
-- For categorizing ancient/historical/former provinces of the Roman Empire
link = false,
fallback = "FORMER subpolity",
},
["former region"] = {
-- A former region is considered a former political division, but not a 'historical/traditional/etc.' region.
link = "separately",
preposition = "ของ",
inherently_former = {"FORMER"},
class = "subpolity",
},
["FORMER settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Former settlements"},
},
["former settlements!"] = {
category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former political divisions",
},
["FORMER subpolity"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former political divisions"},
},
----------------------------------------------------------------------------------------------
-- form-of categories --
----------------------------------------------------------------------------------------------
---------- Abbreviations ----------
["abbreviations of counties!"] = {
-- For categorizing abbreviations of counties of e.g. England
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]",
bare_category_breadcrumb = "เทศมณฑล",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of places",
},
["abbreviations of departments!"] = {
-- For categorizing abbreviations of departments of e.g. France
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s",
bare_category_breadcrumb = "departments",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of districts!"] = {
-- For categorizing abbreviations of districts of e.g. ???
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of divisions!"] = {
-- For categorizing abbreviations of divisions of e.g. Bangladesh
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s",
bare_category_breadcrumb = "divisions",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of former countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of former places",
},
["abbreviations of former places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}},
},
["abbreviations of places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "สถานที่",
},
["abbreviations of political divisions!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "abbreviations of places",
},
["abbreviations of prefectures!"] = {
-- For categorizing abbreviations of prefectures of e.g. Japan
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces!"] = {
-- For categorizing abbreviations of provinces of e.g. Canada
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]",
bare_category_breadcrumb = "provinces and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of regions!"] = {
-- For categorizing abbreviations of regions of e.g. Italy
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s",
bare_category_breadcrumb = "ภูมิภาค",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states!"] = {
-- For categorizing abbreviations of states of e.g. the United States
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]",
bare_category_breadcrumb = "states and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and union territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]",
bare_category_breadcrumb = "states and union territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]",
bare_category_breadcrumb = "ดินแดน",
bare_category_parent = "abbreviations of political divisions",
},
["ABBREVIATION_OF country"] = {
link = false,
default = {"Abbreviations of countries"},
},
["ABBREVIATION_OF county"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF department"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF district"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF division"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF FORMER country"] = {
link = false,
default = {"Abbreviations of former countries"},
},
["ABBREVIATION_OF FORMER place"] = {
link = false,
default = {"Abbreviations of former places"},
},
["ABBREVIATION_OF place"] = {
link = false,
default = {"Abbreviations of places"},
},
["ABBREVIATION_OF prefecture"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF province"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF region"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF state"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF subpolity"] = {
link = false,
default = {"Abbreviations of political divisions"},
},
["ABBREVIATION_OF territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF union territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
---------- Archaic forms ----------
["archaic forms of places!"] = {
full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "archaic forms",
bare_category_parent = "สถานที่",
},
["ARCHAIC_FORM_OF place"] = {
link = false,
default = {"Archaic forms of places"},
},
---------- Clippings ----------
["clippings of places!"] = {
full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "clippings",
bare_category_parent = "สถานที่",
},
["CLIPPING_OF place"] = {
link = false,
default = {"Clippings of places"},
},
---------- Dated forms ----------
["dated forms of places!"] = {
full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "dated forms",
bare_category_parent = "สถานที่",
},
["DATED_FORM_OF place"] = {
link = false,
default = {"Dated forms of places"},
},
---------- Derogatory names ----------
["derogatory names for cities!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["derogatory names for continents!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for continents"},
},
["derogatory names for countries!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for countries"},
},
["derogatory names for places!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s",
bare_category_breadcrumb = "derogatory names",
bare_category_parent = "nicknames for places",
},
["derogatory names for states!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for states"},
},
["DEROGATORY_NAME_FOR capital"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR city"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR continent"] = {
link = false,
default = {"Derogatory names for continents"},
},
["DEROGATORY_NAME_FOR country"] = {
link = false,
default = {"Derogatory names for countries"},
},
["DEROGATORY_NAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR place"] = {
link = false,
default = {"Derogatory names for places"},
},
["DEROGATORY_NAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR state"] = {
link = false,
default = {"Derogatory names for states"},
},
["DEROGATORY_NAME_FOR town"] = {
link = false,
default = {"Derogatory names for cities"},
},
---------- Ellipses ----------
["ellipses of places!"] = {
full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s",
bare_category_breadcrumb = "ellipses",
bare_category_parent = "สถานที่",
},
["ELLIPSIS_OF place"] = {
link = false,
default = {"Ellipses of places"},
},
---------- Former long-form names ----------
["former long-form names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former long-form names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}},
},
["former long-form names of places!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form",
bare_category_parent = "former names of places",
},
["FORMER_LONG_FORM_OF country"] = {
link = false,
default = {"Former long-form names of countries"},
},
["FORMER_LONG_FORM_OF place"] = {
link = false,
default = {"Former long-form names of places"},
},
---------- Former names ----------
["former names of capitals!"] = {
full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name",
bare_category_breadcrumb = "capitals",
bare_category_parent = "former names of settlements",
},
["former names of countries!"] = {
full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former names of places",
},
["former names of places!"] = {
full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name",
bare_category_breadcrumb = "former names",
bare_category_parent = "สถานที่",
},
["former names of political divisions!"] = {
full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) that generally still exist but under a different name",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former names of places",
},
["former names of polities!"] = {
full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former names of places",
},
["former names of settlements!"] = {
full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former names of political divisions",
},
["FORMER_NAME_OF capital"] = {
link = false,
default = {"Former names of capitals"},
},
["FORMER_NAME_OF country"] = {
link = false,
default = {"Former names of countries"},
},
["FORMER_NAME_OF place"] = {
link = false,
default = {"Former names of places"},
},
["FORMER_NAME_OF polity"] = {
link = false,
default = {"Former names of polities"},
},
["FORMER_NAME_OF region"] = {
link = false,
fallback = "FORMER_NAME_OF subpolity",
},
["FORMER_NAME_OF settlement"] = {
link = false,
default = {"Former names of settlements"},
},
["FORMER_NAME_OF subpolity"] = {
link = false,
default = {"Former names of political divisions"},
},
---------- Former nicknames ----------
["former nicknames for cities!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})",
bare_category_breadcrumb = "นคร",
bare_category_parent = "former nicknames for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["former nicknames for places!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "former",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}},
},
["FORMER_NICKNAME_FOR capital"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR city"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR place"] = {
link = false,
default = {"Former nicknames for places"},
},
["FORMER_NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR town"] = {
link = false,
default = {"Former nicknames for cities"},
},
---------- Former official names ----------
["former official names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former official names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "official"}},
},
["former official names of places!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "former names of places",
},
["FORMER_OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Former official names of countries"},
},
["FORMER_OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Former official names of places"},
},
---------- Long-form names ----------
["long-form names of countries!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "long-form names of places",
},
["long-form names of places!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form names",
bare_category_parent = "สถานที่",
},
["LONG_FORM_OF country"] = {
link = false,
default = {"Long-form names of countries"},
},
["LONG_FORM_OF place"] = {
link = false,
default = {"Long-form names of places"},
},
---------- Nicknames ----------
["nicknames for cities!"] = {
full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"นคร"},
},
["nicknames for continents!"] = {
full_category_link = "[[nickname]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ทวีป"},
},
["nicknames for countries!"] = {
full_category_link = "[[nickname]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ประเทศ"},
},
["nicknames for places!"] = {
full_category_link = "[[nickname]]s for [[place]]s",
bare_category_breadcrumb = "สถานที่",
bare_category_parent = "nicknames",
addl_bare_category_parents = {"สถานที่"},
},
["nicknames for states!"] = {
-- For categorizing nicknames for states of e.g. the United States
full_category_link = "[[nickname]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"รัฐ"},
},
["NICKNAME_FOR capital"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR city"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR continent"] = {
link = false,
default = {"Nicknames for continents"},
},
["NICKNAME_FOR country"] = {
link = false,
default = {"Nicknames for countries"},
},
["NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR place"] = {
link = false,
default = {"Nicknames for places"},
},
["NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR state"] = {
link = false,
default = {"Nicknames for states"},
},
["NICKNAME_FOR town"] = {
link = false,
default = {"Nicknames for cities"},
},
---------- Obsolete forms ----------
["obsolete forms of places!"] = {
full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "obsolete forms",
bare_category_parent = "สถานที่",
},
["OBSOLETE_FORM_OF place"] = {
link = false,
default = {"Obsolete forms of places"},
},
---------- Official names ----------
["official names of countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of places",
},
["official names of former countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of former places",
},
["official names of former places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "official names",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "official names of places", sort = "former"}},
},
["official names of places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official names",
bare_category_parent = "สถานที่",
},
["OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Official names of countries"},
},
["OFFICIAL_NAME_OF FORMER country"] = {
link = false,
default = {"Official names of former countries"},
},
["OFFICIAL_NAME_OF FORMER place"] = {
link = false,
default = {"Official names of former places"},
},
["OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Official names of places"},
},
---------- Official nicknames ----------
["official nicknames for places!"] = {
full_category_link = "[[official]] [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for places",
},
["official nicknames for states!"] = {
-- For categorizing official nicknames for states of e.g. the United States
full_category_link = "[[official]] [[nickname]]s for [[state]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for states",
addl_bare_category_parents = {"รัฐ"},
},
["OFFICIAL_NICKNAME_FOR place"] = {
link = false,
default = {"Official nicknames for places"},
},
["OFFICIAL_NICKNAME_FOR state"] = {
link = false,
default = {"Official nicknames for states"},
},
}
export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
if spec.plural then
export.plural_placetype_to_singular[spec.plural] = sg_placetype
end
end
return export
c2rgy85gws88642kx3n64pxzd5hvu9l
5720688
5720686
2026-04-21T01:23:31Z
OctraBot
3198
5720688
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true for testing
local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split
--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must
currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising"
operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
containers of the known location being considered. This is done to prevent overcategorizing when either there are two
known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).
Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a
result are candidates for memoization to speed up the operation of {{tl|place}}.
]==]
------------------------------------------------------------------------------------------
-- Basic utilities --
------------------------------------------------------------------------------------------
--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
return export.force_cat or m_locations.force_cat
end
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
function export.remove_links_and_html(text)
text = m_links.remove_links(text)
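-- Parenthesize so that only the cleaned string is returned, not gsub's second return value (the match count).
return (text:gsub("<.->", ""))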
end
--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in
[[Module:en-utilities]] is invoked, which reverses the y -> ies change after consonants and the 'es' addition after sh/ch/x,
and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider
changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is
returned.
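The table-first lookup described above can be sketched as follows. This is only an illustration: `placetype_data` here is a tiny hypothetical stand-in for the real table, and `naive_singularize` is an assumed placeholder for the singularizer in [[Module:en-utilities]] that merely strips a final `s`.
```lua
-- Hypothetical sample data; the real entries live in `export.placetype_data`.
local placetype_data = {
	kibbutz = {plural = "kibbutzim"},
	town = {},
}
-- Build the plural -> singular map from the `plural` fields.
local plural_to_singular = {}
for sg, spec in pairs(placetype_data) do
	if spec.plural then
		plural_to_singular[spec.plural] = sg
	end
end
-- Assumed stand-in for the regular singularizer: strips one final "s".
local function naive_singularize(word)
	return (word:gsub("s$", ""))
end
local function maybe_singularize(placetype)
	if not placetype then
		return nil
	end
	-- Irregular plurals from the table win over the regular algorithm.
	local sg = plural_to_singular[placetype] or naive_singularize(placetype)
	if sg == placetype then
		return nil -- not a plural
	end
	return sg
end
```
With this sample data, `maybe_singularize("kibbutzim")` yields `"kibbutz"`, `maybe_singularize("towns")` yields `"town"`, and `maybe_singularize("town")` yields nil.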
]==]
function export.maybe_singularize_placetype(placetype)
if not placetype then
return nil
end
if export.plural_placetype_to_singular[placetype] then
return export.plural_placetype_to_singular[placetype]
end
local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
if retval == placetype then
return nil
end
return retval
end
-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct.
function export.pluralize_placetype(placetype, do_ucfirst)
local ptdata = export.placetype_data[placetype]
if ptdata and ptdata.plural then
placetype = ptdata.plural
else
placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
end
if do_ucfirst then
return ucfirst(placetype)
else
return placetype
end
end
--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype
under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not
match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match
that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same
as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed
from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version
of the plural passed-in `placetype`.
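As a rough illustration of the three match types, the following self-contained sketch mirrors the lookup order using hypothetical sample tables (`placetype_data` and `plural_to_singular` below are assumptions for the example, not the real data).
```lua
-- Hypothetical sample data for illustration only.
local placetype_data = {
	["country"] = {},
	["nicknames for cities!"] = {},
	["kibbutz"] = {plural = "kibbutzim"},
}
local plural_to_singular = {kibbutzim = "kibbutz"}
local function get_placetype_data(placetype, from_category)
	-- 1. Exact key.
	local ptdata = placetype_data[placetype]
	if ptdata then
		return placetype, ptdata, "direct"
	end
	-- 2. Category-only key, formed by appending "!".
	if from_category then
		ptdata = placetype_data[placetype .. "!"]
		if ptdata then
			return placetype .. "!", ptdata, "direct-category"
		end
	end
	-- 3. Singularized key.
	local sg = plural_to_singular[placetype]
	if sg and placetype_data[sg] then
		return sg, placetype_data[sg], "plural"
	end
	return nil
end
```
Here `get_placetype_data("nicknames for cities", true)` returns the key `"nicknames for cities!"` with match type `"direct-category"`, whereas the same call without `from_category` returns nil.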
]==]
function export.get_placetype_data(placetype, from_category)
local ptdata = export.placetype_data[placetype]
if ptdata then
return placetype, ptdata, "direct"
end
if from_category then
ptdata = export.placetype_data[placetype .. "!"]
if ptdata then
return placetype .. "!", ptdata, "direct-category"
end
end
local sg_placetype = export.maybe_singularize_placetype(placetype)
if sg_placetype then
ptdata = export.placetype_data[sg_placetype]
if ptdata then
return sg_placetype, ptdata, "plural"
end
end
return nil
end
--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end
function export.resolve_placetype_aliases(placetype)
return export.placetype_aliases[placetype] or placetype
end
--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
-- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in
-- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice.
placetype = export.resolve_placetype_aliases(placetype)
if export.placetype_data[placetype] then
return export.placetype_data[placetype][key]
else
return nil
end
end
--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from
left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization
of qualifiers does not happen if `no_canon_qualifiers` is specified.
For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
{nil, nil, "small beachside unincorporated community"},
{nil, "small", "beachside unincorporated community"},
{"small", "[[beachfront]]", "unincorporated community"},
{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
On the other hand, if given `"small former haunted community"`, the return value will be
{ {
{nil, nil, "small former haunted community"},
{nil, "small", "former haunted community"},
{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.
Finally, if given `"former adr"`, the return value will be
{ {
{nil, nil, "former adr"},
{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
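The examples above can be reproduced with the following stand-alone sketch. The qualifier and alias tables are hypothetical miniatures chosen to match the examples (the real entries are in `placetype_qualifiers` and `placetype_aliases`), and the `no_canon_qualifiers` option is omitted for brevity.
```lua
-- Hypothetical miniature tables: false = recognized but left as-is, true = wrap in a link,
-- string = replace with that text.
local placetype_qualifiers = {
	small = false,
	former = false,
	beachside = "[[beachfront]]",
	unincorporated = true,
}
local placetype_aliases = {adr = "administrative region"}
local function resolve(placetype)
	return placetype_aliases[placetype] or placetype
end
local function split_qualifiers(placetype)
	local splits = {{nil, nil, resolve(placetype)}}
	local prev
	while true do
		-- Split at the first space: leftmost word vs. the remainder.
		local qualifier, rest = placetype:match("^(.-) (.*)$")
		if not qualifier then
			break
		end
		local canon = placetype_qualifiers[qualifier]
		if canon == nil then
			break -- unrecognized qualifier: stop splitting
		end
		local new_qualifier = qualifier
		if canon == true then
			new_qualifier = "[[" .. qualifier .. "]]"
		elseif canon ~= false then
			new_qualifier = canon
		end
		splits[#splits + 1] = {prev, new_qualifier, resolve(rest)}
		prev = prev and (prev .. " " .. new_qualifier) or new_qualifier
		placetype = rest
	end
	return splits
end
```
Calling `split_qualifiers("former adr")` gives two splits, the second being `{nil, "former", "administrative region"}`.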
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
local prev_qualifier = nil
while true do
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if canon == nil then
break
end
local new_qualifier = qualifier
if type(canon) == "table" then
canon = canon.link
end
if not no_canon_qualifiers and canon ~= false then
if canon == true then
new_qualifier = "[[" .. qualifier .. "]]"
else
new_qualifier = canon
end
end
insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
placetype = reduced_placetype
else
break
end
end
return splits
end
--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the
placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list
of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e. we successively chop off
qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words
not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used
to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is
an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first
entry if it exists in `placetype_data`.
'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c)
"type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of
directives, which act somewhat similarly to `former`, and allows interaction between more than one of these
simultaneously (e.g. official names of former places, which have their own categorization).
If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be
getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of
`iterate_matching_holonym_location()`.
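A minimal sketch of what such memoization could look like is given below. It assumes a function of a single string argument; the real function also takes a `props` table, so an actual cache key would have to encode those settings too. The helper name `memoize` is hypothetical.
```lua
-- Wrap `fn` (a function of one string argument) with a result cache.
-- Note: nil results are not cached and are recomputed on each call.
local function memoize(fn)
	local cache = {}
	return function(key)
		local cached = cache[key]
		if cached == nil then
			cached = fn(key)
			cache[key] = cached
		end
		return cached
	end
end
```
Because Scribunto modules are re-run for each page parse, a cache like this would only save work within a single page, which is the case described here.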
For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="แม่น้ำ"}
```
and the return value will be
{ {
{qualifier = "left", placetype="tributary"},
{qualifier = "left", placetype="แม่น้ำ"},
}}
The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized
placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it
would add it as well as any fallbacks directly after it). It then splits off the recognized qualifier `left` to form the
''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because
it has a fallback `river`, which exists in `placetype_data`, the fallback is entered next.
Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or municipality,
often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
{qualifier = "small [[rural]]", placetype="frazione"},
{qualifier = "small [[rural]]", placetype="hamlet"},
{qualifier = "small [[rural]]", placetype="village"},
}}
Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers,
they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around
`rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both
fallbacks end up being returned.
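The fallback part of this process, taken in isolation, amounts to following `fallback` links until none remain. The sketch below shows just that step, with a hypothetical three-entry table; the real function additionally handles qualifiers, plurals and form-of directives, and the `seen` guard against cycles is an addition made here for safety.
```lua
-- Hypothetical sample entries matching the example above.
local placetype_data = {
	frazione = {fallback = "hamlet"},
	hamlet = {fallback = "village"},
	village = {},
}
-- Return the placetype followed by its chain of fallbacks, in order.
local function fallback_chain(placetype)
	local chain = {}
	local seen = {}
	while placetype and placetype_data[placetype] and not seen[placetype] do
		seen[placetype] = true
		chain[#chain + 1] = placetype
		placetype = placetype_data[placetype].fallback
	end
	return chain
end
```
For `"frazione"` this yields `frazione`, `hamlet`, `village`, in that order; for a placetype with no entry it yields an empty list.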
`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the
placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are
returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example,
`"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When
`no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"`
with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under
[[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].)
As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`,
because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes
rarely occur with exact match category specs anyway.
`no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an
equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in
[[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't
have qualifiers and so it doesn't make sense to try and look for them.
`from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked.
`form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked
placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the
appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a
placetype, e.g. `FORMER_NAME_OF place`, if nothing else matches.
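As a simplified sketch of the final lookup, the following resolves a directive-prefixed placetype to its `default` categories, following `fallback` links and ending at the generic `place` entry. The three entries are copied from `placetype_data`; the helper `form_of_categories` is hypothetical and skips the directive-specific type and class handling described above.
```lua
local data = {
	["FORMER_NAME_OF region"] = {fallback = "FORMER_NAME_OF subpolity"},
	["FORMER_NAME_OF subpolity"] = {default = {"Former names of political divisions"}},
	["FORMER_NAME_OF place"] = {default = {"Former names of places"}},
}
local function form_of_categories(directive, placetype)
	local key = directive .. " " .. placetype
	while key do
		local spec = data[key]
		if not spec then
			break
		end
		if spec.default then
			return spec.default
		end
		key = spec.fallback
	end
	-- Nothing matched: use the generic `place` entry for this directive.
	return data[directive .. " place"].default
end
```
With these entries, `region` resolves through `subpolity` to `Former names of political divisions`, while an unlisted placetype such as `lake` ends up at `Former names of places`.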
`no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`.
`register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g.
known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It
causes the non-former version of the specified placetype to be included in the returned equivalents along with the
former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now;
fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.]
]==]
function export.get_placetype_equivs(placetype, props)
local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former
local form_of_directive
if props then
no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former =
props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category,
props.register_former_as_non_former
form_of_directive = props.form_of_directive
end
local equivs = {}
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is
-- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If
-- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by
-- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version
-- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as
-- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into
-- `equivs` the prefixed placetype. This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}},
-- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up
-- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]].
local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix)
local function insert_equiv(pt)
if form_of_prefix then
-- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have
-- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end
-- up processing because `island country` falls back to `country`), and that entry in turn is defined
-- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of
-- handling this is by calling ourselves recursively.
insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt)
else
insert(equivs, {qualifier=qualifier, placetype=pt})
end
end
-- Insert the placetype, along with any fallbacks.
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if ptdata then
insert_equiv(canon_placetype)
if no_fallback then
return
end
local first_placetype = #equivs + 1
local prev_placetype = nil
while true do
local pt_value = export.placetype_data[canon_placetype]
if not pt_value then
internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
canon_placetype, prev_placetype)
end
if pt_value.fallback then
insert_equiv(pt_value.fallback)
local last_placetype = #equivs
if last_placetype - first_placetype >= 10 then
local fallback_loop = {}
for i = first_placetype, last_placetype do
insert(fallback_loop, equivs[i].placetype)
end
internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
end
prev_placetype = canon_placetype
canon_placetype = pt_value.fallback
else
break
end
end
end
end
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
-- than in each invocation of `process_and_insert_placetype()`.
local function process_and_insert_placetype(qualifier, reduced_placetype)
if form_of_directive then
-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. All of this is handled by
-- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of
-- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for
-- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.)
insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive)
if not no_fallback then
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype)
local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or
export.get_placetype_prop(pt, "class") end
)
if not directive_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " ..
'located but directive-specific type property %s missing, and so is "class"; ' ..
"placetypes searched are %s", reduced_placetype, form_of_directive,
form_of_directive .. "_type", reduced_placetype_equivs)
else
-- This should be allowed, as we allow unrecognized placetypes in general.
end
elseif directive_type ~= "!" then
insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive)
end
end
else
insert_placetype_and_fallbacks(qualifier, reduced_placetype)
end
end
-- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left
-- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers).
local splits
if no_split_qualifiers then
splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
else
splits = export.split_qualifiers_from_placetype(placetype)
end
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3)
-- If a special "former" qualifier like `former` or `historical` isn't present, and
-- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for
-- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing
-- placetypes, and handle accordingly.
local unlinked_this_qualifier
if this_qualifier and this_qualifier:find("%[") then
unlinked_this_qualifier = export.remove_links_and_html(this_qualifier)
else
unlinked_this_qualifier = this_qualifier
end
local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil
if not former_qualifiers and not no_check_for_inherently_former then
former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype,
function(pt) return export.get_placetype_prop(pt, "inherently_former") end,
{no_check_for_inherently_former = true})
end
-- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal
-- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
-- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped
-- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval`
-- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes
-- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. If the placetype following the
-- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a
-- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like
-- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and
-- don't enter anything into `equivs`.
if former_qualifiers then
-- FIXME: Should we respect `no_fallback` here? My instinct says no.
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, {
no_check_for_inherently_former = true
})
local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, "former_type") or
export.get_placetype_prop(pt, "class") end
)
if not former_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s, placetype data located but `former_type` missing; " ..
"placetypes searched are %s", reduced_placetype, reduced_placetype_equivs)
else
-- Enable error when we've verified there aren't any examples.
track("bad-former-placetype")
track("bad-former-placetype/" .. reduced_placetype)
--process_error("For placetype '%s', unrecognized placetype following 'former'-type " ..
-- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs))
end
elseif former_type ~= "!" then
-- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible
-- for (e.g.) former provinces of the Roman empire to be categorized specially.
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype)
end
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. former_type)
end
-- HACK! See explanation above for `register_former_as_non_former`.
if register_former_as_non_former then
process_and_insert_placetype(prev_qualifier, reduced_placetype)
end
-- If we're processing a form-of directive, after doing everything else we do
-- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup.
if form_of_directive and not no_fallback then
for _, former_qualifier in ipairs(former_qualifiers) do
insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier ..
" place")
end
end
-- Don't continue processing equivs. The reason is probably the same as the `break` below for
-- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and
-- non-former equivs will otherwise take precedence.
break
end
end
-- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs
-- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping.
if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then
insert(equivs, {
qualifier=prev_qualifier,
placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier]
})
-- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the
-- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the
-- latter ends up generating the category because the category for 'mythological location' is set as
-- the default value, which is used only when no non-default category can be found.
break
end
-- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined
-- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype.
-- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts
-- the full placetype into `equivs`.
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
process_and_insert_placetype(qualifier, reduced_placetype)
-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
if no_fallback then
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
if canon_placetype then
break
end
end
end
-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
-- [[Category:en:Former names of places]] in an invocation like
-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
end
return equivs
end
function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
for _, equiv in ipairs(equivs) do
local retval = fun(equiv.placetype)
if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
return retval, equiv
end
end
return nil, nil
end
--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a
non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the
equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or
non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil},
the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value.
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
if not placetype then
return fun(nil), nil
end
return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
props and props.continue_on_nil_only)
end
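--[=[
The difference between the default (falsy) check and `continue_on_nil_only` can be sketched with a standalone copy of
the search loop. This is only an illustration: the `equivs` list and the `props` lookup table are made up for the
example and are not taken from `placetype_data`.

```lua
-- Standalone copy of the search loop, for illustration only.
local function prop_from_equivs(equivs, fun, continue_on_nil_only)
	for _, equiv in ipairs(equivs) do
		local retval = fun(equiv.placetype)
		if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
			return retval, equiv
		end
	end
	return nil, nil
end

-- Hypothetical data: the first equivalent explicitly sets the property to `false`.
local equivs = {{placetype = "island country"}, {placetype = "country"}}
local props = {["island country"] = false, ["country"] = "of"}
local function lookup(pt) return props[pt] end

-- Default: `false` is skipped and the search continues to "country".
local v1, e1 = prop_from_equivs(equivs, lookup)
-- With `continue_on_nil_only`: the explicit `false` stops the search.
local v2, e2 = prop_from_equivs(equivs, lookup, true)
print(v1, e1.placetype)
print(v2, e2.placetype)
```

With the default check, an explicit `false` on a more specific placetype cannot override a value on a fallback; with
`continue_on_nil_only` it can, which is presumably why the FIXME above suggests making it the default.
]=]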
--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
`"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
the beginning) has an entry in `placetype_data` that specifies the indefinite article using
`entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, the standard algorithm in [[Module:en-utilities]] would generate `"an"` for words beginning with a vowel
and `"a"` otherwise; in this copy of the module that call is commented out and an empty string is used instead.
If `ucfirst` is true, the first letter of the article is made upper-case.
]==]
function export.get_placetype_article(placetype, ucfirst)
local art
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if type(canon) == "table" then
art = canon.article
end
end
if art == false then
return art
end
if art == nil then
local placetype_use_the = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end)
if placetype_use_the then
art = "the"
else
art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article")
if not art then
art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] ""
end
end
end
if ucfirst then
art = m_strutils.ucfirst(art)
end
return art
end
--[==[
Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories
(e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified.
]==]
function export.get_placetype_entry_preposition(placetype)
local pt_prep = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "preposition") end
)
return pt_prep or "ใน"
end
--[==[
Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's
`holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding
to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's
`holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the
end of the value's list.
]==]
function export.key_holonym_into_place_desc(place_desc, holonym)
if not holonym.placetype then
return
end
-- Key in equivalent placetypes, so that e.g. `cities/San Francisco` gets keyed under `city`; but don't do
-- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms
-- of different types just because they have the same fallback.
local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true})
local unlinked_placename = holonym.unlinked_placename
for _, equiv in ipairs(equiv_placetypes) do
local placetype = equiv.placetype
if not place_desc.holonyms_by_placetype then
place_desc.holonyms_by_placetype = {}
end
if not place_desc.holonyms_by_placetype[placetype] then
place_desc.holonyms_by_placetype[placetype] = {unlinked_placename}
else
insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename)
end
end
end
--[=[
Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the
placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This
will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to
whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype
data structure for the placetype, and `from_category` indicates that we are generating the description of a category
(otherwise we are generating the display form of an entry placetype).
]=]
local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror)
if not from_category and ptdata.disallow_in_entries then
if noerror then
return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]"
else
process_error("Placetype %s is not meant to be specified directly: " .. ptdata.disallow_in_entries, sg_placetype)
end
end
if link == nil then
internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype)
elseif link == true then
if orig_placetype then
return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
else
return ("[[%s]]"):format(sg_placetype)
end
elseif link == false then
process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype)
elseif link == "w" then
return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
elseif link == "separately" then
if orig_placetype then
local sg_words = split(sg_placetype, " ")
local orig_words = split(orig_placetype, " ")
if #sg_words ~= #orig_words then
internal_error("Can't construct 'separately' link for plural placetype %s as singular placetype %s " ..
"has different number of words", orig_placetype, sg_placetype)
else
for i = 1, #sg_words do
if sg_words[i] == orig_words[i] then
sg_words[i] = ("[[%s]]"):format(sg_words[i])
else
sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
end
end
return concat(sg_words, " ")
end
else
return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
end
elseif link:find("^%+") then
link = link:sub(2) -- discard initial +
return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype)
elseif not orig_placetype then
return link
else
return --[[require(en_utilities_module).pluralize(link)]] link
end
end
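--[=[
The `"separately"` link mode handled above can be sketched in isolation. This is a minimal standalone illustration,
not the module's code: `split_words` is a hypothetical stand-in for the module's `split` helper, and
`link_separately` is a made-up name covering only the `link == "separately"` branch.

```lua
-- Hypothetical stand-in for the module's `split` helper: split on single spaces.
local function split_words(s)
	local words = {}
	for w in s:gmatch("[^ ]+") do
		words[#words + 1] = w
	end
	return words
end

-- Link each word of the singular placetype separately; if a plural form is
-- given, display the plural word but link to the singular one.
local function link_separately(sg_placetype, orig_placetype)
	if not orig_placetype then
		-- Parentheses drop the second return value (the match count) of gsub.
		return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
	end
	local sg_words = split_words(sg_placetype)
	local orig_words = split_words(orig_placetype)
	assert(#sg_words == #orig_words, "singular and plural must have the same number of words")
	for i = 1, #sg_words do
		if sg_words[i] == orig_words[i] then
			sg_words[i] = ("[[%s]]"):format(sg_words[i])
		else
			sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
		end
	end
	return table.concat(sg_words, " ")
end

print(link_separately("union territory"))
print(link_separately("union territory", "union territories"))
```

The first call yields `[[union]] [[territory]]`; the second yields `[[union]] [[territory|territories]]`, so only the
word that actually differs in the plural gets a piped link.
]=]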
--[==[
Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the
plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying
as plural but linked to the singular form of the placetype). Otherwise, return nil. If we're generating the description
of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like
[[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or
`"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description
for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with
special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the
"full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is
prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be
disallowed.
]==]
function export.get_placetype_display_form(placetype, category_type, return_full, noerror)
local from_category = not not category_type
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if canon_placetype then
local raw_link
local function is_linked_string(str)
return type(str) == "string" and str:find("%[%[")
end
if category_type then
local fetched_full
local function fetch_maybe_full(prop)
local retval = ptdata["full_" .. prop]
if retval ~= nil then
if return_full then
return retval, true
else
internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval)
end
end
return ptdata[prop], false
end
local function maybe_prefix(str)
if return_full and not fetched_full then
return "names of " .. str
else
return str
end
end
-- Careful with `false` as possible value.
if category_type == "top-level" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
elseif category_type == "noncity" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
elseif category_type == "city" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
else
internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
category_type)
end
if type(raw_link) == "string" then
return maybe_prefix(raw_link), ptdata
elseif raw_link ~= nil then
return raw_link, ptdata
end
raw_link, fetched_full = fetch_maybe_full("category_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
if ptmatch == "plural" then
raw_link, fetched_full = fetch_maybe_full("plural_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
end
if raw_link == nil then
raw_link, fetched_full = fetch_maybe_full("link")
end
if raw_link == false then
return raw_link, ptdata
end
return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
else
if ptmatch == "plural" then
raw_link = ptdata.plural_link
if raw_link == false then
process_error("Placetype %s cannot appear plural", placetype)
end
if is_linked_string(raw_link) then
return raw_link, ptdata
end
end
if raw_link == nil then
raw_link = ptdata.link
end
return make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata
end
end
return nil
end
local function resolve_unlinked_placename_display_aliases(placetype, placename)
local equiv_placetypes = export.get_placetype_equivs(placetype)
for i, equiv in ipairs(equiv_placetypes) do
equiv_placetypes[i] = equiv.placetype
end
local all_display_aliases_found = {}
local all_others_found = {}
for group, key, spec in m_locations.iterate_matching_location {
placetypes = equiv_placetypes,
placename = placename,
alias_resolution = "display",
} do
if spec.alias_of and spec.display then
insert(all_display_aliases_found, {group, key, spec, spec.display_as_full})
else
insert(all_others_found, {group, key, spec})
end
end
if not all_display_aliases_found[1] then
return placename
elseif all_display_aliases_found[2] then
internal_error("Found multiple matching display aliases for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
elseif all_others_found[1] then
internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
else
local group, key, spec, as_full = unpack(all_display_aliases_found[1])
local full, elliptical = m_locations.key_to_placename(group, key)
return as_full and full or elliptical
end
end
--[==[
If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged.
Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`,
`country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as
`United States`.
'''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they
should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed.
For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to
`Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political
connotations. Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!)
to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two
terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to
`North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly
display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve
alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and
consistency.
]==]
function export.resolve_placename_display_aliases(placetype, placename)
-- If the placename is a link, apply the alias inside the link.
-- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will
-- be empty.
local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
if link then
if linktext ~= "" then
local alias = resolve_unlinked_placename_display_aliases(placetype, linktext)
return "[[" .. link .. "|" .. alias .. "]]"
else
local alias = resolve_unlinked_placename_display_aliases(placetype, link)
return "[[" .. alias .. "]]"
end
else
return resolve_unlinked_placename_display_aliases(placetype, placename)
end
end
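--[=[
How the link pattern above separates piped from unpiped links can be sketched as follows. This is an illustration
under stated assumptions: it uses plain `string.match` instead of the module's `rmatch` (equivalent for these ASCII
inputs), and the `aliases` table and `resolve` function are hypothetical stand-ins for the real display-alias lookup.

```lua
local link_pattern = "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$"

-- Hypothetical alias table standing in for the known-location lookup.
local aliases = {US = "United States", USA = "United States"}
local function resolve(name)
	return aliases[name] or name
end

local function resolve_in_link(placename)
	local link, linktext = placename:match(link_pattern)
	if not link then
		-- Not a link at all: resolve the bare placename.
		return resolve(placename)
	elseif linktext ~= "" then
		-- Piped link: keep the target, resolve only the displayed text.
		return "[[" .. link .. "|" .. resolve(linktext) .. "]]"
	else
		-- Unpiped link: the target is also the display text.
		return "[[" .. resolve(link) .. "]]"
	end
end

print(resolve_in_link("US"))
print(resolve_in_link("[[USA]]"))
print(resolve_in_link("[[w:United States|US]]"))
```

For an unpiped link the second capture is the empty string rather than nil, which is why the code compares
`linktext` against `""`.
]=]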
--[==[
Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key.
]==]
function export.get_prefixed_key(key, spec)
if spec.the then
return "the " .. key
else
return key
end
end
-- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary.
export.iterate_matching_location = m_locations.iterate_matching_location
--[=[
Iterator that iterates over holonyms in `place_desc`. If `first_holonym_index` is given, start iterating at the
specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If
`first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is
specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified
by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the
holonym index and holonym structure, similar to `ipairs()`.
]=]
function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
local stop_at_also = not not first_holonym_index
return function(place_desc, index)
while true do
index = index + 1
local this_holonym = place_desc.holonyms[index]
-- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also`
-- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym
-- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with
-- `:also`.
if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then
return nil
end
-- If not placetype, we're processing raw text, which we normally want to skip.
if include_raw_text_holonyms or this_holonym.placetype then
return index, this_holonym
end
end
end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end
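--[=[
The stopping rule of the iterator above (stop at a holonym marked `:also` only when a starting index was given, and
never on the starting holonym itself) can be checked with a standalone copy. The `place_desc` below is hypothetical
sample data, and `collect` is a helper invented for the example.

```lua
-- Standalone copy of the iterator, for illustration only.
local function holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
	local stop_at_also = not not first_holonym_index
	return function(desc, index)
		while true do
			index = index + 1
			local this_holonym = desc.holonyms[index]
			if not this_holonym or stop_at_also and index > first_holonym_index and
				this_holonym.continue_cat_loop then
				return nil
			end
			if include_raw_text_holonyms or this_holonym.placetype then
				return index, this_holonym
			end
		end
	end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end

-- Hypothetical place description: holonym 2 is raw text, holonym 3 is marked `:also`.
local place_desc = {holonyms = {
	{placetype = "city", placename = "A"},
	{placename = "raw text"},
	{placetype = "state", placename = "B", continue_cat_loop = true},
	{placetype = "country", placename = "C"},
}}

-- Collect the indices visited by the iterator into a comma-separated string.
local function collect(...)
	local seen = {}
	for i in holonyms_to_check(...) do
		seen[#seen + 1] = i
	end
	return table.concat(seen, ",")
end

print(collect(place_desc))            -- all holonyms except raw text
print(collect(place_desc, 1))         -- stops before the `:also` holonym
print(collect(place_desc, 3))         -- starts on the `:also` holonym without stopping
print(collect(place_desc, nil, true)) -- raw text included
```

The four calls visit `1,3,4`, `1`, `3,4` and `1,2,3,4` respectively.
]=]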
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all
such known locations, returning for each location the corresponding key, spec and group as well as the trail of
ancestral containers. Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between
the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data`
are:
* `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with
`iterate_matching_location()`.
* `holonym_placename`: The placename of the holonym.
* `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the
holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms
following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none
exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.)
* `place_desc`: Description of the place; used for the holonyms, to check for container mismatches.
Returns four values: the location group, the canonical key by which the location is known, the spec object describing
the location and the trail of ancestral containers for the location. The first three values are the same as for
`iterate_matching_location`.
]==]
function export.iterate_matching_holonym_location(data)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
local matching_location_iterator = m_locations.iterate_matching_location {
placetypes = holonym_placetype,
placename = holonym_placename,
}
return function()
while true do
local group, key, spec = matching_location_iterator()
if not group then
return nil
end
local container_trail = {}
-- For each level of container, check that there are no mismatches (i.e. another location of the same
-- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container
-- at that level. For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city
-- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of
-- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark,
-- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New
-- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough,
-- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If
-- there are no mismatches at any level we assume we're dealing with the right known location.
--
-- If at a given level there are multiple containing locations, we count a match if any holonym matches any
-- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any
-- containing location.
local containers_mismatch = false
for containers in m_locations.iterate_containers(group, key, spec) do
insert(container_trail, containers)
local match_at_level = false
local mismatch_at_level = false
for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc,
holonym_index and holonym_index + 1 or nil) do
local other_source_holonym = other_holonym.augmented_from_holonym
if other_source_holonym and other_source_holonym.placetype == holonym_placetype and
other_source_holonym.unlinked_placename ~= holonym_placename then
-- Ignore holonyms added during the augmentation process for other holonyms of the same
-- placetype as the placetype of the holonym we're considering. See comment in
-- augment_holonyms_with_container() for why we do this.
-- continue; grrr, no 'continue' in Lua
else
local holonym_matches_at_level = false
local holonym_exists_with_same_placetype = false
for _, container in ipairs(containers) do
if not container.spec.no_check_holonym_mismatch then
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
local placetypes = container.spec.placetype
if type(placetypes) ~= "table" then
placetypes = {placetypes}
end
local placetype_equivs = {}
for _, pt in ipairs(placetypes) do
m_table.extend(placetype_equivs, export.get_placetype_equivs(pt))
end
local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype and
(other_holonym.unlinked_placename == full_container_placename or
other_holonym.unlinked_placename == elliptical_container_placename)
end
)
if this_holonym_matches then
holonym_matches_at_level = true
break
end
local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype
end
)
if this_holonym_exists_with_same_placetype then
-- We seem to have a mismatch at this level. But before we decide conclusively that this
-- is the case, check to see whether the putative mismatch is an alias and matches when
-- we resolve the alias.
for oh_group, oh_key, oh_spec, oh_container_trail in
export.iterate_matching_holonym_location {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = place_desc,
} do
local oh_full_placename, oh_elliptical_placename =
m_locations.key_to_placename(oh_group, oh_key)
if oh_full_placename == full_container_placename or
oh_elliptical_placename == elliptical_container_placename then
-- Alias matched when resolved.
this_holonym_matches = true
break
end
end
if this_holonym_matches then
-- Alias matched above when resolved.
holonym_matches_at_level = true
break
else
-- Not an alias, or doesn't match when resolved. We have a true mismatch.
holonym_exists_with_same_placetype = true
end
end
end
end
if holonym_matches_at_level then
match_at_level = true
break
end
if holonym_exists_with_same_placetype then
mismatch_at_level = true
end
end
end
if not match_at_level and mismatch_at_level then
containers_mismatch = true
break
end
end
if not containers_mismatch then
return group, key, spec, container_trail
end
end
end
end
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the
corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
local all_found = {}
for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
insert(all_found, {group, key, spec, container_trail})
end
if not all_found[1] then
return nil
elseif all_found[2] then
local holonym_placetype = data.holonym_placetype
if type(holonym_placetype) == "table" then
holonym_placetype = concat(holonym_placetype, ",")
end
local found_keys = {}
for _, found in ipairs(all_found) do
local _, key, _, _ = unpack(found)
insert(found_keys, key)
end
error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys)))
else
return unpack(all_found[1])
end
end
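-- Usage sketch (illustrative only, kept as a comment so it is never executed). It mirrors the recursive call made
-- inside iterate_matching_holonym_location() above; `holonym`, `holonym_index` and `place_desc` are assumed to come
-- from the caller's context and are not defined here.
--
--   for group, key, spec, container_trail in export.iterate_matching_holonym_location {
--       holonym_placetype = holonym.placetype,
--       holonym_placename = holonym.unlinked_placename,
--       holonym_index = holonym_index,
--       place_desc = place_desc,
--   } do
--       -- `key` is the canonical key of one candidate location; `container_trail[1]`, if present, lists its
--       -- immediate containers.
--   end
--
-- find_matching_holonym_location() takes the same `data` table but returns the four values directly, returns nil
-- when nothing matches, and throws an error when more than one location matches.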
------------------------------------------------------------------------------------------
-- Placename and placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their
canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which
applies to categorization and other processes but not to display.
The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ",
"จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g.
"census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype.
Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be
interpreted as "department", "อำเภอ" or "division").
]==]
export.placetype_aliases = {
["acomm"] = "autonomous community",
["adr"] = "administrative region",
["adterr"] = "administrative territory", -- Pakistan
["aobl"] = "autonomous oblast",
["aokr"] = "autonomous okrug",
["ap"] = "autonomous province",
["apref"] = "autonomous prefecture",
["aprov"] = "autonomous province",
["ar"] = "autonomous region",
["arch"] = "archipelago",
["arep"] = "autonomous republic",
["aterr"] = "autonomous territory",
["atu"] = "autonomous territorial unit",
["bor"] = "borough",
["c"] = "ประเทศ",
["can"] = "canton",
["carea"] = "council area",
["cc"] = "constituent country",
["cdblock"] = "community development block",
["cdep"] = "Crown dependency",
["CDP"] = "census-designated place",
["cdp"] = "census-designated place",
["clcity"] = "county-level city",
["co"] = "เทศมณฑล",
["cobor"] = "county borough",
["colcity"] = "county-level city",
["coll"] = "collectivity",
["comm"] = "community",
["cont"] = "ทวีป",
["contr"] = "continental region",
["contregion"] = "continental region",
["cpar"] = "civil parish",
["damun"] = "direct-administered municipality",
["dep"] = "dependency",
["department capital"] = "departmental capital",
["dept"] = "department",
["depterr"] = "dependent territory",
["dist"] = "อำเภอ",
["distmun"] = "district municipality",
["div"] = "division",
["emp"] = "จักรวรรดิ",
["fpref"] = "French prefecture",
["gov"] = "governorate",
["govnat"] = "governorate",
["home-rule city"] = "home rule city",
["home-rule municipality"] = "home rule municipality",
["inner-city area"] = "inner city area",
["ires"] = "Indian reservation",
["isl"] = "เกาะ",
["lbor"] = "London borough",
["lga"] = "local government area",
["lgarea"] = "local government area",
["lgd"] = "local government district",
["lgdist"] = "local government district",
["metbor"] = "metropolitan borough",
["metcity"] = "metropolitan city",
["metmun"] = "metropolitan municipality",
["mtn"] = "ภูเขา",
["mun"] = "เทศบาล",
["mundist"] = "municipal district",
["nonmetropolitan county"] = "non-metropolitan county",
["obl"] = "oblast",
["okr"] = "okrug",
["p"] = "จังหวัด",
["par"] = "parish",
["parmun"] = "parish municipality",
["pen"] = "peninsula",
["plcity"] = "prefecture-level city",
["plcolony"] = "Polish colony",
["pref"] = "prefecture",
["prefcity"] = "prefecture-level city",
["preflcity"] = "prefecture-level city",
["prov"] = "จังหวัด",
["r"] = "ภูมิภาค",
["range"] = "เทือกเขา",
["rcm"] = "regional county municipality",
["rcomun"] = "regional county municipality",
["rdist"] = "regional district",
["rep"] = "republic",
["rhrom"] = "rural hromada",
["riv"] = "แม่น้ำ",
["rmun"] = "regional municipality",
["robor"] = "royal borough",
["romp"] = "Roman province",
["runit"] = "regional unit",
["rurmun"] = "rural municipality",
["s"] = "รัฐ",
["sar"] = "special administrative region",
["shrom"] = "settlement hromada",
["spref"] = "subprefecture",
["sprefcity"] = "sub-prefectural city",
["sprovcity"] = "subprovincial city",
["submet city"] = "sub-metropolitan city",
["submetropolitan city"] = "sub-metropolitan city",
["sub-prefecture-level city"] = "sub-prefectural city",
["sub-provincial city"] = "subprovincial city",
["sub-provincial district"] = "subprovincial district",
["terr"] = "ดินแดน",
["terrauth"] = "territorial authority",
["twp"] = "township",
["twpmun"] = "township municipality",
["uauth"] = "unitary authority",
["ucomm"] = "unincorporated community",
["udist"] = "unitary district",
["uhrom"] = "urban hromada",
["uterr"] = "union territory",
["utwpmun"] = "united township municipality",
["val"] = "valley",
["vdc"] = "village development committee",
["vil"] = "village",
["voi"] = "voivodeship",
["wcomm"] = "Welsh community",
}
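-- Illustrative sketch only (kept as a comment; not used by this module): one way a caller might resolve a placetype
-- alias to its canonical form. The helper name `resolve_placetype_alias` is hypothetical; the module's actual
-- resolution logic lives elsewhere and may differ.
--
--   local function resolve_placetype_alias(placetype)
--       -- Fall back to the placetype itself when no alias is registered.
--       return export.placetype_aliases[placetype] or placetype
--   end
--
--   resolve_placetype_alias("obl")     --> "oblast"
--   resolve_placetype_alias("oblast")  --> "oblast" (not an alias; returned unchanged)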
local no_link_def_article = {link = false, article = "the"}
local no_link_no_article = {link = false, article = false}
--[==[ var:
These qualifiers can be prepended onto any placetype and will be handled correctly. For example, the placetype
`large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the
value in the following table is a string, the qualifier will display according to the string. If the value is `true`,
the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be
linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain
those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating
`inland sea` as equivalent to `sea`.
]==]
export.placetype_qualifiers = {
-- generic qualifiers
["huge"] = false,
["tiny"] = false,
["large"] = false,
["big"] = false,
["mid-size"] = false,
["mid-sized"] = false,
["small"] = false,
["sizable"] = false,
["important"] = false,
["long"] = false,
["short"] = false,
["major"] = false,
["minor"] = false,
["high"] = false,
["tall"] = false,
["low"] = false,
["left"] = false, -- left tributary
["right"] = false, -- right tributary
["modern"] = false, -- for use in opposition to "ancient" in another definition
-- "former" qualifiers
["abandoned"] = true,
["ancient"] = true,
["deserted"] = true,
["extinct"] = true,
["former"] = false,
["historic"] = "historical",
["historical"] = true,
["medieval"] = true,
["mediaeval"] = true,
["ruined"] = true,
["traditional"] = true,
-- sea qualifiers
["coastal"] = true,
["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]]
["maritime"] = true,
["overseas"] = true,
["seaside"] = true,
["beachfront"] = true,
["beachside"] = true,
["riverside"] = true,
-- lake qualifiers
["freshwater"] = true,
["saltwater"] = true,
["endorheic"] = true,
["oxbow"] = true,
["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link
["tidal"] = true,
-- land qualifiers
["hilltop"] = true,
["hilly"] = true,
["insular"] = true,
["peninsular"] = true,
["chalk"] = true,
["karst"] = true,
["limestone"] = true,
["mountainous"] = true,
["mountaintop"] = true,
["alpine"] = true,
["volcanic"] = true, -- for an island
-- political status qualifiers
["autonomous"] = true,
["incorporated"] = true,
["special"] = true,
["unincorporated"] = true,
["coterminous"] = true,
-- monetary status/etc. qualifiers
["fashionable"] = true,
["wealthy"] = true,
["affluent"] = true,
["declining"] = true,
-- city vs. rural qualifiers
["urban"] = true,
["suburban"] = true,
["exurban"] = true,
["outlying"] = true,
["remote"] = true,
["rural"] = true,
["outback"] = true,
["inner"] = false,
["inner-city"] = true,
["central"] = false,
["outer"] = false,
-- land use qualifiers
["residential"] = true,
["agricultural"] = true,
["business"] = true,
["commercial"] = true,
["industrial"] = true,
-- business use qualifiers
["railroad"] = true,
["railway"] = true,
["farming"] = true,
["fishing"] = true,
["mining"] = true,
["logging"] = true,
["cattle"] = true,
-- tourism use qualifiers
["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
-- religious qualifiers
["holy"] = true,
["sacred"] = true,
["religious"] = true,
["secular"] = true,
-- qualifiers for nonexistent places
["claimed"] = false,
["fictional"] = true,
["legendary"] = true,
["mythical"] = true,
["mythological"] = true,
-- directional qualifiers
["northern"] = false,
["southern"] = false,
["eastern"] = false,
["western"] = false,
["north"] = false,
["south"] = false,
["east"] = false,
["west"] = false,
["northeastern"] = false,
["southeastern"] = false,
["northwestern"] = false,
["southwestern"] = false,
["northeast"] = false,
["southeast"] = false,
["northwest"] = false,
["southwest"] = false,
-- seasonal qualifiers
["summer"] = true, -- e.g. for 'summer capital'
["winter"] = true,
-- legal status qualifiers
-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
["official"] = true,
["unofficial"] = true,
["de facto"] = true, -- 'de facto capital'
["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
["de jure"] = true, -- 'de jure capital'
["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
-- misc. qualifiers
["planned"] = true,
["chartered"] = true,
["landlocked"] = true,
["uninhabited"] = true,
-- superlative qualifiers
["first"] = no_link_def_article,
["second"] = no_link_def_article, -- for "second largest" etc.
["third"] = no_link_def_article,
["fourth"] = no_link_def_article,
["last"] = no_link_def_article,
["only"] = no_link_def_article,
["sole"] = no_link_def_article,
["main"] = no_link_def_article,
["largest"] = no_link_def_article,
["biggest"] = no_link_def_article,
["smallest"] = no_link_def_article,
["shortest"] = no_link_def_article,
["longest"] = no_link_def_article,
["tallest"] = no_link_def_article,
["highest"] = no_link_def_article,
["lowest"] = no_link_def_article,
["leftmost"] = no_link_def_article,
["rightmost"] = no_link_def_article,
["innermost"] = no_link_def_article,
["outermost"] = no_link_def_article,
["northernmost"] = no_link_def_article,
["southernmost"] = no_link_def_article,
["westernmost"] = no_link_def_article,
["easternmost"] = no_link_def_article,
["northwesternmost"] = no_link_def_article,
["southwesternmost"] = no_link_def_article,
["northeasternmost"] = no_link_def_article,
["southeasternmost"] = no_link_def_article,
-- several/various
["several"] = no_link_no_article,
["various"] = no_link_no_article,
["numerous"] = no_link_no_article,
["multiple"] = no_link_no_article,
["many"] = no_link_no_article,
["other"] = no_link_no_article,
}
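-- Illustrative sketch only (kept as a comment; not used by this module): how the four kinds of value in
-- `placetype_qualifiers` could be interpreted, following the description above the table. The helper name
-- `format_qualifier` is hypothetical, and the real display code may format links and articles differently.
--
--   local function format_qualifier(qualifier)
--       local value = export.placetype_qualifiers[qualifier]
--       if type(value) == "string" then
--           return value                       -- explicit display form
--       elseif value == true then
--           return "[[" .. qualifier .. "]]"   -- link to the qualifier's own entry
--       elseif value == false or type(value) == "table" then
--           return qualifier                   -- shown as-is, unlinked (tables carry link = false)
--       end
--       return nil                             -- not a recognized qualifier
--   end
--
--   format_qualifier("ox-bow")   --> "[[oxbow]]"
--   format_qualifier("coastal")  --> "[[coastal]]"
--   format_qualifier("large")    --> "large"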
--[==[ var:
In this table, the key qualifiers should be treated the same as the value qualifiers for categorization purposes. This
is overridden by `placetype_data` and `qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
["abandoned"] = {"FORMER"},
["ancient"] = {"ANCIENT", "FORMER"},
["former"] = {"FORMER"},
["extinct"] = {"FORMER"},
["historic"] = {"FORMER"},
["historical"] = {"FORMER"},
["medieval"] = {"ANCIENT", "FORMER"},
["mediaeval"] = {"ANCIENT", "FORMER"},
["ruined"] = {"ANCIENT", "FORMER"},
["traditional"] = {"FORMER"},
}
--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
["fictional"] = "fictional location",
["legendary"] = "mythological location",
["mythical"] = "mythological location",
["mythological"] = "mythological location",
-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
["claimed"] = "claimed political division",
}
--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
["autonomous community"] = "autonomous community capitals",
["canton"] = "cantonal capitals",
["comarca"] = "comarca capitals",
["ประเทศ"] = "national capitals",
-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "county capitals",
["department"] = "departmental capitals",
["อำเภอ"] = "district capitals",
["division"] = "division capitals",
["emirate"] = "emirate capitals",
["governorate"] = "governorate capitals",
["hromada"] = "hromada capitals",
["krai"] = "krai capitals",
["metropolitan city"] = "metropolitan city capitals",
["เทศบาล"] = "municipal capitals",
["oblast"] = "oblast capitals",
["okrug"] = "okrug capitals",
["prefecture"] = "prefectural capitals",
["จังหวัด"] = "provincial capitals",
["raion"] = "raion capitals",
["regency"] = "regency capitals",
["ภูมิภาค"] = "regional capitals",
["regional unit"] = "regional unit capitals",
["republic"] = "republic capitals",
["รัฐ"] = "state capitals",
["ดินแดน"] = "territorial capitals",
["voivodeship"] = "voivodeship capitals",
}
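-- Illustrative sketch only (kept as a comment; not used by this module): looking up a capital category with the
-- same "strip the leading words" fallback that capital_city_cat_handler() applies. The helper name
-- `lookup_capital_cat` is hypothetical.
--
--   local function lookup_capital_cat(holonym_placetype)
--       local cat = export.placetype_to_capital_cat[holonym_placetype]
--       if not cat then
--           -- "^.* " is greedy, so everything up to the last space is removed. The extra parentheses discard
--           -- gsub's second return value (the substitution count).
--           cat = export.placetype_to_capital_cat[(holonym_placetype:gsub("^.* ", ""))]
--       end
--       return cat
--   end
--
--   lookup_capital_cat("oblast")               --> "oblast capitals"
--   lookup_capital_cat("autonomous republic")  --> "republic capitals" (via the fallback to "republic")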
--[==[ var:
This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple
ways that placenames can come to be preceded by "the":
# Listed here.
# Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code
just below the map.
# The placetype of the placename has `holonym_use_the = true` in its placetype_data.
# A regex in placename_the_re matches the placename.
Note that "the" is added only before the first holonym in a place description.
]==]
export.placename_article = {
-- This should only contain info that can't be inferred from [[Module:place/locations]].
["archipelago"] = {
["Cyclades"] = "the",
["Dodecanese"] = "the",
},
["ประเทศ"] = {
["Holy Roman Empire"] = "the",
},
["จักรวรรดิ"] = {
["Holy Roman Empire"] = "the",
},
["เกาะ"] = {
["North Island"] = "the",
["South Island"] = "the",
},
["ภูมิภาค"] = {
["Balkans"] = "the",
["Russian Far East"] = "the",
["Caribbean"] = "the",
["Caucasus"] = "the",
["Middle East"] = "the",
["New Territories"] = "the",
["North Caucasus"] = "the",
["South Caucasus"] = "the",
["West Bank"] = "the",
["Gaza Strip"] = "the",
},
["valley"] = {
["San Fernando Valley"] = "the",
},
}
--[==[ var:
Lua patterns to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all
holonyms, otherwise only the patterns for the holonym's placetype apply.
]==]
export.placename_the_re = {
-- We don't need entries for peninsulas, seas, oceans, gulfs or rivers
-- because they have holonym_use_the = true.
["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "},
["bay"] = {"^Bay of "},
["ทะเลสาบ"] = {"^Lake of "},
["ประเทศ"] = {"^Republic of ", " Republic$"},
["republic"] = {"^Republic of ", " Republic$"},
["ภูมิภาค"] = {" [Rr]egion$"},
["แม่น้ำ"] = {" River$"},
["local government area"] = {"^Shire of "},
["เทศมณฑล"] = {"^Shire of "},
["Indian reservation"] = {" Reservation", " Nation"},
["tribal jurisdictional area"] = {" Reservation", " Nation"},
}
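-- Illustrative sketch only (kept as a comment; not used by this module): how the patterns in `placename_the_re`
-- could be applied to a holonym. The helper name `placename_needs_the` is hypothetical, and plain `string.find` is
-- an assumption; the real code may use a Unicode-aware matcher instead.
--
--   local function placename_needs_the(placetype, placename)
--       -- Check the patterns that apply to every placetype first, then the placetype-specific ones.
--       for _, key in ipairs {"*", placetype} do
--           local patterns = export.placename_the_re[key]
--           if patterns then
--               for _, pattern in ipairs(patterns) do
--                   if placename:find(pattern) then
--                       return true
--                   end
--               end
--           end
--       end
--       return false
--   end
--
--   placename_needs_the("bay", "Bay of Biscay")  --> true (matches "^Bay of ")
--   placename_needs_the("bay", "Hudson Bay")     --> false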
--[==[ var:
If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of
holonyms for categorization (but not display) purposes.
]==]
export.cat_implications = {
["ภูมิภาค"] = {
["Eastern Europe"] = {"continent/Europe"},
["Central Europe"] = {"continent/Europe"},
["Western Europe"] = {"continent/Europe"},
["South Europe"] = {"continent/Europe"},
["Southern Europe"] = {"continent/Europe"},
["Northern Europe"] = {"continent/Europe"},
["Northeast Europe"] = {"continent/Europe"},
["Northeastern Europe"] = {"continent/Europe"},
["Southeast Europe"] = {"continent/Europe"},
["Southeastern Europe"] = {"continent/Europe"},
["North Caucasus"] = {"continent/Europe"},
["South Caucasus"] = {"continent/Asia"},
["South Asia"] = {"continent/Asia"},
["Southern Asia"] = {"continent/Asia"},
["East Asia"] = {"continent/Asia"},
["Eastern Asia"] = {"continent/Asia"},
["Central Asia"] = {"continent/Asia"},
["West Asia"] = {"continent/Asia"},
["Western Asia"] = {"continent/Asia"},
["Southeast Asia"] = {"continent/Asia"},
["North Asia"] = {"continent/Asia"},
["Northern Asia"] = {"continent/Asia"},
["Anatolia"] = {"continent/Asia"},
["Asia Minor"] = {"continent/Asia"},
["Mesopotamia"] = {"continent/Asia"},
["North Africa"] = {"continent/Africa"},
["Central Africa"] = {"continent/Africa"},
["West Africa"] = {"continent/Africa"},
["East Africa"] = {"continent/Africa"},
["Southern Africa"] = {"continent/Africa"},
["Central America"] = {"continent/Central America"},
["Caribbean"] = {"continent/North America"},
["Polynesia"] = {"continent/Oceania"},
["Micronesia"] = {"continent/Oceania"},
["Melanesia"] = {"continent/Oceania"},
["Siberia"] = {"country/Russia", "continent/Asia"},
["Russian Far East"] = {"country/Russia", "continent/Asia"},
["South Wales"] = {"constituent country/Wales", "continent/Europe"},
["Balkans"] = {"continent/Europe"},
["West Bank"] = {"country/Palestine", "continent/Asia"},
["Gaza"] = {"country/Palestine", "continent/Asia"},
["Gaza Strip"] = {"country/Palestine", "continent/Asia"},
}
}
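-- Illustrative sketch only (kept as a comment; not used by this module): fetching the holonyms implied by a given
-- holonym. The helper name `get_implied_holonyms` is hypothetical; how the module actually parses and appends the
-- "placetype/placename" strings is not shown here.
--
--   local function get_implied_holonyms(placetype, placename)
--       local by_placetype = export.cat_implications[placetype]
--       return by_placetype and by_placetype[placename] or nil
--   end
--
--   get_implied_holonyms("ภูมิภาค", "Siberia")  --> {"country/Russia", "continent/Asia"}
--   get_implied_holonyms("ภูมิภาค", "Atlantis") --> nil (no implication registered)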
------------------------------------------------------------------------------------------
-- Category and display handlers --
------------------------------------------------------------------------------------------
local function city_type_cat_handler(data)
local entry_placetype = data.entry_placetype
local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities")
if not generic_before_non_cities then
internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" ..
" setting", entry_placetype)
end
local plural_entry_placetype = export.pluralize_placetype(entry_placetype)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and not spec.is_city then
-- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both
-- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.)
local cap_plural_entry_placetype = ucfirst(plural_entry_placetype)
local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th
if container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th
end
end
return retcats
end
end
local function capital_city_cat_handler(data, non_city)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
-- The first time we're called we want to return something; otherwise we will be called for later-mentioned
-- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in
-- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital
-- category/categories we add below.
local retcats
if not non_city and place_desc.holonyms then
for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do
local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename
retcats = city_type_cat_handler {
entry_placetype = "นคร",
holonym_placetype = h_placetype,
holonym_placename = h_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if retcats then
break
end
end
end
if not retcats then
retcats = {}
end
-- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we
-- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State
-- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory'
-- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's
-- an entry for 'autonomous community').
local capital_cat = export.placetype_to_capital_cat[holonym_placetype]
if not capital_cat then
capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")]
end
if capital_cat then
capital_cat = ucfirst(capital_cat)
local inserted_specific_variant_cat = false
if holonym_index then
-- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern
-- where we use :also to specify that a given city is the capital at multiple surrounding levels.
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index
for h_index = holonym_index, #place_desc.holonyms do
if place_desc.holonyms[h_index].placetype then
matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location {
holonym_placetype = place_desc.holonyms[h_index].placetype,
holonym_placename = place_desc.holonyms[h_index].unlinked_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if matching_group then
matching_holonym_index = h_index
break
end
end
end
if matching_holonym_index == holonym_index then
if matching_container_trail[1] and not matching_spec.no_container_cat then
for _, container in ipairs(matching_container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
elseif matching_holonym_index then
-- Check to make sure that the holonym placetype we were called on is listed among the
-- divtypes of the location we found.
local function insert_specific_variant_if_possible(key, spec)
return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
local plural_holonym_placetype = export.pluralize_placetype(pt)
local saw_matching_div
if spec.divs then
local divs = spec.divs
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) ~= "table" then
div = {type = div}
end
if plural_holonym_placetype == div.type then
saw_matching_div = true
break
end
end
end
if saw_matching_div then
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
return true
end
return false
end)
end
if insert_specific_variant_if_possible(matching_key, matching_spec) then
inserted_specific_variant_cat = true
elseif not matching_spec.no_container_cat then
for _, containers in ipairs(matching_container_trail) do
local saw_no_container_cat = false
for _, container in ipairs(containers) do
if insert_specific_variant_if_possible(container.key, container.spec) then
inserted_specific_variant_cat = true
break
end
saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
end
if inserted_specific_variant_cat or saw_no_container_cat then
break
end
end
end
end
else
-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
end
if not inserted_specific_variant_cat then
insert(retcats, capital_cat)
end
else
-- We didn't recognize the holonym placetype; just put in 'Capital cities'.
insert(retcats, "เมืองหลวง")
end
return retcats
end
--[=[
This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used
in two ways:
# To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and
[[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym.
# To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym
description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this
case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments,
administrative regions, and for the entire country, and for example we only want to categorize a demonym into
[[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym
invocation only adds the most specific holonym category and not the category of any containing polity (hence if we
add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]).
This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston`
as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and
[[:Category:en:สถานที่ในthe United States]]. The city handler tries to deal with the possibility of multiple cities
having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]],
[[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing
polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`).
Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the
mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions
Ohio and a holonym for a Columbus in a different country is encountered, because of the function
`augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered.
The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding
language code).
]=]
local function generic_place_cat_handler(data)
local from_demonym = data.from_demonym
local retcats = {}
local function insert_retkey(key, spec)
if from_demonym then
insert(retcats, key)
else
insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec)))
end
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
if not spec.no_generic_place_cat then
-- `no_generic_place_cat` is set for continents and continental regions.
insert_retkey(key, spec)
end
-- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in
-- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when
-- no_container_cat is set (e.g. for 'United Kingdom').
if not spec.no_container_cat then
for _, container_set in ipairs(container_trail) do
local stop_adding_containers = false
for _, container in ipairs(container_set) do
if not container.spec.no_generic_place_cat then
insert_retkey(container.key, container.spec)
end
if container.spec.no_container_cat then
stop_adding_containers = true
end
end
if stop_adding_containers then
break
end
end
end
return retcats
end
end
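--[=[
Illustrative sketch only; nothing in this comment is executed. Assume a non-demonym invocation whose holonym matches a
known location with key `Osaka Prefecture, Japan` and a single container with key `Japan`, that neither spec sets
`no_generic_place_cat` or `no_container_cat`, and that `export.get_prefixed_key` returns both keys unchanged. Under
those assumptions the handler above would be expected to behave roughly like this:
	local retcats = generic_place_cat_handler(data)
	-- retcats[1] == "สถานที่ในOsaka Prefecture, Japan"
	-- retcats[2] == "สถานที่ในJapan"
The key names and the behavior of `export.get_prefixed_key` are assumptions made for the example; the real values
come from [[Module:place/locations]].
]=]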
--[==[
Special category handler run for all placetypes that checks for specified division placetypes of known locations and
categorizes appropriately.
]==]
function export.political_division_cat_handler(data)
if data.from_demonym then
return
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
local sgdiv = export.maybe_singularize_placetype(div.type) or div.type
local prep = div.prep or "ของ"
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
if not export.placetype_data[sgdiv] then
internal_error("Placetype %s associated with known location key %s and data %s not found in " ..
"`placetype_data`", sgdiv, key, spec)
end
if sgdiv == data.entry_placetype then
local retcats = {}
for _, pt_cat in ipairs(cat_as) do
if type(pt_cat) == "string" then
pt_cat = {type = pt_cat}
end
local pt_prep = pt_cat.prep or prep
insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th
end
return retcats
end
end
end
end
end
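--[=[
Illustrative sketch only; nothing in this comment is executed. The handler above accepts several shapes for
`spec.divs` and `spec.addl_divs`. A hypothetical location spec (the field values are made up for the example) might
look like:
	local spec = {
		-- A bare string is normalized to {type = "prefectures"}.
		divs = {"prefectures"},
		-- A table may override the preposition (default "ของ") and the placetype(s) to categorize as.
		addl_divs = {{type = "subprefectures", prep = "ใน", cat_as = "subprefectures"}},
	}
Assuming `export.maybe_singularize_placetype("prefectures")` yields `prefecture`, that `ucfirst` capitalizes the first
letter, and that `export.get_prefixed_key` returns the key `Japan` unchanged, an entry whose `data.entry_placetype` is
`prefecture` would be categorized as `PrefecturesของJapan`.
]=]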
--[==[
This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any
foreign-language terms that are translations of the state of Georgia. We look at the page title (or its overridden value
in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the
modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the
entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the
country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized
into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to
make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is
intended for Newark, New Jersey).
]==]
function export.get_bare_categories(args, overall_place_spec)
local bare_cats = {}
local place_descs = overall_place_spec.descs
local possible_placetypes_by_place_desc = {}
for i, place_desc in ipairs(place_descs) do
possible_placetypes_by_place_desc[i] = {}
for _, placetype in ipairs(place_desc.placetypes) do
if not export.placetype_is_ignorable(placetype) then
local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true})
for _, equiv in ipairs(equivs) do
insert(possible_placetypes_by_place_desc[i], equiv.placetype)
end
end
end
end
local function check_term(term)
-- Treat Wikipedia links like local ones.
term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
term = export.remove_links_and_html(term)
term = term:gsub("^the ", "")
for i, place_desc in ipairs(place_descs) do
-- Iterate over all matching locations in case there are multiple, as with Delhi defined as
-- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}.
for group, key, spec, container_trail in export.iterate_matching_holonym_location {
holonym_placetype = possible_placetypes_by_place_desc[i],
holonym_placename = term,
place_desc = place_desc,
} do
insert(bare_cats, key)
end
end
end
-- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)?
-- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There
-- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the
-- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The
-- advantage of checking when the language isn't English is we catch those places that fail to give an English
-- translation but where the translation happens to be the same as the other-language spelling. However, I don't
-- know how often this situation occurs.
check_term(args.pagename or mw.title.getCurrentTitle().subpageText)
for _, t in ipairs(args.t) do
check_term(t)
end
local function check_termobj_list(terms)
for _, term in ipairs(terms) do
if term.eq then
check_term(term.eq)
end
if term.alt or term.term then
check_term(term.alt or term.term)
end
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
local arg = extra_info_terms.arg
if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then
check_termobj_list(extra_info_terms.terms)
end
end
for _, directive in ipairs(overall_place_spec.directives) do
check_termobj_list(directive.terms)
end
return bare_cats
end
--[==[
This is used to augment the holonyms associated with a place description with the containing polities. For example,
given the following:
`# {{tl|place|en|subprefecture|pref/Hokkaido}}.`
We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]].
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
for _, place_desc in ipairs(place_descs) do
if place_desc.holonyms then
-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
-- modifier, which causes category processing to restart at that point after generating categories for a
-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do
local holonym = place_desc.holonyms[i]
if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then
local group, key, spec, container_trail = export.find_matching_holonym_location {
holonym_placetype = holonym.placetype,
holonym_placename = holonym.unlinked_placename,
holonym_index = i,
place_desc = place_desc,
}
if group and container_trail[1] and not spec.no_auto_augment_container then
local augmented_holonyms = {}
for j = 1, i do
insert(augmented_holonyms, place_desc.holonyms[j])
end
for _, containers in ipairs(container_trail) do
local any_no_auto_augment_container = false
for _, container in ipairs(containers) do
any_no_auto_augment_container = any_no_auto_augment_container or
container.spec.no_auto_augment_container
local containing_type = container.spec.placetype
if type(containing_type) == "table" then
-- If the containing type is a list, use the first element as the canonical variant.
containing_type = containing_type[1]
end
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
-- Don't side-effect holonyms while processing them.
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to
-- set display_placename.
placetype = containing_type,
-- placename_to_key() for the group should correctly handle both full and elliptical
-- placenames, but the full placename seems less likely to be ambiguous. FIXME: We
-- should just store the key directly and use it when available to avoid having to
-- convert key to placename and back to key.
unlinked_placename = full_container_placename,
-- Indicate that this is an augmented holonym, and was derived from the specified
-- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms
-- derived from holonyms that are different from the holonym we're searching for but
-- of the same placetype. This is to correctly handle a situation like
-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
-- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match
-- in find_matching_holonym_location() because of the mismatch between augmented
-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
-- `Ardèche`) against any known location. To deal with this, we mark augmented
-- holonyms as being augmented due to a source holonym, and when processing a given
-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
-- The restriction to the same placetype is so that `Birmingham` still gets
-- correctly disambiguated to Birmingham, England in the example given above near
-- the top of this function, using the augmented holonym `c/United Kingdom` added by
-- the specified `cc/England` (whose placetype `constituent country` differs from
-- the placetype `city` of Birmingham).
augmented_from_holonym = holonym,
}
insert(augmented_holonyms, new_holonym)
-- But it is safe to modify other parts of the place_desc.
export.key_holonym_into_place_desc(place_desc, new_holonym)
end
if any_no_auto_augment_container then
break
end
end
for j = i + 1, #place_desc.holonyms do
insert(augmented_holonyms, place_desc.holonyms[j])
end
place_desc.holonyms = augmented_holonyms
end
end
end
end
end
end
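--[=[
Illustrative sketch only; nothing in this comment is executed. Take the `pref/Hokkaido` example from the documentation
of `export.augment_holonyms_with_container`, and assume the matched location has a single container whose spec has
placetype `country` and whose full placename is `Japan`. The holonym list would then change roughly as follows:
	local hokkaido = {placetype = "prefecture", unlinked_placename = "Hokkaido"}
	-- Before augmentation:
	--   place_desc.holonyms = {hokkaido}
	-- After augmentation (the new holonym is inserted directly after its source holonym):
	--   place_desc.holonyms = {
	--       hokkaido,
	--       {placetype = "country", unlinked_placename = "Japan", augmented_from_holonym = hokkaido},
	--   }
Real holonym objects carry additional fields not shown here; the field values above are assumptions for the example.
]=]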
-- Cat handler for district, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions. We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
-- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
-- categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
-- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
-- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
-- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
-- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
-- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and
-- note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
-- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
-- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
-- categorization.
local function district_neighborhood_cat_handler(data)
local function get_plural_entry_placetype(location_spec, container_trail)
if data.entry_placetype == "suburb" then
return "Suburbs"
else
-- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling
if uses_british_spelling == nil and container_trail then
for _, container_set in ipairs(container_trail) do
local must_outer_break = false
for _, container in ipairs(container_set) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
end
end
-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
-- etc.)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and spec.is_city then
return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
end
-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
local has_neighborhoods
local entry_placetype = data.entry_placetype
if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
has_neighborhoods = true
else
-- Otherwise, make sure the current holonym is city-like.
has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
return export.get_placetype_prop(pt, "has_neighborhoods")
end, {continue_on_nil_only = true})
end
if has_neighborhoods then
-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". We need
-- to start with the current holonym, which is especially important for neighborhoods and suburbs that
-- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously
-- we skipped the first/current holonym.)
for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc,
data.holonym_index) do
local other_holonym_data = {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = data.place_desc,
}
local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data)
if group and not spec.is_former_place then
return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") ..
export.get_prefixed_key(key, spec)}
end
end
end
end
function export.check_already_seen_string(holonym_placename, already_seen_strings)
local canon_placename = ulower(m_links.remove_links(holonym_placename))
if type(already_seen_strings) ~= "table" then
already_seen_strings = {already_seen_strings}
end
for _, already_seen_string in ipairs(already_seen_strings) do
if canon_placename:find(already_seen_string) then
return true
end
end
return false
end
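--[=[
Usage sketch for `export.check_already_seen_string`; nothing in this comment is executed. The placenames are arbitrary
examples, and the results assume that `m_links.remove_links` strips link brackets and `ulower` lowercases its argument:
	export.check_already_seen_string("[[Borough of Queens]]", "borough")   --> true
	export.check_already_seen_string("Brooklyn", {"borough", "boro"})      --> false
Each already-seen string is passed to `string.find` as a Lua pattern rather than as plain text, so callers should avoid
(or escape) pattern magic characters such as `-` or `.` in those strings.
]=]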
-- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display
-- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already.
-- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or
-- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym
-- placename, ignoring case and links. If the prefix isn't already present, we prepend it to the
-- holonym, wrapping the raw placename in a link (e.g. "County [[Cork]]") unless the holonym already
-- has a link in it, in which case we just add the prefix.
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return prefix .. " " .. holonym_placename
end
return prefix .. " [[" .. holonym_placename .. "]]"
end
-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return holonym_placename .. " " .. suffix
end
if include_suffix_in_link then
return "[[" .. holonym_placename .. " " .. suffix .. "]]"
else
return "[[" .. holonym_placename .. "]] " .. suffix
end
end
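--[=[
Usage sketch for the two display handlers above; nothing in this comment is executed. The prefixes, suffixes and
placenames are arbitrary examples, and the results assume that `m_links.remove_links` strips link brackets and `ulower`
lowercases its argument:
	prefix_display_handler("County", "Cork")                           --> "County [[Cork]]"
	prefix_display_handler("County", "[[Cork]]")                       --> "County [[Cork]]"
	prefix_display_handler("County", "County Cork")                    --> "County Cork" (prefix already present)
	suffix_display_handler("borough", "Wirral")                        --> "[[Wirral]] borough"
	suffix_display_handler("Voivodeship", "Subcarpathian", nil, true)  --> "[[Subcarpathian Voivodeship]]"
]=]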
-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.new_york_boroughs[unlinked_placename] then
-- Hack: don't display "borough" after the names of NYC boroughs
return holonym_placename
end
return suffix_display_handler("borough", holonym_placename)
end
local function county_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
m_locations.northern_ireland_counties["County " .. unlinked_placename .. ", Northern Ireland"] then
return prefix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County".
if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County".
if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- FIXME, we need the same for US counties but need to key off the country, not the specific county.
-- Others are displayed as-is.
return holonym_placename
end
-- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture".
-- Others are displayed as e.g. "[[Fthiotida]] prefecture".
local function prefecture_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture"
return suffix_display_handler(suffix, holonym_placename)
end
-- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized
-- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is.
local function province_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if
m_locations.iran_provinces[unlinked_placename .. ", Iran"] or
m_locations.laos_provinces[unlinked_placename .. ", Laos"] or
m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or
m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or
m_locations.thailand_provinces[unlinked_placename .. ", ไทย"] or
m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or
m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then
return suffix_display_handler("จังหวัด", holonym_placename)
end
return holonym_placename
end
-- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is.
local function state_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then
return suffix_display_handler("รัฐ", holonym_placename)
end
return holonym_placename
end
-- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]].
local function voivodesip_display_handler(holonym_placetype, holonym_placename)
return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link")
end
------------------------------------------------------------------------------------------
-- Placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are
placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value
is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form
`สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are
used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the
specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories
like [[:Category:States and territories of Australia]]).
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the
placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are
wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed
directly in the placetype data; everything else is handled through category handlers, either per-placetype or special
(such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate
categories is described at the top of [[Module:place]].
There are several recognized property keys, of various types:
1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
placetypes (which may occur in entry placetypes, esp. new-format ones such as `two <<islands>>`, and may occur in
categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
`<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign. For example, if `cercle` specifies
`"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified.
*# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it
will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as
`<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given.
*# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is
called on the string, which will correctly pluralize most strings, including those with links in them. (If there
are multiple links, the display form of the last link is pluralized.)
*# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as
an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with
the qualifiers `former`, `ancient`, `historical` and such.
* `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of
the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the
value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays
as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if
this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs
especially with multiword placetypes where something other than the last word is pluralized. An example is
`town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. This example uses
`link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian
Bokmål word, and template calls aren't currently permitted in link strings), along with
`plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to
the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only
placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of
`category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it,
spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the
value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which
just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a
separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which
clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories
where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides
`category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides
`category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. If given, this overrides `category_link` for
this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol>
<li>`link = true`</li>
<li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li>
<li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li>
</ol> This has the effect of making the entry placetype `neighborhood` display as just
`<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like
`Neighborhoods in Illinois, USA` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`.
* `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
* `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
2. There is currently one fallback-related property key recognized:
* `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories
get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets
`preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi`
(whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex
example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that
checks for cities or city-like entities (e.g. boroughs) occurring as holonyms and categorizes the toponym under
[[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods in POLDIV]] (for
the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if
`c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these
categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be
categorized as e.g. [[:Category:Geographic and cultural areas of England]].
3. There is currently one property to control irregular plurals of placetypes:
* `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in
[[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`,
`-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent;
for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized
as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value
even when the default pluralization algorithm works correctly, if the default singularization algorithm won't
correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`).
4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those
categories:
* `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by
a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated
alike); (b) to determine the parent category of bare placetype categories (e.g. [[:Category:Villages]] for placetype
`village`); (c) to determine whether to add a parent category `political divisions of specific countries` to
qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are:
*# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire.
*# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement.
*# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a
settlement, such as wards and barangays.
*# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an
unincorporated community, farm or neighborhood.
*# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital
any more.
*# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc.
*# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university,
metro station, park or the like.
*# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary
greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`,
etc. qualifier has no effect on the category of these placetypes.
*# `generic place`: a place that isn't further qualified into any specific subtype.
* `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`,
`ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of
`dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc.
qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class`
is used. When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those
in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers
(one or both of `ANCIENT` and `FORMER`, which are written in all-caps to distinguish them from user-specified
qualifiers), each of which is prepended to the value of `former_type` or `class` to form a placetype whose properties are
looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map
`medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes
`ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by
`get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default
category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where
`kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data`
for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is
used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data`
but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal
error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have
an entry for `greenhouse`), we just track the occurrence and end up not categorizing.
* `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the
placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for
placetype `named buildings!`). If not specified, the first parent category is determined by the value of `class`,
using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]].
* `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent`
just above).
* `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of
`bare_category_parent` if it is a string.
* `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or
`ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that
always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is
a list of internal qualifiers (one or both of `ANCIENT` and `FORMER`), just as for `former_qualifiers`, and the
implementation is the same.
* `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the
placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which
category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`,
`neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like
`Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like
`Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized
city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers
iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or
more categories are returned; see below for more specifics. (Note that countries for which e.g. a `district` is a
political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but
by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the
resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The
return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the
holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype
with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields:
** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an
entry in `placetype_data` but may not be the original placetype given by the user);
** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed;
** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME:
we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms);
** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]];
** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or
{{tl|demonym-noun}}, instead of being triggered by {{tl|place}}.
* `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the
`district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `;
see the section just above on `cat_handler`.
5. The following preposition-related property keys are recognized:
* `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"` (Thai for "in").
* `generic_before_non_cities`: If specified, the appropriate category description handler in
[[Module:category tree/topic cat/data/Places]] will recognize categories of the form
` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This
is used to generate descriptions for categories added by category handlers and by explicit category specs in the
placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify
a value for `class` so that the category tree code can determine whether it's a political or non-political division.
* `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities.
6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype:
* `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values
are:
*# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly
follows an entry placetype);
*# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple);
*# `"suf"` (the holonym will display as `Holonym placetype`);
*# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized).
* `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym.
Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly
using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype
`administrative region` specifies `suffix = "ภูมิภาค"` (Thai for "region").
* `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym.
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the
holonym. If `affix` is given together with `suffix` or `prefix` for a single placetype, `suffix` or `prefix` takes
precedence.
* `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix
requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies
`affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies
`no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified,
without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word.
* `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym).
Its return value is a string specifying the display form of the holonym.
7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms
of the specified placetype.
* `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype.
* `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry
placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article
`"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins
with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent
placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype
specified.
* `holonym_use_the`: Use `"the"` before holonyms of this placetype.
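As an illustration of the `former_type` lookup described under item 4 above, here is a minimal sketch. The tables
`former_qualifiers` and `sample_data` below are hypothetical excerpts containing only the entries mentioned in the
examples, and `former_category` is a made-up helper name, not the function used by the module. The sketch ignores
fallbacks, the `!` value and tracking.

```lua
-- Hypothetical excerpts for illustration; the real tables are far larger.
local former_qualifiers = {
	["medieval"] = {"ANCIENT", "FORMER"},
}
local sample_data = {
	["village"] = {class = "settlement"},
	["kingdom"] = {class = "polity"},
	["ANCIENT settlement"] = {default = {"Ancient settlements"}},
	["FORMER polity"] = {default = {"Former polities"}},
}

-- Return the category spec for `qualifier placetype`, or nil if none applies.
local function former_category(qualifier, placetype)
	local internal = former_qualifiers[qualifier]
	local entry = sample_data[placetype]
	if not internal or not entry then
		return nil -- e.g. `medieval greenhouse`: no entry, so nothing is categorized
	end
	local former_type = entry.former_type or entry.class
	if not former_type then
		error("Internal error: no `former_type` or `class` for '" .. placetype .. "'")
	end
	-- Try `ANCIENT xxx` before `FORMER xxx`, in the order given by the mapping.
	for _, internal_qualifier in ipairs(internal) do
		local equiv = sample_data[internal_qualifier .. " " .. former_type]
		if equiv and equiv.default then
			return equiv.default[1]
		end
	end
	return nil
end
```

With these tables, `former_category("medieval", "village")` finds `ANCIENT settlement` on the first try, while
`former_category("medieval", "kingdom")` misses `ANCIENT polity` and falls through to `FORMER polity`.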
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which
must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a
fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the
fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either
directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this
placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back
to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in
`placetype_data` or an internal error occurs.
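The multi-level fallback behavior in the notes above can be sketched as follows. This is only an illustration under
stated assumptions: `sample_data` is a hypothetical miniature of `placetype_data`, the two `loop` entries exist only to
show loop detection, and `resolve_property` is a made-up helper name rather than a function of this module.

```lua
-- Hypothetical miniature of `placetype_data`; only `fallback` and `class` matter here.
local sample_data = {
	["frazione"] = {fallback = "hamlet"},
	["hamlet"] = {fallback = "village"},
	["village"] = {class = "settlement"},
	["loop a"] = {fallback = "loop b"},
	["loop b"] = {fallback = "loop a"},
}

-- Follow `fallback` links from `placetype` until an entry defining `prop` is found.
-- Returns the value and the placetype that supplied it, or nil if the chain ends.
-- Raises an error on a missing fallback target or on a fallback loop.
local function resolve_property(data, placetype, prop)
	local seen = {}
	local current = placetype
	while current do
		if seen[current] then
			error("Internal error: fallback loop involving '" .. current .. "'")
		end
		seen[current] = true
		local entry = data[current]
		if not entry then
			error("Internal error: placetype '" .. current .. "' not found")
		end
		if entry[prop] ~= nil then
			return entry[prop], current
		end
		current = entry.fallback
	end
	return nil
end
```

Here `resolve_property(sample_data, "frazione", "class")` walks `frazione`, `hamlet` and `village` in turn and takes
the `class` from `village`.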
]==]
export.placetype_data = {
--[=[
If you need to sort the following, do this (using Vim):
1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line
entries.
2. Make sure the table uses tabs everywhere for indent, and not spaces.
3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
:'a,.s/\n/\\n/g
:s/\\n\(\t\[\)/\r\1/g
The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while
the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to
a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one
command.)
4. Execute the following to sort:
:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station"
before "hill" (and likewise any multiword key before the shorter key it begins with) because the space after "hill"
in "hill station" sorts before the quotation mark after "hill". The above command deals with this by extracting the
key, prepending it followed by ` @@@ `, sorting, and then removing the prepended key (the classic
decorate-sort-undecorate pattern).
5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
:'a,.s/\\n/\r/g
Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but
to insert a newline on the right side of a replacement you must use \r.
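
For readers without Vim or Perl at hand, the decorate-sort-undecorate step from item 4 can be sketched in Lua as
below. This is only an illustration, not part of the module: `sort_entries` is a made-up helper name, and it assumes
each entry has already been joined onto a single line beginning with `["key"]`.

```lua
-- Sort single-line table entries by their quoted key, ignoring ASCII case.
local function sort_entries(entries)
	local decorated = {}
	for i, line in ipairs(entries) do
		-- Decorate: extract the key between the first pair of double quotes.
		local key = line:match('^%s*%["(.-)"') or line
		decorated[i] = {key = key:lower(), line = line}
	end
	-- Sort on the extracted key alone, so "hill" precedes "hill station".
	table.sort(decorated, function(a, b) return a.key < b.key end)
	-- Undecorate: keep only the original lines.
	local result = {}
	for i, item in ipairs(decorated) do
		result[i] = item.line
	end
	return result
end
```

Comparing bare keys avoids the problem described above, because the closing quotation mark is no longer part of
what is compared.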
]=]
["*"] = {
link = false,
cat_handler = generic_place_cat_handler,
},
["administrative atoll"] = {
-- Maldives
link = "+w:administrative divisions of the Maldives",
preposition = "ของ",
class = "subpolity",
},
["administrative capital"] = {
link = "w",
fallback = "capital city",
},
["administrative center"] = {
link = "w",
fallback = "non-city capital",
},
["administrative centre"] = {
link = "w",
fallback = "administrative center",
},
["administrative county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["administrative district"] = {
link = "w",
fallback = "อำเภอ",
},
["administrative headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["administrative region"] = {
link = true,
preposition = "ของ",
suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
fallback = "ภูมิภาค",
class = "subpolity",
},
["administrative seat"] = {
link = "w",
fallback = "administrative centre",
},
["administrative territory"] = {
link = "separately",
preposition = "ของ",
suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
fallback = "ดินแดน",
class = "subpolity",
},
["administrative unit"] = {
-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
link = "w",
class = "subpolity",
},
["administrative village"] = {
link = "w",
preposition = "ของ",
has_neighborhoods = true,
class = "settlement",
},
["aimag"] = {
-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
link = "w",
fallback = "prefecture",
},
["airport"] = {
link = true,
class = "man-made structure",
default = {true},
},
["alliance"] = {
link = true,
fallback = "confederation",
},
["archipelago"] = {
link = true,
fallback = "เกาะ",
},
["area"] = {
link = true,
preposition = "ของ",
fallback = "geographic and cultural area",
-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
class = "subpolity",
former_type = "geographic region",
cat_handler = district_neighborhood_cat_handler,
},
["arm"] = {
link = true,
preposition = "ของ",
class = "natural feature",
default = {"ทะเล"},
},
["arrondissement"] = {
link = true,
preposition = "ของ",
-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity",
has_neighborhoods = true,
},
["associated province"] = {
link = "separately",
fallback = "จังหวัด",
},
["atoll"] = {
-- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to
-- conditionalize `class` on the country. See also `administrative atoll`.
link = true,
class = "natural feature",
bare_category_parent = "เกาะ",
default = {true},
},
["autonomous city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
has_neighborhoods = true,
},
["autonomous community"] = {
-- Spain; refers to regional entities, not village-like entities, as might be expected from "community"
link = true,
preposition = "ของ",
class = "subpolity",
},
["autonomous island"] = {
-- Comoros; seems like an administrative atoll of the Maldives.
link = "+w:autonomous islands of Comoros",
preposition = "ของ",
class = "subpolity",
},
["autonomous oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "oblast",
class = "subpolity",
},
["autonomous okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "okrug",
class = "subpolity",
},
["autonomous prefecture"] = {
link = true,
fallback = "prefecture",
},
["autonomous province"] = {
link = "w",
fallback = "จังหวัด",
},
["autonomous region"] = {
link = "w",
preposition = "ของ",
fallback = "administrative region",
-- "administrative region" sets a suffix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region"
-- if the user writes 'ar:Suf/Tibet'.
affix = "autonomous region",
},
["autonomous republic"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territorial unit"] = {
-- Moldova; only two of them, one for Gagauzia and one for Transnistria.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territory"] = {
link = "w",
fallback = "dependent territory",
},
["bailiwick"] = {
-- Jersey, etc.
link = true,
fallback = "องค์การทางการเมือง",
},
["barangay"] = {
-- Philippines
link = true,
class = "settlement",
-- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use
-- some of the properties of a neighborhood.
fallback = "neighborhood",
},
["barrio"] = {
-- Spanish-speaking countries; Philippines
link = true,
-- FIXME: Not completely correct; in some countries barrios are formal administrative divisions of a city.
-- `class` will need to conditionalize on the country to be completely correct.
fallback = "neighborhood",
},
["basin"] = {
link = true,
fallback = "ทะเลสาบ",
},
["bay"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["beach"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"water"},
default = {true},
},
["beach resort"] = {
link = "w",
fallback = "resort town",
},
["bishopric"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["bodies of water!"] = {
-- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to
-- straighten out the type vs. name vs. related-to issue.
category_link = "[[body of water|bodies of water]]",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"},
},
["borough"] = {
link = true,
preposition = "ของ",
display_handler = borough_display_handler,
has_neighborhoods = true,
-- "former borough" could be a former settlement or a former part of a city but seems more likely to
-- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this
-- properly.
class = "subpolity",
-- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger.
},
["borough seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["branch"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["bridge"] = {
link = true,
class = "man-made structure",
default = {"Named bridges"},
},
["building"] = {
link = true,
class = "man-made structure",
default = {"Named buildings"},
},
["built-up area"] = {
link = "w",
fallback = "area",
},
["burgh"] = {
link = true,
fallback = "borough",
},
["business park"] = {
link = true,
fallback = "park",
},
["caliphate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["canton"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["cape"] = {
link = true,
fallback = "headland",
},
["capital"] = {
link = true,
fallback = "capital city",
},
["capital city"] = {
link = true,
category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
bare_category_parent = "นคร",
cat_handler = capital_city_cat_handler,
default = {true},
-- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}}
-- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't
-- match against the placetype 'city' of Melbourne.
fallback = "นคร",
},
["caplc"] = {
link = "[[capital]] and [[large]]st [[city]]",
plural_link = false,
fallback = "capital city",
},
["captaincy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["caravan city"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"ANCIENT", "FORMER"},
},
["castle"] = {
link = true,
fallback = "building",
},
["cathedral city"] = {
link = true,
fallback = "นคร",
},
["cattle station"] = {
-- Australia
link = true,
fallback = "farm",
},
["census area"] = {
link = true,
affix_type = "Suf",
has_neighborhoods = true,
class = "non-admin settlement",
},
["census-designated place"] = {
-- United States
link = true,
class = "non-admin settlement",
},
["census division"] = {
-- Canada
link = "w",
preposition = "ของ",
class = "subpolity",
},
["census town"] = {
link = "w",
fallback = "เมือง",
},
["central business district"] = {
link = true,
fallback = "neighborhood",
},
["cercle"] = {
-- Mali
link = "+w:cercles of Mali",
preposition = "ของ",
class = "subpolity",
},
["ceremonial county"] = {
link = true,
fallback = "เทศมณฑล",
},
["chain of islands"] = {
link = "[[chain]] of [[island]]s",
plural = "chains of islands",
plural_link = "[[chain]]s of [[island]]s",
fallback = "เกาะ",
},
["channel"] = {
link = true,
fallback = "strait",
},
["charter community"] = {
-- Northwest Territories, Canada
link = "w",
fallback = "village",
},
["นคร"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "Cities in +++", "Countries in +++", "National capitals"},
default = {"City-states", "นคร", "ประเทศ", "National capitals"},
},
["civil parish"] = {
-- Mostly England; similar to municipalities
link = true,
preposition = "ของ",
affix_type = "suf",
has_neighborhoods = true,
class = "subpolity",
},
["claimed political division"] = {
link = "[[claim]]ed [[political]] [[division]]",
class = "subpolity",
default = {true},
},
["co-capital"] = {
link = "[[co-]][[capital]]",
fallback = "capital city",
},
["coal city"] = {
link = "+w:coal town",
fallback = "นคร",
},
["coal town"] = {
link = "w",
fallback = "เมือง",
},
["collectivity"] = {
link = "w",
preposition = "ของ",
-- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities)
class = "subpolity",
},
["colony"] = {
link = true,
fallback = "dependent territory",
},
["comarca"] = {
-- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of
-- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it
-- sits between municipalities and provinces, something like a county or district.
link = true,
preposition = "ของ",
class = "subpolity",
},
["commandery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["commonwealth"] = {
link = true,
preposition = "ของ",
-- No default; applies specifically to Puerto Rico
class = "subpolity",
},
["commune"] = {
link = true,
fallback = "เทศบาล",
},
["community"] = {
link = true,
category_link = "[[community|communities]] of all sizes",
fallback = "village",
},
["community development block"] = {
-- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be
-- neighborhoods so I'm not setting `has_neighborhoods` for now
link = "w",
affix_type = "suf",
no_affix_strings = "block",
class = "subpolity",
},
["comune"] = {
-- Italy, Switzerland
link = true,
fallback = "เทศบาล",
},
["condominium"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["confederacy"] = {
link = true,
fallback = "confederation",
},
["confederation"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["constituency"] = {
-- currently we have them as political divisions of Namibia but many countries have them
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent country"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent part"] = {
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["constituent republic"] = {
-- Of Russia, Yugoslavia, etc.
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["counties and county-level cities!"] = {
-- This is used when grouping counties and county-level cities under prefecture-level cities in China.
category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
class = "subpolity",
},
["continent"] = {
link = true,
category_link = false, -- can't occur as a bare category
class = "natural feature",
default = {"Continents and continental regions"},
},
["continental region"] = {
link = "separately",
category_link = false, -- can't occur as a bare category
class = "geographic region",
fallback = "continent",
},
["continents and continental regions!"] = {
category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
class = "geographic region",
},
["council area"] = {
link = true,
-- in Scotland; similar to a county
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["ประเทศ"] = {
link = true,
class = "polity", -- do not translate the value of `class`
["continent/*"] = {true, "ประเทศ"},
default = {true},
},
["country-like entities!"] = {
category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
class = "polity", -- do not translate the value of `class`
},
["เทศมณฑล"] = {
link = true,
preposition = "ของ",
display_handler = county_display_handler,
class = "subpolity",
},
["county borough"] = {
link = true,
-- in Wales; similar to a county
preposition = "ของ",
affix_type = "suf",
fallback = "borough",
class = "subpolity",
},
["county seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["county town"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
fallback = "เมือง",
has_neighborhoods = true,
class = "capital",
},
["county-administered city"] = {
-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w",
fallback = "นคร",
has_neighborhoods = true,
class = "settlement",
},
["county-controlled city"] = {
-- Taiwan
link = "w",
fallback = "county-administered city",
},
["county-level city"] = {
-- PR China
link = "w",
fallback = "prefecture-level city",
},
["crater lake"] = {
link = true,
fallback = "ทะเลสาบ",
},
["creek"] = {
link = true,
fallback = "stream",
},
["Crown colony"] = {
link = "+crown colony",
fallback = "crown colony",
},
["crown colony"] = {
link = true,
fallback = "colony",
},
["Crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["cultural area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["cultural region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["delegation"] = {
-- Tunisia
link = "+w:delegations of Tunisia",
preposition = "ของ",
class = "subpolity",
},
["department"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["departmental capital"] = {
link = "separately",
fallback = "capital city",
},
["dependency"] = {
link = true,
fallback = "dependent territory",
},
["dependent territory"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
former_type = "dependent territory",
bare_category_parent = "political divisions",
["country/*"] = {true},
default = {true},
},
["desert"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems"},
default = {true},
},
["deserted mediaeval village"] = {
link = "w",
fallback = "deserted medieval village",
},
["deserted medieval village"] = {
link = "w",
fallback = "ANCIENT settlement",
},
["direct-administered municipality"] = {
-- China
link = "+w:direct-administered municipalities of China",
fallback = "เทศบาล",
},
["direct-controlled municipality"] = {
-- several countries
link = "w",
fallback = "เทศบาล",
},
["distributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["อำเภอ"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to
-- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class
-- is "settlement" or "subpolity".
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
-- No default. Countries for which districts are political divisions will get entries.
},
["districts and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Portugal.
category_link = "[[district]]s and [[autonomous region]]s",
class = "subpolity",
},
["districts and autonomous territorial units!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Moldova.
category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s",
class = "subpolity",
},
["district capital"] = {
link = "separately",
fallback = "capital city",
},
["district headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["district municipality"] = {
-- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in
-- South Africa, district municipalities group local municipalities and hence won't have neighborhoods.
link = "w",
preposition = "ของ",
affix_type = "suf",
no_affix_strings = {"อำเภอ", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["division"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["division capital"] = {
link = "separately",
fallback = "capital city",
},
["dome"] = {
link = true,
fallback = "ภูเขา",
},
["dormant volcano"] = {
link = true,
fallback = "volcano",
},
["duchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["emirate"] = {
link = true,
preposition = "ของ",
-- FIXME: Can be subpolities (of the United Arab Emirates).
fallback = "องค์การทางการเมือง",
},
["จักรวรรดิ"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["enclave"] = {
link = true,
preposition = "ของ",
-- Enclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["entity"] = {
-- Bosnia and Herzegovina
link = "+w:entities of Bosnia and Herzegovina",
preposition = "ของ",
class = "subpolity",
},
["escarpment"] = {
link = true,
fallback = "ภูเขา",
},
["ethnographic region"] = {
-- used in Lithuania
link = "+w:ethnographic regions of Lithuania",
fallback = "geographic and cultural area",
},
["exclave"] = {
link = true,
preposition = "ของ",
-- Exclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["external territory"] = {
link = "separately",
fallback = "dependent territory",
},
["farm"] = {
link = true,
class = "non-admin settlement",
default = {"Farms and ranches"},
},
["farms and ranches!"] = {
category_link = "[[farm]]s and [[ranch]]es",
class = "non-admin settlement",
},
["federal city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
},
["federal district"] = {
link = true,
preposition = "ของ",
-- Might have neighborhoods as federal districts are often cities (e.g. Mexico City)
has_neighborhoods = true,
class = "settlement",
},
["federal subject"] = {
-- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais,
-- autonomous okrugs and autonomous oblasts).
link = "w",
preposition = "ของ",
class = "subpolity",
},
["federal territory"] = {
link = "w",
fallback = "ดินแดน",
},
["fictional location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["First Nations reserve"] = {
-- Canada
link = "[[First Nations]] [[w:Indian reserve|reserve]]",
-- Wikipedia uses "Indian reserve"; presumably that is the legal term
fallback = "Indian reserve",
class = "subpolity",
},
["fjord"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["footpath"] = {
link = true,
fallback = "road",
},
["forest"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems", "forestry"},
default = {true},
},
["fort"] = {
link = true,
fallback = "building",
},
["fortress"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- fortresses -> fortresse, so put an entry here to ensure we singularize correctly.
plural = "fortresses",
fallback = "building",
},
["frazione"] = {
link = "w",
fallback = "hamlet",
},
["freeway"] = {
link = true,
fallback = "road",
},
["French prefecture"] = {
link = "[[w:prefectures in France|prefecture]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["geographic and cultural area"] = {
link = "+w:cultural area",
-- `generic_before_non_cities` is used when generating the category description of categories of the format
-- `Geographic and cultural areas of PLACE`. `preposition` is used when generating {{place}} description and
-- categories for any placetype that falls back to `geographic and cultural area`.
generic_before_non_cities = "ของ",
preposition = "ของ",
class = "geographic region",
bare_category_parent = "สถานที่",
["country/*"] = {true},
["constituent country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["geographic area"] = {
link = "+w:geographic region",
fallback = "geographic and cultural area",
},
["geographic region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geopolitical zone"] = {
-- Nigeria
link = true,
preposition = "ของ",
class = "subpolity",
},
["gewog"] = {
-- Bhutan
link = true,
preposition = "ของ",
class = "subpolity",
},
["ghost town"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
bare_category_parent = "former settlements",
cat_handler = city_type_cat_handler,
default = {true},
},
["glen"] = {
link = true,
fallback = "valley",
},
["governorate"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["greater administrative region"] = {
-- China (former division)
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["gromada"] = {
-- Poland (former division)
link = "w",
preposition = "ของ",
affix_type = "Pref",
class = "subpolity",
inherently_former = {"FORMER"},
},
["group of islands"] = {
link = "[[group]] of [[island]]s",
plural = "groups of islands",
plural_link = "[[group]]s of [[island]]s",
fallback = "island group",
},
["gulf"] = {
link = true,
preposition = "ของ",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["hamlet"] = {
link = true,
fallback = "village",
},
["harbor city"] = {
link = "separately",
fallback = "นคร",
},
["harbor town"] = {
link = "separately",
fallback = "เมือง",
},
["harbour city"] = {
link = "separately",
fallback = "นคร",
},
["harbour town"] = {
link = "separately",
fallback = "เมือง",
},
["headland"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["headquarters"] = {
link = "w",
fallback = "administrative centre",
},
["heath"] = {
link = true,
fallback = "moor",
},
["hemisphere"] = {
link = true,
entry_placetype_use_the = true,
fallback = "continental region",
},
["highway"] = {
link = true,
fallback = "road",
},
["hill"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["hill station"] = {
link = "w",
fallback = "เมือง",
},
["hill town"] = {
link = "w",
fallback = "เมือง",
},
["historic region"] = {
-- provided only for the link
link = "+w:historical region",
fallback = "FORMER geographic region",
},
["historical county"] = {
-- needed for historical counties of England/etc.
link = "+w:historic county",
fallback = "FORMER subpolity",
},
["historical region"] = {
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["home rule city"] = {
link = "w",
fallback = "นคร",
},
["home rule municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["hot spring"] = {
link = true,
fallback = "spring",
},
["house"] = {
link = true,
fallback = "building",
},
["housing estate"] = {
-- not the same as a housing project (i.e. public housing)
link = true,
-- not exactly the case but approximately
fallback = "neighborhood",
},
["hromada"] = {
-- Ukraine
link = "w",
disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'",
disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["inactive volcano"] = {
link = "w",
fallback = "dormant volcano",
},
["independent city"] = {
link = true,
fallback = "นคร",
},
["independent town"] = {
link = "+independent city",
fallback = "เมือง",
},
["Indian reservation"] = {
link = "w",
-- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations
-- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts
-- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is,
-- so this must still be the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["Indian reserve"] = {
link = "w",
-- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that
-- is still the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["inland sea"] = {
-- note, we also have 'inland' as a qualifier
link = true,
fallback = "ทะเล",
},
["inner city area"] = {
link = "[[inner city]] [[area]]",
fallback = "neighborhood",
},
["เกาะ"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["island country"] = {
-- FIXME: The following should map to both 'island' and 'country'.
link = "w",
fallback = "ประเทศ",
},
["island group"] = {
link = "separately",
fallback = "เกาะ",
},
["island municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["islet"] = {
link = "w",
fallback = "เกาะ",
},
["Israeli settlement"] = {
link = "w",
class = "settlement",
default = {true},
},
["judicial capital"] = {
link = "w",
fallback = "capital city",
},
["khanate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["kibbutz"] = {
link = true,
plural = "kibbutzim",
class = "non-admin settlement",
default = {true},
},
["kingdom"] = {
link = true,
fallback = "monarchy",
},
["krai"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ทะเลสาบ"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["ธรณีสัณฐาน!"] = {
category_link = "[[ธรณีสัณฐาน]]",
bare_category_parent = "สถานที่",
addl_bare_category_parents = {"โลก"},
},
["largest city"] = {
link = "[[large]]st [[city]]",
entry_placetype_use_the = true,
fallback = "นคร",
has_neighborhoods = true,
},
["league"] = {
link = true,
fallback = "confederation",
},
["legislative capital"] = {
link = "separately",
fallback = "capital city",
},
["library"] = {
link = true,
fallback = "building",
},
["lieutenancy area"] = {
-- used in the United Kingdom; per Wikipedia:
-- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does
-- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of
-- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate
-- entities. In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on
-- those used for lieutenancy and local government between 1974 and 1996. The lieutenancy areas of Northern
-- Ireland correspond to the six counties and two former county boroughs.
link = "w",
fallback = "ceremonial county",
},
["local authority district"] = {
link = "w",
fallback = "local government district",
},
["local government area"] = {
-- Australia
link = "w",
preposition = "ของ",
class = "subpolity",
},
["local council"] = {
-- Malta; similar to municipalities
link = "+w:local councils of Malta",
preposition = "ของ",
fallback = "เทศบาล",
},
["local government district"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local government district with borough status"] = {
link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]",
plural = "local government districts with borough status",
plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local urban district"] = {
link = "w",
fallback = "unincorporated community",
},
["locality"] = {
link = "+w:locality (settlement)",
-- not necessarily true, but usually is the case
fallback = "village",
},
["London borough"] = {
link = "w",
preposition = "ของ",
affix_type = "pref",
affix = "borough",
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["macroregion"] = {
link = true,
fallback = "ภูมิภาค",
},
["man-made structures!"] = {
category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s",
bare_category_parent = "สถานที่",
},
["manor"] = {
-- FIXME: or is this more like a farm?
link = true,
fallback = "building",
},
["marginal sea"] = {
link = true,
preposition = "ของ",
fallback = "ทะเล",
},
["market city"] = {
link = "+market town",
fallback = "นคร",
},
["market town"] = {
link = true,
fallback = "เมือง",
},
["massif"] = {
link = true,
fallback = "ภูเขา",
},
["megacity"] = {
link = true,
fallback = "นคร",
},
["metro station"] = {
link = true,
class = "man-made structure",
},
["metropolitan borough"] = {
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"borough", "นคร"},
fallback = "local government district",
has_neighborhoods = true,
},
["metropolitan city"] = {
-- These exist e.g. in Italy and are more like municipalities or even provinces than cities.
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"metropolitan", "นคร"},
class = "subpolity",
},
["metropolitan county"] = {
link = true,
fallback = "เทศมณฑล",
},
["metropolitan municipality"] = {
-- In South Africa, metropolitan municipalities group local municipalities and are like districts, between
-- provinces and municipalities.
-- In Turkey, metropolitan municipalities are province-level.
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"metropolitan", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["microdistrict"] = {
-- residential complex in post-Soviet states
link = true,
fallback = "neighborhood",
},
["micronations!"] = {
-- FIXME, merge with microstate
category_link = "[[micronation]]s",
bare_category_parent = "ประเทศ",
},
["microstate"] = {
link = true,
fallback = "ประเทศ",
},
["military base"] = {
link = "w",
class = "settlement", -- or "man-made structure"?
default = {true},
},
["minster town"] = {
-- England
link = "separately",
fallback = "เมือง",
},
["monarchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["moor"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"},
default = {true},
},
["moorland"] = {
link = true,
fallback = "moor",
},
["motorway"] = {
link = true,
fallback = "road",
},
["ภูเขา"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["mountain indigenous district"] = {
-- Taiwan
link = "+w:district (Taiwan)",
fallback = "อำเภอ",
},
["mountain indigenous township"] = {
-- Taiwan
link = "+w:township (Taiwan)",
fallback = "township",
},
["mountain pass"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "mountain passes",
class = "natural feature",
addl_bare_category_parents = {"ภูเขา"},
default = {true},
},
["เทือกเขา"] = {
link = true,
fallback = "ภูเขา",
},
["mountainous region"] = {
link = "separately",
fallback = "ภูมิภาค",
},
["mukim"] = {
-- Malaysia, Brunei, Indonesia, Singapore
link = true,
preposition = "ของ",
class = "subpolity",
},
["municipal district"] = {
link = "w",
-- meaning varies depending on the country; for now, assume no neighborhoods.
-- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms.
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "เทศบาล",
},
["เทศบาล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true,
class = "subpolity",
},
["municipality with city status"] = {
link = "[[municipality]] with [[w:city status|city status]]",
plural = "municipalities with city status",
plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]",
fallback = "เทศบาล",
},
["museum"] = {
link = true,
fallback = "building",
},
["mythological location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["named bridges!"] = {
category_link = "notable [[bridge]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"bridges"},
},
["named buildings!"] = {
category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"buildings"},
},
["named roads!"] = {
category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"roads"},
},
["national capital"] = {
link = "w",
fallback = "capital city",
},
["national park"] = {
link = true,
fallback = "park",
},
["natural features!"] = {
category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s",
bare_category_parent = "สถานที่",
},
["neighborhood"] = {
-- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which
-- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the
-- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the
-- category name. It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods".
-- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also
-- categorize as neighbo(u)rhoods.)
link = true,
-- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]].
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
-- The following text is suitable for the top-level description of a neighborhood as well as categories of the
-- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form
-- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]".
category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions",
-- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`,
-- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings
-- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in
-- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY`
-- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but
-- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.)
preposition = "ของ",
class = "non-admin settlement",
cat_handler = district_neighborhood_cat_handler,
},
["neighbourhood"] = {
link = true,
category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions",
fallback = "neighborhood",
},
["new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
preposition = "ใน",
class = "subpolity", --?
},
["new town"] = {
link = true,
fallback = "เมือง",
},
["non-city capital"] = {
link = "[[capital]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
cat_handler = function(data)
return capital_city_cat_handler(data, "non-city")
end,
-- FIXME, do we need the following?
default = {true},
},
["non-metropolitan county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["non-metropolitan district"] = {
link = "w",
fallback = "local government district",
},
["non-sovereign kingdom"] = {
-- especially in Africa and Asia
link = "+w:non-sovereign monarchy",
generic_before_non_cities = "ใน",
class = "subpolity",
["country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["non-sovereign monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["oblasts and autonomous republics!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Ukraine.
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s",
class = "subpolity",
},
["มหาสมุทร"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"ทะเล", "bodies of water"},
default = {true},
},
["okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["overseas collectivity"] = {
link = "w",
fallback = "collectivity",
},
["overseas department"] = {
link = "w",
fallback = "department",
},
["overseas territory"] = {
link = "w",
fallback = "dependent territory",
},
["parish"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["parish municipality"] = {
-- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them.
link = "+w:parish municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true,
},
["parish seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
class = "capital",
has_neighborhoods = true,
},
["park"] = {
link = true,
class = "man-made structure",
default = {true},
},
["pass"] = {
link = "+mountain pass",
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "passes",
fallback = "mountain pass",
},
["path"] = {
link = true,
fallback = "road",
},
["peak"] = {
link = true,
fallback = "ภูเขา",
},
["peninsula"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["periphery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["สถานที่!"] = {
generic_before_non_cities = "ใน",
generic_before_cities = "ใน",
class = "generic place",
category_link = "[[place]]s of all sorts",
-- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and
-- language-specific variants such as [[Category:en:Places]]. The actual text for a language-specific variant is
-- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of"
-- portion is automatically generated by the appropriate handler in
-- [[Module:category tree/topic cat/data/Places]].
category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s",
bare_category_parent = "ชื่อ (หัวข้อ)",
},
["planned community"] = {
-- Include this so we don't categorize 'planned community' into villages, as 'community' does.
link = true,
class = "settlement",
has_neighborhoods = true,
},
["plateau"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
-- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category
},
["Polish colony"] = {
link = "[[w:colony (Poland)|colony]]",
affix_type = "suf",
affix = "colony",
fallback = "village",
has_neighborhoods = true,
},
["political divisions!"] = {
category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s",
bare_category_parent = "สถานที่",
},
["องค์การทางการเมือง"] = {
link = true,
category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]",
class = "polity", -- do not translate the `class` value
bare_category_parent = "สถานที่",
default = {true},
},
["populated place"] = {
link = "+w:populated place",
-- not necessarily true, but usually is the case
fallback = "village",
},
["port"] = {
link = true,
class = "man-made structure",
default = {true},
},
["port city"] = {
-- FIXME: should categorize into "Ports" as well as "นคร"
link = true,
fallback = "นคร",
},
["port town"] = {
-- FIXME: should categorize into "Ports" as well as "เมือง"
link = "w",
fallback = "เมือง",
},
["prefecture"] = {
-- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France.
-- May need `has_neighborhoods` to be a function.
link = true,
preposition = "ของ",
display_handler = prefecture_display_handler,
class = "subpolity",
},
["prefecture-level city"] = {
-- China; they are huge entities with a central city; not cities themselves.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["preserved county"] = {
-- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more
-- "principal areas" (styled as "counties" or "county boroughs"), of which there are 22.
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["primary area"] = {
-- a grouping of "districts" (neighborhoods) in Gothenburg, Sweden
link = "+w:sv:primärområde",
fallback = "neighborhood",
},
["principality"] = {
link = true,
fallback = "monarchy",
},
["promontory"] = {
link = true,
fallback = "headland",
},
["protectorate"] = {
link = true,
fallback = "dependent territory",
},
["จังหวัด"] = {
link = true,
preposition = "ของ",
display_handler = province_display_handler,
class = "subpolity",
},
["provinces and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case China.
category_link = "[[province]]s and [[autonomous region]]s",
class = "subpolity",
},
["provinces and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Canada and Pakistan.
category_link = "[[province]]s and [[territory|territories]]",
class = "subpolity",
},
["provincial capital"] = {
link = true,
fallback = "capital city",
},
["raion"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ranch"] = {
link = true,
fallback = "farm",
},
["range"] = {
-- FIXME: Where is this used? Is it a mountain range?
link = true,
holonym_use_the = true,
class = "natural feature",
},
["regency"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["ภูมิภาค"] = {
link = true,
preposition = "ของ",
-- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area'
fallback = "geographic and cultural area",
-- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region
class = "geographic region",
},
["regional capital"] = {
link = "separately",
fallback = "capital city",
},
["regional county municipality"] = {
-- Quebec
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"เทศบาล", "เทศมณฑล"},
fallback = "เทศบาล",
},
["regional district"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "อำเภอ",
},
["regional municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
},
["regional unit"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["registration county"] = {
-- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical
-- purposes (registration of births, deaths and marriages, and for the output of census information).
link = "w",
fallback = "เทศมณฑล",
},
["republic"] = {
-- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case.
link = true,
fallback = "constituent republic",
},
["research base"] = {
link = "+w:research station",
fallback = "research station",
},
["research station"] = {
link = "w",
class = "non-admin settlement", -- or "man-made structure"?
default = {true},
},
["reservoir"] = {
link = true,
fallback = "ทะเลสาบ",
},
["residential area"] = {
link = "separately",
fallback = "neighborhood",
},
["resort city"] = {
link = "w",
fallback = "นคร",
},
["resort town"] = {
link = "w",
fallback = "เมือง",
},
["แม่น้ำ"] = {
link = true,
generic_before_non_cities = "ใน",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
cat_handler = city_type_cat_handler,
["continent/*"] = {true},
default = {true},
},
["river island"] = {
link = "w",
fallback = "เกาะ",
},
["road"] = {
link = true,
class = "man-made structure",
default = {"Named roads"},
},
["Roman province"] = {
-- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire'
link = "w",
default = {"Provinces of the Roman Empire"},
class = "subpolity",
},
["royal borough"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"royal", "borough"},
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["royal burgh"] = {
link = true,
fallback = "borough",
},
["royal capital"] = {
link = "w",
fallback = "capital city",
},
["rural committee"] = {
-- Hong Kong; a group of villages
link = "w",
affix_type = "Suf",
has_neighborhoods = true,
class = "settlement",
},
["rural community"] = {
-- New Brunswick
link = "+w:list of municipalities in New_Brunswick#Rural communities",
fallback = "เทศบาล",
},
["rural hromada"] = {
link = "[[rural]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["rural municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["rural township"] = {
-- Taiwan
link = "+w:rural township (Taiwan)",
fallback = "township",
},
["sanctuary"] = {
link = true,
fallback = "temple",
},
["satrapy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["ทะเล"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["seaport"] = {
link = true,
fallback = "port",
},
["seat"] = {
link = true,
fallback = "administrative centre",
},
["self-administered area"] = {
-- Myanmar (groups self-administered divisions and zones)
link = "+w:self-administered zone",
preposition = "ของ",
class = "subpolity",
},
["self-administered division"] = {
-- Myanmar (only one of them: Wa Self-Administered Division)
link = "w",
fallback = "self-administered area",
},
["self-administered zone"] = {
-- Myanmar (five of them)
link = "w",
fallback = "self-administered area",
},
["separatist state"] = {
link = "separately",
fallback = "unrecognized country",
},
["การตั้งถิ่นฐาน"] = {
link = true,
category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s",
bare_category_parent = "สถานที่",
-- not necessarily true, but usually is the case
fallback = "village",
},
["settlement hromada"] = {
link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["sheading"] = {
-- Isle of Man
link = true,
fallback = "อำเภอ",
},
["sheep station"] = {
-- Australia
link = true,
fallback = "farm",
},
["shire"] = {
link = true,
fallback = "เทศมณฑล",
},
["shire county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["shire town"] = {
link = true,
fallback = "county seat",
},
["ski resort city"] = {
link = "[[ski resort]] [[city]]",
fallback = "นคร",
},
["ski resort town"] = {
link = "[[ski resort]] [[town]]",
fallback = "เมือง",
},
["spa city"] = {
link = "+w:spa town",
fallback = "นคร",
},
["spa town"] = {
link = "w",
fallback = "เมือง",
},
["space station"] = {
link = true,
fallback = "research station",
},
["special administrative region"] = {
-- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a
-- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia
-- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special
-- administrative areas"
link = "+w:special administrative regions of China",
preposition = "ของ",
class = "subpolity",
has_neighborhoods = true, --?
-- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves
-- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing
suffix = "",
},
["special collectivity"] = {
link = "w",
fallback = "collectivity",
},
["special municipality"] = {
-- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands
link = "w",
fallback = "เทศบาล",
},
["special ward"] = {
-- Tokyo
link = true,
fallback = "เทศบาล",
},
["spit"] = {
link = true,
fallback = "peninsula",
},
["spring"] = {
link = true,
class = "natural feature",
default = {true},
},
["star"] = {
link = true,
class = "natural feature",
default = {true},
},
["รัฐ"] = {
link = true,
preposition = "ของ",
class = "subpolity",
-- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign
-- entity. The latter appears more common (e.g. in various "ancient states" of East Asia).
former_type = "องค์การทางการเมือง",
},
["states and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Australia.
category_link = "[[state]]s and [[territory|territories]]",
class = "subpolity",
},
["states and union territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case India.
category_link = "[[state]]s and [[union territory|union territories]]",
class = "subpolity",
},
["state capital"] = {
link = true,
fallback = "capital city",
},
["state park"] = {
link = true,
fallback = "park",
},
["state-level new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
fallback = "new area",
},
["statistical region"] = {
-- Slovenia
link = true,
fallback = "administrative region",
},
["statutory city"] = {
link = "w",
fallback = "นคร",
},
["statutory town"] = {
link = "w",
fallback = "เมือง",
},
["strait"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["stream"] = {
link = true,
fallback = "แม่น้ำ",
},
["street"] = {
link = true,
fallback = "road",
},
["strip"] = {
link = true,
fallback = "geographic region",
},
["strip of land"] = {
link = "[[strip]] of [[land]]",
plural = "strips of land",
plural_link = "[[strip]]s of [[land]]",
fallback = "geographic region",
},
["sub-metropolitan city"] = {
link = "+w:List of cities in Nepal#Sub-metropolitan cities",
fallback = "นคร",
},
["sub-prefectural city"] = {
link = "w",
fallback = "subprovincial city",
},
["ตำบล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true, --?
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler
class = "subpolity",
default = {true},
},
["subdivision"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- FIXME: subdivisions can be neighborhood-like or larger; need a handler
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
},
["submerged ghost town"] = {
-- FIXME: Consider just having "submerged" as a qualifier.
link = "[[submerged]] [[ghost town]]",
fallback = "ghost town",
},
["subnational kingdom"] = {
link = "+w:subnational monarchy",
fallback = "non-sovereign kingdom",
},
["subnational monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["subprefecture"] = {
link = true,
affix_type = "suf",
preposition = "ของ",
class = "subpolity",
},
["subprovince"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["subprovincial city"] = {
link = "w",
-- China; special status given to certain prefecture-level cities
fallback = "prefecture-level city",
},
["subprovincial district"] = {
link = "w",
-- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts
preposition = "ของ",
class = "subpolity",
},
["subregion"] = {
link = true,
fallback = "geographic region",
},
["suburb"] = {
link = true,
-- The following text is suitable for the top-level description of a suburb as well as categories of the form
-- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago',
-- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]".
category_link = "[[suburb]]s of [[city|cities]]",
category_link_before_city = "[[suburb]]s",
-- See comments under "neighborhood" for the following three settings. They are used by
-- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories
-- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a
-- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.)
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
preposition = "ของ",
has_neighborhoods = true, --?
class = "non-admin settlement", --?
cat_handler = district_neighborhood_cat_handler,
},
["suburban area"] = {
link = "w",
fallback = "suburb",
},
["subway station"] = {
link = "w",
fallback = "metro station",
},
["sum"] = {
-- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia),
-- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion).
link = "+w:sum (administrative division)",
-- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler
-- which we don't want to be active (FIXME: If the display handler is active, that's a bug).
fallback = "division",
},
["supercontinent"] = {
link = true,
fallback = "continent",
},
["tehsil"] = {
link = true,
affix_type = "suf",
no_affix_strings = {"tehsil", "tahsil"},
class = "subpolity",
},
["temple"] = {
link = true,
fallback = "building",
},
["territorial authority"] = {
link = "w",
fallback = "อำเภอ",
},
["ดินแดน"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["theme"] = {
link = "+w:theme (Byzantine district)",
preposition = "ของ",
class = "subpolity",
},
["เมือง"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["town with bystatus"] = {
-- can't use templates in links currently
link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]",
plural = "towns with bystatus",
plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]",
fallback = "เมือง",
},
["township"] = {
link = true,
has_neighborhoods = true,
class = "settlement", --?
default = {true},
},
["township municipality"] = {
-- Quebec
link = "+w:township municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["traditional county"] = {
link = true,
fallback = "เทศมณฑล",
},
["traditional region"] = {
-- FIXME: Verify this works. Same for 'historic(al) region'.
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["trail"] = {
link = true,
fallback = "road",
},
["treaty port"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"FORMER"},
},
["tributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["underground station"] = {
link = "w",
fallback = "metro station",
},
["unincorporated area"] = {
link = "w",
-- I don't know if this fallback makes sense everywhere.
fallback = "unincorporated community",
},
["unincorporated community"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
},
["unincorporated territory"] = {
link = "w",
fallback = "ดินแดน",
},
["union territory"] = {
-- India
link = true,
preposition = "ของ",
entry_placetype_indefinite_article = "a",
class = "subpolity",
},
["unitary authority"] = {
-- UK, New Zealand
link = true,
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["unitary district"] = {
link = "w",
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["united township municipality"] = {
-- Quebec
link = "+w:united township municipality (Quebec)",
entry_placetype_indefinite_article = "a",
fallback = "township municipality",
has_neighborhoods = true, --?
},
["university"] = {
link = true,
entry_placetype_indefinite_article = "a",
class = "man-made structure",
default = {true},
},
["unrecognised country"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized and nearly unrecognized countries!"] = {
category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}",
bare_category_parent = "country-like entities",
},
["unrecognized country"] = {
link = "w",
class = "polity", -- do not translate class
default = {"Unrecognized and nearly unrecognized countries"},
},
["unrecognised state"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized state"] = {
link = "w",
fallback = "unrecognized country",
},
["urban area"] = {
link = "separately",
fallback = "neighborhood",
},
["urban hromada"] = {
link = "[[urban]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["urban service area"] = {
-- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger
-- cities and treated as equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]].)
link = "w",
fallback = "นคร",
},
["urban township"] = {
link = "w",
fallback = "township",
},
["urban-type settlement"] = {
-- Appears to be a particular type of small urban settlement in post-Soviet states that
-- had an administrative function.
link = "w",
fallback = "เมือง",
},
["valley"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "water"},
default = {true},
},
["viceroyalty"] = {
-- in essence, a type of colony
link = true,
fallback = "dependent territory",
},
["village"] = {
link = true,
generic_before_non_cities = "ใน",
category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s",
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["village development committee"] = {
-- former administrative structure in Nepal; also exists in India but not as a formal unit
link = "+w:village development committee (Nepal)",
inherently_former = {"FORMER"},
fallback = "village",
},
["village municipality"] = {
-- Quebec
link = "+w:village municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["voivodeship"] = {
-- Poland
link = true,
display_handler = voivodeship_display_handler,
preposition = "ของ",
class = "subpolity",
},
["volcano"] = {
link = true,
plural = "volcanoes",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true, "ภูเขา"},
},
["ward"] = {
link = true,
class = "settlement",
-- Wards are formal administrative divisions of a city but have some properties of neighborhoods.
fallback = "neighborhood",
},
["watercourse"] = {
link = true,
fallback = "channel",
},
["Welsh community"] = {
-- Wales
link = "[[w:community (Wales)|community]]",
preposition = "ของ",
affix_type = "suf",
affix = "community",
has_neighborhoods = true,
class = "settlement",
},
["zone"] = {
-- administrative division of Ethiopia, Qatar, Nepal, India
link = "+w:zone#Place names",
preposition = "ของ",
class = "subpolity",
},
----------------------------------------------------------------------------------------------
-- Categories for former places --
----------------------------------------------------------------------------------------------
["ANCIENT capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
-- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still
-- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category.
default = {"Ancient settlements", "Former capitals"},
},
["ANCIENT non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "ANCIENT settlement",
},
["ANCIENT settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Ancient settlements"},
},
["ancient settlements!"] = {
category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]",
bare_category_parent = "former settlements",
},
["FORMER capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
default = {"Former capitals"},
},
["former capitals!"] = {
category_link = "former [[capital]] [[city|cities]] and [[town]]s",
bare_category_parent = "การตั้งถิ่นฐาน",
},
["former counties and county-level cities!"] = {
-- For categorizing former counties and county-level cities of China
category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level cities]]",
bare_category_breadcrumb = "counties and county-level cities",
bare_category_parent = "former political divisions",
},
["FORMER county"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER county-level city"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["former countries and country-like entities!"] = {
category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist",
bare_category_breadcrumb = "countries and country-like entities",
bare_category_parent = "former polities",
},
["FORMER country"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former countries and country-like entities"},
},
["former dependent territories!"] = {
category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist",
bare_category_breadcrumb = "dependent territories",
bare_category_parent = "former political divisions",
},
["FORMER dependent territory"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former dependent territories"},
},
["former districts!"] = {
-- For categorizing former districts of China
category_link = "no-longer-existing [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "former political divisions",
},
["FORMER district"] = {
-- For categorizing former districts of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER geographic region"] = {
link = false,
fallback = "geographic and cultural area",
},
["FORMER man-made structure"] = {
link = false,
class = "man-made structure",
default = {"Former man-made structures"},
},
["former man-made structures!"] = {
category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist",
bare_category_breadcrumb = "man-made structures",
bare_category_parent = "former places",
},
["former municipalities!"] = {
-- For categorizing former municipalities of the Netherlands
category_link = "no-longer-existing [[municipality|municipalities]]",
bare_category_breadcrumb = "เทศบาล",
bare_category_parent = "former political divisions",
},
["FORMER municipality"] = {
-- For categorizing former municipalities of the Netherlands
link = false,
fallback = "FORMER subpolity",
},
["FORMER natural feature"] = {
link = false,
class = "natural feature",
default = {"Former natural features"},
},
["former natural features!"] = {
category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist",
bare_category_breadcrumb = "natural features",
bare_category_parent = "former places",
},
["FORMER non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "FORMER settlement",
},
["former places!"] = {
category_link = "[[place]]s of all sorts that no longer exist",
bare_category_breadcrumb = "former",
bare_category_parent = "สถานที่",
},
["former political divisions!"] = {
category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former places",
},
["former polities!"] = {
category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former places",
},
["FORMER polity"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former polities"},
},
["former prefectures!"] = {
-- For categorizing former prefectures of China
category_link = "no-longer-existing [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "former political divisions",
},
["FORMER prefecture"] = {
-- For categorizing former prefectures of China
link = false,
fallback = "FORMER subpolity",
},
["former provinces!"] = {
-- For categorizing former provinces of China, etc.
category_link = "no-longer-existing [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "former political divisions",
},
["FORMER province"] = {
-- For categorizing ancient/historical/former provinces of the Roman Empire
link = false,
fallback = "FORMER subpolity",
},
["former region"] = {
-- A former region is considered a former political division, but not a 'historical/traditional/etc.' region.
link = "separately",
preposition = "ของ",
inherently_former = {"FORMER"},
class = "subpolity",
},
["FORMER settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Former settlements"},
},
["former settlements!"] = {
category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former political divisions",
},
["FORMER subpolity"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former political divisions"},
},
----------------------------------------------------------------------------------------------
-- form-of categories --
----------------------------------------------------------------------------------------------
---------- Abbreviations ----------
["abbreviations of counties!"] = {
-- For categorizing abbreviations of counties of e.g. England
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]",
bare_category_breadcrumb = "เทศมณฑล",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of places",
},
["abbreviations of departments!"] = {
-- For categorizing abbreviations of departments of e.g. France
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s",
bare_category_breadcrumb = "departments",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of districts!"] = {
-- For categorizing abbreviations of districts of e.g. ???
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of divisions!"] = {
-- For categorizing abbreviations of divisions of e.g. Bangladesh
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s",
bare_category_breadcrumb = "divisions",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of former countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of former places",
},
["abbreviations of former places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}},
},
["abbreviations of places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "สถานที่",
},
["abbreviations of political divisions!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "abbreviations of places",
},
["abbreviations of prefectures!"] = {
-- For categorizing abbreviations of prefectures of e.g. Japan
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces!"] = {
-- For categorizing abbreviations of provinces of e.g. Canada
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]",
bare_category_breadcrumb = "provinces and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of regions!"] = {
-- For categorizing abbreviations of regions of e.g. Italy
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s",
bare_category_breadcrumb = "ภูมิภาค",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states!"] = {
-- For categorizing abbreviations of states of e.g. the United States
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]",
bare_category_breadcrumb = "states and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and union territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]",
bare_category_breadcrumb = "states and union territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]",
bare_category_breadcrumb = "ดินแดน",
bare_category_parent = "abbreviations of political divisions",
},
["ABBREVIATION_OF country"] = {
link = false,
default = {"Abbreviations of countries"},
},
["ABBREVIATION_OF county"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF department"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF district"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF division"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF FORMER country"] = {
link = false,
default = {"Abbreviations of former countries"},
},
["ABBREVIATION_OF FORMER place"] = {
link = false,
default = {"Abbreviations of former places"},
},
["ABBREVIATION_OF place"] = {
link = false,
default = {"Abbreviations of places"},
},
["ABBREVIATION_OF prefecture"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF province"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF region"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF state"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF subpolity"] = {
link = false,
default = {"Abbreviations of political divisions"},
},
["ABBREVIATION_OF territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF union territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
---------- Archaic forms ----------
["archaic forms of places!"] = {
full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "archaic forms",
bare_category_parent = "สถานที่",
},
["ARCHAIC_FORM_OF place"] = {
link = false,
default = {"Archaic forms of places"},
},
---------- Clippings ----------
["clippings of places!"] = {
full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "clippings",
bare_category_parent = "สถานที่",
},
["CLIPPING_OF place"] = {
link = false,
default = {"Clippings of places"},
},
---------- Dated forms ----------
["dated forms of places!"] = {
full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "dated forms",
bare_category_parent = "สถานที่",
},
["DATED_FORM_OF place"] = {
link = false,
default = {"Dated forms of places"},
},
---------- Derogatory names ----------
["derogatory names for cities!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["derogatory names for continents!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for continents"},
},
["derogatory names for countries!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for countries"},
},
["derogatory names for places!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s",
bare_category_breadcrumb = "derogatory names",
bare_category_parent = "nicknames for places",
},
["derogatory names for states!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for states"},
},
["DEROGATORY_NAME_FOR capital"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR city"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR continent"] = {
link = false,
default = {"Derogatory names for continents"},
},
["DEROGATORY_NAME_FOR country"] = {
link = false,
default = {"Derogatory names for countries"},
},
["DEROGATORY_NAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR place"] = {
link = false,
default = {"Derogatory names for places"},
},
["DEROGATORY_NAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR state"] = {
link = false,
default = {"Derogatory names for states"},
},
["DEROGATORY_NAME_FOR town"] = {
link = false,
default = {"Derogatory names for cities"},
},
---------- Ellipses ----------
["ellipses of places!"] = {
full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s",
bare_category_breadcrumb = "ellipses",
bare_category_parent = "สถานที่",
},
["ELLIPSIS_OF place"] = {
link = false,
default = {"Ellipses of places"},
},
---------- Former long-form names ----------
["former long-form names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former long-form names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}},
},
["former long-form names of places!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form",
bare_category_parent = "former names of places",
},
["FORMER_LONG_FORM_OF country"] = {
link = false,
default = {"Former long-form names of countries"},
},
["FORMER_LONG_FORM_OF place"] = {
link = false,
default = {"Former long-form names of places"},
},
---------- Former names ----------
["former names of capitals!"] = {
full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name",
bare_category_breadcrumb = "capitals",
bare_category_parent = "former names of settlements",
},
["former names of countries!"] = {
full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former names of places",
},
["former names of places!"] = {
full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name",
bare_category_breadcrumb = "former names",
bare_category_parent = "สถานที่",
},
["former names of political divisions!"] = {
full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) that generally still exist but under a different name",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former names of places",
},
["former names of polities!"] = {
full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former names of places",
},
["former names of settlements!"] = {
full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former names of political divisions",
},
["FORMER_NAME_OF capital"] = {
link = false,
default = {"Former names of capitals"},
},
["FORMER_NAME_OF country"] = {
link = false,
default = {"Former names of countries"},
},
["FORMER_NAME_OF place"] = {
link = false,
default = {"Former names of places"},
},
["FORMER_NAME_OF polity"] = {
link = false,
default = {"Former names of polities"},
},
["FORMER_NAME_OF region"] = {
link = false,
fallback = "FORMER_NAME_OF subpolity",
},
["FORMER_NAME_OF settlement"] = {
link = false,
default = {"Former names of settlements"},
},
["FORMER_NAME_OF subpolity"] = {
link = false,
default = {"Former names of political divisions"},
},
---------- Former nicknames ----------
["former nicknames for cities!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})",
bare_category_breadcrumb = "นคร",
bare_category_parent = "former nicknames for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["former nicknames for places!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "former",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}},
},
["FORMER_NICKNAME_FOR capital"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR city"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR place"] = {
link = false,
default = {"Former nicknames for places"},
},
["FORMER_NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR town"] = {
link = false,
default = {"Former nicknames for cities"},
},
---------- Former official names ----------
["former official names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former official names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "official"}},
},
["former official names of places!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "former names of places",
},
["FORMER_OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Former official names of countries"},
},
["FORMER_OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Former official names of places"},
},
---------- Long-form names ----------
["long-form names of countries!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "long-form names of places",
},
["long-form names of places!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form names",
bare_category_parent = "สถานที่",
},
["LONG_FORM_OF country"] = {
link = false,
default = {"Long-form names of countries"},
},
["LONG_FORM_OF place"] = {
link = false,
default = {"Long-form names of places"},
},
---------- Nicknames ----------
["nicknames for cities!"] = {
full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"นคร"},
},
["nicknames for continents!"] = {
full_category_link = "[[nickname]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ทวีป"},
},
["nicknames for countries!"] = {
full_category_link = "[[nickname]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ประเทศ"},
},
["nicknames for places!"] = {
full_category_link = "[[nickname]]s for [[place]]s",
bare_category_breadcrumb = "สถานที่",
bare_category_parent = "nicknames",
addl_bare_category_parents = {"สถานที่"},
},
["nicknames for states!"] = {
-- For categorizing nicknames for states of e.g. the United States
full_category_link = "[[nickname]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"รัฐ"},
},
["NICKNAME_FOR capital"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR city"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR continent"] = {
link = false,
default = {"Nicknames for continents"},
},
["NICKNAME_FOR country"] = {
link = false,
default = {"Nicknames for countries"},
},
["NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR place"] = {
link = false,
default = {"Nicknames for places"},
},
["NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR state"] = {
link = false,
default = {"Nicknames for states"},
},
["NICKNAME_FOR town"] = {
link = false,
default = {"Nicknames for cities"},
},
---------- Obsolete forms ----------
["obsolete forms of places!"] = {
full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "obsolete forms",
bare_category_parent = "สถานที่",
},
["OBSOLETE_FORM_OF place"] = {
link = false,
default = {"Obsolete forms of places"},
},
---------- Official names ----------
["official names of countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of places",
},
["official names of former countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of former places",
},
["official names of former places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "official names",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "official names of places", sort = "former"}},
},
["official names of places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official names",
bare_category_parent = "สถานที่",
},
["OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Official names of countries"},
},
["OFFICIAL_NAME_OF FORMER country"] = {
link = false,
default = {"Official names of former countries"},
},
["OFFICIAL_NAME_OF FORMER place"] = {
link = false,
default = {"Official names of former places"},
},
["OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Official names of places"},
},
---------- Official nicknames ----------
["official nicknames for places!"] = {
full_category_link = "[[official]] [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for places",
},
["official nicknames for states!"] = {
-- For categorizing official nicknames for states of e.g. the United States
full_category_link = "[[official]] [[nickname]]s for [[state]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for states",
addl_bare_category_parents = {"รัฐ"},
},
["OFFICIAL_NICKNAME_FOR place"] = {
link = false,
default = {"Official nicknames for places"},
},
["OFFICIAL_NICKNAME_FOR state"] = {
link = false,
default = {"Official nicknames for states"},
},
}
export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
if spec.plural then
export.plural_placetype_to_singular[spec.plural] = sg_placetype
end
end
return export
mnlfks1x4rjaf9dt88kmmznb0uvx2cd
5720689
5720688
2026-04-21T01:29:08Z
OctraBot
3198
5720689
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true for testing
local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split
--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must
currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising"
operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
containers of the known location being considered. This is done to prevent overcategorizing when either there are two
known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).
Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a
result are candidates for memoization to speed up the operation of {{tl|place}}.
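A minimal sketch of what such memoization could look like, assuming a function of a single string argument. The helper
`memoize` and the stand-in function `slow_lookup` are hypothetical and are not part of this module; the real functions
take additional arguments (e.g. a `props` table), which would have to be folded into the cache key.
```
-- Hypothetical helper: cache the results of a one-argument function.
-- Note that nil results are not cached and are recomputed on every call.
local function memoize(fn)
	local cache = {}
	return function(key)
		local cached = cache[key]
		if cached == nil then
			cached = fn(key)
			cache[key] = cached
		end
		return cached
	end
end

-- Stand-in for an expensive lookup; `calls` counts how often it really runs.
local calls = 0
local slow_lookup = memoize(function(placetype)
	calls = calls + 1
	return placetype .. " (resolved)"
end)

slow_lookup("tributary")
slow_lookup("tributary")
-- `calls` is now 1: the second call was served from the cache.
```
The cache lives only as long as the closure, so under this sketch it would be rebuilt whenever the module is loaded
afresh.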
]==]
------------------------------------------------------------------------------------------
-- Basic utilities --
------------------------------------------------------------------------------------------
--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
return export.force_cat or m_locations.force_cat
end
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
function export.remove_links_and_html(text)
text = m_links.remove_links(text)
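-- Parenthesize the call so that only the string is returned, not gsub()'s substitution count as a second value.
return (text:gsub("<.->", ""))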
end
--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in
[[Module:en-utilities]] is invoked, which reverses the y -> ies change after vowels and the 'es' addition after sh/ch/x,
and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider
changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is
returned.
]==]
function export.maybe_singularize_placetype(placetype)
if not placetype then
return nil
end
if export.plural_placetype_to_singular[placetype] then
return export.plural_placetype_to_singular[placetype]
end
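-- NOTE: The call to singularize() in [[Module:en-utilities]] is commented out here, so a plural that is not listed in
-- `plural_placetype_to_singular` is left unchanged and this function returns nil for it.
local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype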
if retval == placetype then
return nil
end
return retval
end
-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct.
function export.pluralize_placetype(placetype, do_ucfirst)
local ptdata = export.placetype_data[placetype]
if ptdata and ptdata.plural then
placetype = ptdata.plural
else
placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
end
if do_ucfirst then
return ucfirst(placetype)
else
return placetype
end
end
--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype
under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not
match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match
that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same
as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed
from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version
of the plural passed-in `placetype`.
]==]
function export.get_placetype_data(placetype, from_category)
local ptdata = export.placetype_data[placetype]
if ptdata then
return placetype, ptdata, "direct"
end
if from_category then
ptdata = export.placetype_data[placetype .. "!"]
if ptdata then
return placetype .. "!", ptdata, "direct-category"
end
end
local sg_placetype = export.maybe_singularize_placetype(placetype)
if sg_placetype then
ptdata = export.placetype_data[sg_placetype]
if ptdata then
return sg_placetype, ptdata, "plural"
end
end
return nil
end
--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end
function export.resolve_placetype_aliases(placetype)
return export.placetype_aliases[placetype] or placetype
end
--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
-- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in
-- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice.
placetype = export.resolve_placetype_aliases(placetype)
if export.placetype_data[placetype] then
return export.placetype_data[placetype][key]
else
return nil
end
end
--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from
left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization
of qualifiers does not happen if `no_canon_qualifiers` is specified.
For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
{nil, nil, "small beachside unincorporated community"},
{nil, "small", "beachside unincorporated community"},
{"small", "[[beachfront]]", "unincorporated community"},
{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
On the other hand, if given `"small former haunted community"`, the return value will be
{ {
{nil, nil, "small former haunted community"},
{nil, "small", "former haunted community"},
{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.
Finally, if given `"former adr"`, the return value will be
{ {
{nil, nil, "former adr"},
{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
local prev_qualifier = nil
while true do
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if canon == nil then
break
end
local new_qualifier = qualifier
if type(canon) == "table" then
canon = canon.link
end
if not no_canon_qualifiers and canon ~= false then
if canon == true then
new_qualifier = "[[" .. qualifier .. "]]"
else
new_qualifier = canon
end
end
insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
placetype = reduced_placetype
else
break
end
end
return splits
end
--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the
placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list
of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e. we successively chop off
qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words
not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used
to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is
an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first
entry if it exists in `placetype_data`.
'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c)
"type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of
directives, which act somewhat similarly to `former`, and allows interaction between more than one of these
simultaneously (e.g. official names of former places, which have their own categorization).
If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be
getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of
`iterate_matching_holonym_location()`.
For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="แม่น้ำ"}
```
and the return value will be
{ {
{qualifier = "left", placetype="tributary"},
{qualifier = "left", placetype="แม่น้ำ"},
}}
The algorithm first checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so
it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after
it). It then splits off the recognized qualifier `left` to form the
''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because
it has a fallback `river`, which exists in `placetype_data`, the fallback is entered next.
Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or
municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
{qualifier = "small [[rural]]", placetype="frazione"},
{qualifier = "small [[rural]]", placetype="hamlet"},
{qualifier = "small [[rural]]", placetype="village"},
}}
Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers,
they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around
`rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both
fallbacks end up being returned.
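The fallback walk described above can be sketched in isolation as follows. This is only an illustration under stated
assumptions: `sample_data` is a hypothetical miniature of `placetype_data` that models nothing but the `fallback` field,
and `fallback_chain` is a hypothetical helper, not a function of this module.
```
-- Hypothetical miniature of `placetype_data`; only the `fallback` field is modeled.
local sample_data = {
	frazione = {fallback = "hamlet"},
	hamlet = {fallback = "village"},
	village = {},
}

-- Follow `fallback` links starting at `placetype`, collecting each placetype visited.
-- Raise an error if a placetype is missing from the table or if the chain grows beyond
-- `max_depth` entries, which suggests a loop.
local function fallback_chain(placetype, max_depth)
	local chain = {}
	local current = placetype
	while current do
		local entry = sample_data[current]
		if not entry then
			error("Fallback " .. current .. " is not in sample_data")
		end
		chain[#chain + 1] = current
		if #chain > max_depth then
			error("Apparent loop in fallback chain: " .. table.concat(chain, " -> "))
		end
		current = entry.fallback
	end
	return chain
end

local chain = fallback_chain("frazione", 10)
-- table.concat(chain, " -> ") gives "frazione -> hamlet -> village"
```
The depth limit plays the same role as the loop check in the real code: a misconfigured table whose fallbacks point at
each other produces an error instead of an endless loop.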
`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the
placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are
returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example,
`"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When
`no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"`
with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under
[[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].)
As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`,
because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes
rarely occur with exact match category specs anyway.
`no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an
equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in
[[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't
have qualifiers and so it doesn't make sense to try and look for them.
`from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked.
`form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked
placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the
appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a
placetype, e.g. `FORMER_NAME_OF place`, if nothing else matches.
`no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`.
`register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g.
known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It
causes the non-former version of the specified placetype to be included in the returned equivalents along with the
former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now;
fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.]
]==]
function export.get_placetype_equivs(placetype, props)
local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former
local form_of_directive
if props then
no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former =
props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category,
props.register_former_as_non_former
form_of_directive = props.form_of_directive
end
local equivs = {}
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is
-- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If
-- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by
-- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version
-- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as
-- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into
-- `equivs` the prefixed placetype. This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}},
-- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up
-- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]].
local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix)
local function insert_equiv(pt)
if form_of_prefix then
-- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have
-- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end
-- up processing because `island country` falls back to `country`), and that entry in turn is defined
-- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of
-- handling this is by calling ourselves recursively.
insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt)
else
insert(equivs, {qualifier=qualifier, placetype=pt})
end
end
-- Insert the placetype, along with any fallbacks.
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if ptdata then
insert_equiv(canon_placetype)
if no_fallback then
return
end
local first_placetype = #equivs + 1
local prev_placetype = nil
while true do
local pt_value = export.placetype_data[canon_placetype]
if not pt_value then
internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
canon_placetype, prev_placetype)
end
if pt_value.fallback then
insert_equiv(pt_value.fallback)
local last_placetype = #equivs
if last_placetype - first_placetype >= 10 then
local fallback_loop = {}
for i = first_placetype, last_placetype do
insert(fallback_loop, equivs[i].placetype)
end
internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
end
prev_placetype = canon_placetype
canon_placetype = pt_value.fallback
else
break
end
end
end
end
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
-- than in each invocation of `process_and_insert_placetype()`.
local function process_and_insert_placetype(qualifier, reduced_placetype)
if form_of_directive then
-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. All of this is handled by
-- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of
-- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for
-- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.)
insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive)
if not no_fallback then
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype)
local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or
export.get_placetype_prop(pt, "class") end
)
if not directive_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " ..
'located but directive-specific type property %s missing, and so is "class"; ' ..
"placetypes searched are %s", reduced_placetype, form_of_directive,
form_of_directive .. "_type", reduced_placetype_equivs)
else
-- This should be allowed, as we allow unrecognized placetypes in general.
end
elseif directive_type ~= "!" then
insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive)
end
end
else
insert_placetype_and_fallbacks(qualifier, reduced_placetype)
end
end
-- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left
-- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers).
local splits
if no_split_qualifiers then
splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
else
splits = export.split_qualifiers_from_placetype(placetype)
end
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3)
-- If a special "former" qualifier like `former` or `historical` isn't present, and
-- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for
-- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing
-- placetypes, and handle accordingly.
local unlinked_this_qualifier
if this_qualifier and this_qualifier:find("%[") then
unlinked_this_qualifier = export.remove_links_and_html(this_qualifier)
else
unlinked_this_qualifier = this_qualifier
end
local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil
if not former_qualifiers and not no_check_for_inherently_former then
former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype,
function(pt) return export.get_placetype_prop(pt, "inherently_former") end,
{no_check_for_inherently_former = true})
end
-- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal
-- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
-- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped
-- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval`
-- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes
-- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. If the placetype following the
-- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a
-- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like
-- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and
-- don't enter anything into `equivs`.
if former_qualifiers then
-- FIXME: Should we respect `no_fallback` here? My instinct says no.
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, {
no_check_for_inherently_former = true
})
local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, "former_type") or
export.get_placetype_prop(pt, "class") end
)
if not former_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s, placetype data located but `former_type` missing; " ..
"placetypes searched are %s", reduced_placetype, reduced_placetype_equivs)
else
-- Enable error when we've verified there aren't any examples.
track("bad-former-placetype")
track("bad-former-placetype/" .. reduced_placetype)
--process_error("For placetype '%s', unrecognized placetype following 'former'-type " ..
-- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs))
end
elseif former_type ~= "!" then
-- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible
-- for (e.g.) former provinces of the Roman empire to be categorized specially.
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype)
end
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. former_type)
end
-- HACK! See explanation above for `register_former_as_non_former`.
if register_former_as_non_former then
process_and_insert_placetype(prev_qualifier, reduced_placetype)
end
-- If we're processing a form-of directive, after doing everything else we do
-- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup.
if form_of_directive and not no_fallback then
for _, former_qualifier in ipairs(former_qualifiers) do
insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier ..
" place")
end
end
-- Don't continue processing equivs. The reason is probably the same as the `break` below for
-- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and
-- non-former equivs will otherwise take precedence.
break
end
end
-- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs
-- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping.
if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then
insert(equivs, {
qualifier=prev_qualifier,
placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier]
})
-- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the
-- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the
-- latter ends up generating the category because the category for 'mythological location' is set as
-- the default value, which is used only when no non-default category can be found.
break
end
-- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined
-- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype.
-- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts
-- the full placetype into `equivs`.
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
process_and_insert_placetype(qualifier, reduced_placetype)
-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
if no_fallback then
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
if canon_placetype then
break
end
end
end
-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
-- [[Category:en:Former names of places]] in an invocation like
-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
end
return equivs
end
function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
for _, equiv in ipairs(equivs) do
local retval = fun(equiv.placetype)
if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
return retval, equiv
end
end
return nil, nil
end
--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `props.continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns
a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the
equivalent-placetype object (with `qualifier` and `placetype` fields) that triggered the non-falsy (or non-{nil}) return
value. If `fun` never returns a non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return
values. If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is)
with {nil} for the second return value.
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
if not placetype then
return fun(nil), nil
end
return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
props and props.continue_on_nil_only)
end
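-- Illustrative sketch (not part of the module's API): how `continue_on_nil_only` changes which equivalent placetype
-- wins. The `equivs` list and `props` table below are made-up sample data, assumed only for this example.
--
--   local equivs = {{placetype = "a"}, {placetype = "b"}}
--   local props = {a = false, b = "x"}
--   local function fun(pt) return props[pt] end
--   export.get_equiv_placetype_prop_from_equivs(equivs, fun)        -- returns "x", equivs[2] (`false` is skipped)
--   export.get_equiv_placetype_prop_from_equivs(equivs, fun, true)  -- returns false, equivs[1] (only `nil` is skipped)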
--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
`"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
the beginning) has an entry in `placetype_data` that specifies the indefinite article using
`entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, the original English code uses [[Module:en-utilities]] to apply the standard algorithm to generate `"an"`
for words beginning with a vowel and `"a"` otherwise; in this module that call is commented out and the empty string is
used as the article instead.
If `ucfirst` is true, the first letter of the article is made upper-case.
]==]
function export.get_placetype_article(placetype, ucfirst)
local art
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if type(canon) == "table" then
art = canon.article
end
end
if art == false then
return art
end
if art == nil then
local placetype_use_the = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end)
if placetype_use_the then
art = "the"
else
art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article")
if not art then
art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] ""
end
end
end
if ucfirst then
art = m_strutils.ucfirst(art)
end
return art
end
--[==[
Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories
(e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified.
]==]
function export.get_placetype_entry_preposition(placetype)
local pt_prep = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "preposition") end
)
return pt_prep or "ใน"
end
--[==[
Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's
`holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding
to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's
`holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the
end of the value's list.
]==]
function export.key_holonym_into_place_desc(place_desc, holonym)
if not holonym.placetype then
return
end
-- Key in equivalent placetypes, so that e.g. `cities/San Francisco` gets keyed under `city`; but don't do
-- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms
-- of different types just because they have the same fallback.
local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true})
local unlinked_placename = holonym.unlinked_placename
for _, equiv in ipairs(equiv_placetypes) do
local placetype = equiv.placetype
if not place_desc.holonyms_by_placetype then
place_desc.holonyms_by_placetype = {}
end
if not place_desc.holonyms_by_placetype[placetype] then
place_desc.holonyms_by_placetype[placetype] = {unlinked_placename}
else
insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename)
end
end
end
--[=[
Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the
placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This
will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to
whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype
data structure for the placetype, and `from_category` indicates that we are generating the description of a category
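(otherwise we are generating the display form of an entry placetype). If `noerror` is set, a placetype marked with
`disallow_in_entries` yields a placeholder string containing the warning instead of throwing an error.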
]=]
local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror)
if not from_category and ptdata.disallow_in_entries then
if noerror then
return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]"
else
process_error("Placetype %s is not meant to be specified directly: " .. ptdata.disallow_in_entries, sg_placetype)
end
end
if link == nil then
internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype)
elseif link == true then
if orig_placetype then
return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
else
return ("[[%s]]"):format(sg_placetype)
end
elseif link == false then
process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype)
elseif link == "w" then
return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
elseif link == "separately" then
if orig_placetype then
local sg_words = split(sg_placetype, " ")
local orig_words = split(orig_placetype, " ")
if #sg_words ~= #orig_words then
internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " ..
"has different number of words", orig_placetype, sg_placetype)
else
for i = 1, #sg_words do
if sg_words[i] == orig_words[i] then
sg_words[i] = ("[[%s]]"):format(sg_words[i])
else
sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
end
end
return concat(sg_words, " ")
end
else
return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
end
elseif link:find("^%+") then
link = link:sub(2) -- discard initial +
return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype)
elseif not orig_placetype then
return link
else
return --[[require(en_utilities_module).pluralize(link)]] link
end
end
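-- Illustrative sketch of what the `link` settings handled above produce. The placetype `overseas territory`, its
-- plural `overseas territories` (as written by the editor) and the target `dependent territory` are hypothetical
-- sample values; `ptdata` is assumed to have no `disallow_in_entries`, and `split`/`concat` are assumed to behave
-- like a plain split on spaces and `table.concat`.
--
--   link = true                    --> [[overseas territory|overseas territories]]
--   link = "w"                     --> [[w:overseas territory|overseas territories]]
--   link = "separately"            --> [[overseas]] [[territory|territories]]
--   link = "+dependent territory"  --> [[dependent territory|overseas territories]]
--
-- With no plural form given (`orig_placetype` is nil), `link = "separately"` links each word on its own:
--   ("overseas territory"):gsub("([^ ]+)", "[[%1]]")  --> [[overseas]] [[territory]]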
--[==[
Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the
plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying
as plural but linked to the singular form of the placetype). Otherwise, return nil. If we're generating the description
of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like
[[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or
`"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description
for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with
special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the
"full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is
prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be
disallowed.
]==]
function export.get_placetype_display_form(placetype, category_type, return_full, noerror)
local from_category = not not category_type
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if canon_placetype then
local raw_link
local function is_linked_string(str)
return type(str) == "string" and str:find("%[%[")
end
if category_type then
local fetched_full
local function fetch_maybe_full(prop)
local retval = ptdata["full_" .. prop]
if retval ~= nil then
if return_full then
return retval, true
else
internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval)
end
end
return ptdata[prop], false
end
local function maybe_prefix(str)
if return_full and not fetched_full then
return "names of " .. str
else
return str
end
end
-- Careful with `false` as possible value.
if category_type == "top-level" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
elseif category_type == "noncity" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
elseif category_type == "city" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
else
internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
category_type)
end
if type(raw_link) == "string" then
return maybe_prefix(raw_link), ptdata
elseif raw_link ~= nil then
return raw_link, ptdata
end
raw_link, fetched_full = fetch_maybe_full("category_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
if ptmatch == "plural" then
raw_link, fetched_full = fetch_maybe_full("plural_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
end
if raw_link == nil then
raw_link, fetched_full = fetch_maybe_full("link")
end
if raw_link == false then
return raw_link, ptdata
end
return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
else
if ptmatch == "plural" then
raw_link = ptdata.plural_link
if raw_link == false then
process_error("Placetype %s cannot appear plural", placetype)
end
if is_linked_string(raw_link) then
return raw_link, ptdata
end
end
if raw_link == nil then
raw_link = ptdata.link
end
return make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata
end
end
return nil
end
local function resolve_unlinked_placename_display_aliases(placetype, placename)
local equiv_placetypes = export.get_placetype_equivs(placetype)
for i, equiv in ipairs(equiv_placetypes) do
equiv_placetypes[i] = equiv.placetype
end
local all_display_aliases_found = {}
local all_others_found = {}
for group, key, spec in m_locations.iterate_matching_location {
placetypes = equiv_placetypes,
placename = placename,
alias_resolution = "display",
} do
if spec.alias_of and spec.display then
insert(all_display_aliases_found, {group, key, spec, spec.display_as_full})
else
insert(all_others_found, {group, key, spec})
end
end
if not all_display_aliases_found[1] then
return placename
elseif all_display_aliases_found[2] then
internal_error("Found multiple matching display aliases for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
elseif all_others_found[1] then
internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
else
local group, key, spec, as_full = unpack(all_display_aliases_found[1])
local full, elliptical = m_locations.key_to_placename(group, key)
return as_full and full or elliptical
end
end
--[==[
If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return it unchanged.
Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`,
`country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as
`United States`.
'''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they
should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed.
For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to
`Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political
connotations. Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!)
to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two
terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to
`North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly
display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve
alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and
consistency.
]==]
function export.resolve_placename_display_aliases(placetype, placename)
-- If the placename is a link, apply the alias inside the link.
-- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will
-- be empty.
local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
if link then
if linktext ~= "" then
local alias = resolve_unlinked_placename_display_aliases(placetype, linktext)
return "[[" .. link .. "|" .. alias .. "]]"
else
local alias = resolve_unlinked_placename_display_aliases(placetype, link)
return "[[" .. alias .. "]]"
end
else
return resolve_unlinked_placename_display_aliases(placetype, placename)
end
end
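-- Illustrative sketch of the link pattern used above, with plain `string.match` standing in for `rmatch` (assumed
-- here to behave the same way for these ASCII inputs):
--
--   local pat = "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$"
--   ("[[United States]]"):match(pat)  -- "United States", ""  (unpiped link: the second capture is empty)
--   ("[[w:USA|US]]"):match(pat)       -- "w:USA", "US"        (piped link: the alias is applied to the link text)
--   ("US"):match(pat)                 -- nil                  (not a link: the alias is applied to the whole string)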
--[==[
Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key.
]==]
function export.get_prefixed_key(key, spec)
if spec.the then
return "the " .. key
else
return key
end
end
-- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary.
export.iterate_matching_location = m_locations.iterate_matching_location
--[=[
Iterator that iterates over holonyms in `place_desc`. If `first_holonym_index` is given, start iterating at the
specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If
`first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is
specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified
by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the
holonym index and holonym structure, similar to `ipairs()`.
]=]
function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
local stop_at_also = not not first_holonym_index
return function(place_desc, index)
while true do
index = index + 1
local this_holonym = place_desc.holonyms[index]
-- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also`
-- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym
-- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with
-- `:also`.
if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then
return nil
end
-- If not placetype, we're processing raw text, which we normally want to skip.
if include_raw_text_holonyms or this_holonym.placetype then
return index, this_holonym
end
end
end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end
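-- Illustrative sketch of how the iterator above behaves. The `place_desc` below is made-up sample data containing
-- only the fields this function reads; real place descriptions carry more.
--
--   local place_desc = {holonyms = {
--     {placetype = "city", unlinked_placename = "Chicago"},
--     {unlinked_placename = "some raw text"},  -- raw-text holonym: no `placetype`
--     {placetype = "state", unlinked_placename = "Illinois", continue_cat_loop = true},  -- marked with `:also`
--     {placetype = "country", unlinked_placename = "United States"},
--   }}
--   for i, holonym in export.get_holonyms_to_check(place_desc) do ... end           -- visits 1, 3, 4
--   for i, holonym in export.get_holonyms_to_check(place_desc, 1) do ... end        -- visits 1 only (stops at 3)
--   for i, holonym in export.get_holonyms_to_check(place_desc, 3) do ... end        -- visits 3, 4
--   for i, holonym in export.get_holonyms_to_check(place_desc, nil, true) do ... end  -- visits 1, 2, 3, 4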
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all
such known locations, returning for each location the corresponding group, key and spec as well as the trail of
ancestral containers. Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between
the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data`
are:
* `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with
`iterate_matching_location()`.
* `holonym_placename`: The placename of the holonym.
* `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the
holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms
following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none
exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.)
* `place_desc`: Description of the place; used for the holonyms, to check for container mismatches.
Returns four values: the location group, the canonical key by which the location is known, the spec object describing
the location and the trail of ancestral containers for the location. The first three values are the same as for
`iterate_matching_location`.
]==]
function export.iterate_matching_holonym_location(data)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
local matching_location_iterator = m_locations.iterate_matching_location {
placetypes = holonym_placetype,
placename = holonym_placename,
}
return function()
while true do
local group, key, spec = matching_location_iterator()
if not group then
return nil
end
local container_trail = {}
-- For each level of container, check that there are no mismatches (i.e. another location of the same
-- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container
-- at that level. For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city
-- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of
-- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark,
-- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New
-- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough,
-- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If
-- there are no mismatches at any level we assume we're dealing with the right known location.
--
-- If at a given level there are multiple containing locations, we count a match if any holonym matches any
-- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any
-- containing location.
local containers_mismatch = false
for containers in m_locations.iterate_containers(group, key, spec) do
insert(container_trail, containers)
local match_at_level = false
local mismatch_at_level = false
for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc,
holonym_index and holonym_index + 1 or nil) do
local other_source_holonym = other_holonym.augmented_from_holonym
if other_source_holonym and other_source_holonym.placetype == holonym_placetype and
other_source_holonym.unlinked_placename ~= holonym_placename then
-- Ignore holonyms added during the augmentation process for other holonyms of the same
-- placetype as the placetype of the holonym we're considering. See comment in
-- augment_holonyms_with_container() for why we do this.
-- continue; grrr, no 'continue' in Lua
else
local holonym_matches_at_level = false
local holonym_exists_with_same_placetype = false
for _, container in ipairs(containers) do
if not container.spec.no_check_holonym_mismatch then
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
local placetypes = container.spec.placetype
if type(placetypes) ~= "table" then
placetypes = {placetypes}
end
local placetype_equivs = {}
for _, pt in ipairs(placetypes) do
m_table.extend(placetype_equivs, export.get_placetype_equivs(pt))
end
local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype and
(other_holonym.unlinked_placename == full_container_placename or
other_holonym.unlinked_placename == elliptical_container_placename)
end
)
if this_holonym_matches then
holonym_matches_at_level = true
break
end
local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype
end
)
if this_holonym_exists_with_same_placetype then
-- We seem to have a mismatch at this level. But before we decide conclusively that this
-- is the case, check to see whether the putative mismatch is an alias and matches when
-- we resolve the alias.
for oh_group, oh_key, oh_spec, oh_container_trail in
export.iterate_matching_holonym_location {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = place_desc,
} do
local oh_full_placename, oh_elliptical_placename =
m_locations.key_to_placename(oh_group, oh_key)
if oh_full_placename == full_container_placename or
oh_elliptical_placename == elliptical_container_placename then
-- Alias matched when resolved.
this_holonym_matches = true
break
end
end
if this_holonym_matches then
-- Alias matched above when resolved.
holonym_matches_at_level = true
break
else
-- Not an alias, or doesn't match when resolved. We have a true mismatch.
holonym_exists_with_same_placetype = true
end
end
end
end
if holonym_matches_at_level then
match_at_level = true
break
end
if holonym_exists_with_same_placetype then
mismatch_at_level = true
end
end
end
if not match_at_level and mismatch_at_level then
containers_mismatch = true
break
end
end
if not containers_mismatch then
return group, key, spec, container_trail
end
end
end
end
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the
corresponding group, key and spec as well as the trail of ancestral containers, in that order; return `nil` if no
location matches. This is like `iterate_matching_holonym_location()` but throws an error if more than one location
matches. (An example where this would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known
locations named Newcastle. To fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
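
The collect-then-check pattern used by this function can be sketched in isolation as follows. This is only an
illustration: `sample_locations`, `iterate_sample()` and `find_unique()` are hypothetical names that do not exist in this
module, and the real iterator also checks placetypes and containing holonyms.

```lua
-- Hypothetical data: two known locations share the name "Newcastle".
local sample_locations = {
	{key = "Newcastle, New South Wales", name = "Newcastle"},
	{key = "Newcastle upon Tyne", name = "Newcastle"},
	{key = "Boston", name = "Boston"},
}

-- Hypothetical stand-in for the real iterator: yields the key of each location named `name`.
local function iterate_sample(name)
	local i = 0
	return function()
		while i < #sample_locations do
			i = i + 1
			if sample_locations[i].name == name then
				return sample_locations[i].key
			end
		end
		-- Falling off the end returns nil, which terminates the generic `for` loop.
	end
end

-- Collect every match, then return nil (no match), the single key (one match) or throw (several matches).
local function find_unique(name)
	local all_found = {}
	for key in iterate_sample(name) do
		table.insert(all_found, key)
	end
	if not all_found[1] then
		return nil
	elseif all_found[2] then
		error(("Found multiple matching locations for '%s': %s"):format(name, table.concat(all_found, "; ")))
	end
	return all_found[1]
end

print(find_unique("Boston"))
print(pcall(find_unique, "Newcastle"))
```

Collecting all matches before deciding keeps the error message informative, since it can list every conflicting key.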
]==]
function export.find_matching_holonym_location(data)
local all_found = {}
for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
insert(all_found, {group, key, spec, container_trail})
end
if not all_found[1] then
return nil
elseif all_found[2] then
local holonym_placetype = data.holonym_placetype
if type(holonym_placetype) == "table" then
holonym_placetype = concat(holonym_placetype, ",")
end
local found_keys = {}
for _, found in ipairs(all_found) do
local _, key, _, _ = unpack(found)
insert(found_keys, key)
end
error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys)))
else
return unpack(all_found[1])
end
end
------------------------------------------------------------------------------------------
-- Placename and placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their
canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which
applies to categorization and other processes but not to display.
The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ",
"จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g.
"census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype.
Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be
interpreted as "department", "อำเภอ" or "division").
]==]
export.placetype_aliases = {
["acomm"] = "autonomous community",
["adr"] = "administrative region",
["adterr"] = "administrative territory", -- Pakistan
["aobl"] = "autonomous oblast",
["aokr"] = "autonomous okrug",
["ap"] = "autonomous province",
["apref"] = "autonomous prefecture",
["aprov"] = "autonomous province",
["ar"] = "autonomous region",
["arch"] = "archipelago",
["arep"] = "autonomous republic",
["aterr"] = "autonomous territory",
["atu"] = "autonomous territorial unit",
["bor"] = "borough",
["c"] = "ประเทศ",
["can"] = "canton",
["carea"] = "council area",
["cc"] = "constituent country",
["cdblock"] = "community development block",
["cdep"] = "Crown dependency",
["CDP"] = "census-designated place",
["cdp"] = "census-designated place",
["clcity"] = "county-level city",
["co"] = "เทศมณฑล",
["cobor"] = "county borough",
["colcity"] = "county-level city",
["coll"] = "collectivity",
["comm"] = "community",
["cont"] = "ทวีป",
["contr"] = "continental region",
["contregion"] = "continental region",
["cpar"] = "civil parish",
["damun"] = "direct-administered municipality",
["dep"] = "dependency",
["department capital"] = "departmental capital",
["dept"] = "department",
["depterr"] = "dependent territory",
["dist"] = "อำเภอ",
["distmun"] = "district municipality",
["div"] = "division",
["emp"] = "จักรวรรดิ",
["fpref"] = "French prefecture",
["gov"] = "governorate",
["govnat"] = "governorate",
["home-rule city"] = "home rule city",
["home-rule municipality"] = "home rule municipality",
["inner-city area"] = "inner city area",
["ires"] = "Indian reservation",
["isl"] = "เกาะ",
["lbor"] = "London borough",
["lga"] = "local government area",
["lgarea"] = "local government area",
["lgd"] = "local government district",
["lgdist"] = "local government district",
["metbor"] = "metropolitan borough",
["metcity"] = "metropolitan city",
["metmun"] = "metropolitan municipality",
["mtn"] = "ภูเขา",
["mun"] = "เทศบาล",
["mundist"] = "municipal district",
["nonmetropolitan county"] = "non-metropolitan county",
["obl"] = "oblast",
["okr"] = "okrug",
["p"] = "จังหวัด",
["par"] = "parish",
["parmun"] = "parish municipality",
["pen"] = "peninsula",
["plcity"] = "prefecture-level city",
["plcolony"] = "Polish colony",
["pref"] = "prefecture",
["prefcity"] = "prefecture-level city",
["preflcity"] = "prefecture-level city",
["prov"] = "จังหวัด",
["r"] = "ภูมิภาค",
["range"] = "เทือกเขา",
["rcm"] = "regional county municipality",
["rcomun"] = "regional county municipality",
["rdist"] = "regional district",
["rep"] = "republic",
["rhrom"] = "rural hromada",
["riv"] = "แม่น้ำ",
["rmun"] = "regional municipality",
["robor"] = "royal borough",
["romp"] = "Roman province",
["runit"] = "regional unit",
["rurmun"] = "rural municipality",
["s"] = "รัฐ",
["sar"] = "special administrative region",
["shrom"] = "settlement hromada",
["spref"] = "subprefecture",
["sprefcity"] = "sub-prefectural city",
["sprovcity"] = "subprovincial city",
["submet city"] = "sub-metropolitan city",
["submetropolitan city"] = "sub-metropolitan city",
["sub-prefecture-level city"] = "sub-prefectural city",
["sub-provincial city"] = "subprovincial city",
["sub-provincial district"] = "subprovincial district",
["terr"] = "ดินแดน",
["terrauth"] = "territorial authority",
["twp"] = "township",
["twpmun"] = "township municipality",
["uauth"] = "unitary authority",
["ucomm"] = "unincorporated community",
["udist"] = "unitary district",
["uhrom"] = "urban hromada",
["uterr"] = "union territory",
["utwpmun"] = "united township municipality",
["val"] = "valley",
["vdc"] = "village development committee",
["vil"] = "village",
["voi"] = "voivodeship",
["wcomm"] = "Welsh community",
}
local no_link_def_article = {link = false, article = "the"}
local no_link_no_article = {link = false, article = false}
--[==[ var:
These qualifiers can be prepended onto any placetype and will be handled correctly. For example, the placetype
`large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the
value in the following table is a string, the qualifier will display according to the string. If the value is `true`,
the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be
linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain
those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating
`inland sea` as equivalent to `sea`.
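
The three value types described above could be resolved to a display form roughly as sketched below. The helper
`display_qualifier()` is hypothetical and not part of this module, and the handling of table values (such as
`no_link_def_article`) through their `link` field is an assumption made for the sake of the example.

```lua
-- Hypothetical helper: turn a qualifier and its table value into display text.
local function display_qualifier(qualifier, value)
	if type(value) == "table" then
		-- Assumption: a table value carries a `link` field with the same meaning as a bare value.
		value = value.link
	end
	if type(value) == "string" then
		-- A string value is displayed as given.
		return value
	elseif value == true then
		-- `true` links the qualifier to its own entry.
		return "[[" .. qualifier .. "]]"
	else
		-- `false` (or anything else) displays the qualifier unlinked, as-is.
		return qualifier
	end
end

local sample = {
	["large"] = false,
	["coastal"] = true,
	["ox-bow"] = "[[oxbow]]",
	["largest"] = {link = false, article = "the"},
}

for _, qualifier in ipairs({"large", "coastal", "ox-bow", "largest"}) do
	print(qualifier, display_qualifier(qualifier, sample[qualifier]))
end
```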
]==]
export.placetype_qualifiers = {
-- generic qualifiers
["huge"] = false,
["tiny"] = false,
["large"] = false,
["big"] = false,
["mid-size"] = false,
["mid-sized"] = false,
["small"] = false,
["sizable"] = false,
["important"] = false,
["long"] = false,
["short"] = false,
["major"] = false,
["minor"] = false,
["high"] = false,
["tall"] = false,
["low"] = false,
["left"] = false, -- left tributary
["right"] = false, -- right tributary
["modern"] = false, -- for use in opposition to "ancient" in another definition
-- "former" qualifiers
["abandoned"] = true,
["ancient"] = true,
["deserted"] = true,
["extinct"] = true,
["former"] = false,
["historic"] = "historical",
["historical"] = true,
["medieval"] = true,
["mediaeval"] = true,
["ruined"] = true,
["traditional"] = true,
-- sea qualifiers
["coastal"] = true,
["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]]
["maritime"] = true,
["overseas"] = true,
["seaside"] = true,
["beachfront"] = true,
["beachside"] = true,
["riverside"] = true,
-- lake qualifiers
["freshwater"] = true,
["saltwater"] = true,
["endorheic"] = true,
["oxbow"] = true,
["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link
["tidal"] = true,
-- land qualifiers
["hilltop"] = true,
["hilly"] = true,
["insular"] = true,
["peninsular"] = true,
["chalk"] = true,
["karst"] = true,
["limestone"] = true,
["mountainous"] = true,
["mountaintop"] = true,
["alpine"] = true,
["volcanic"] = true, -- for an island
-- political status qualifiers
["autonomous"] = true,
["incorporated"] = true,
["special"] = true,
["unincorporated"] = true,
["coterminous"] = true,
-- monetary status/etc. qualifiers
["fashionable"] = true,
["wealthy"] = true,
["affluent"] = true,
["declining"] = true,
-- city vs. rural qualifiers
["urban"] = true,
["suburban"] = true,
["exurban"] = true,
["outlying"] = true,
["remote"] = true,
["rural"] = true,
["outback"] = true,
["inner"] = false,
["inner-city"] = true,
["central"] = false,
["outer"] = false,
-- land use qualifiers
["residential"] = true,
["agricultural"] = true,
["business"] = true,
["commercial"] = true,
["industrial"] = true,
-- business use qualifiers
["railroad"] = true,
["railway"] = true,
["farming"] = true,
["fishing"] = true,
["mining"] = true,
["logging"] = true,
["cattle"] = true,
-- tourism use qualifiers
["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
-- religious qualifiers
["holy"] = true,
["sacred"] = true,
["religious"] = true,
["secular"] = true,
-- qualifiers for nonexistent places
["claimed"] = false,
["fictional"] = true,
["legendary"] = true,
["mythical"] = true,
["mythological"] = true,
-- directional qualifiers
["northern"] = false,
["southern"] = false,
["eastern"] = false,
["western"] = false,
["north"] = false,
["south"] = false,
["east"] = false,
["west"] = false,
["northeastern"] = false,
["southeastern"] = false,
["northwestern"] = false,
["southwestern"] = false,
["northeast"] = false,
["southeast"] = false,
["northwest"] = false,
["southwest"] = false,
-- seasonal qualifiers
["summer"] = true, -- e.g. for 'summer capital'
["winter"] = true,
-- legal status qualifiers
-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
["official"] = true,
["unofficial"] = true,
["de facto"] = true, -- 'de facto capital'
["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
["de jure"] = true, -- 'de jure capital'
["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
-- misc. qualifiers
["planned"] = true,
["chartered"] = true,
["landlocked"] = true,
["uninhabited"] = true,
-- superlative qualifiers
["first"] = no_link_def_article,
["second"] = no_link_def_article, -- for "second largest" etc.
["third"] = no_link_def_article,
["fourth"] = no_link_def_article,
["last"] = no_link_def_article,
["only"] = no_link_def_article,
["sole"] = no_link_def_article,
["main"] = no_link_def_article,
["largest"] = no_link_def_article,
["biggest"] = no_link_def_article,
["smallest"] = no_link_def_article,
["shortest"] = no_link_def_article,
["longest"] = no_link_def_article,
["tallest"] = no_link_def_article,
["highest"] = no_link_def_article,
["lowest"] = no_link_def_article,
["leftmost"] = no_link_def_article,
["rightmost"] = no_link_def_article,
["innermost"] = no_link_def_article,
["outermost"] = no_link_def_article,
["northernmost"] = no_link_def_article,
["southernmost"] = no_link_def_article,
["westernmost"] = no_link_def_article,
["easternmost"] = no_link_def_article,
["northwesternmost"] = no_link_def_article,
["southwesternmost"] = no_link_def_article,
["northeasternmost"] = no_link_def_article,
["southeasternmost"] = no_link_def_article,
-- several/various
["several"] = no_link_no_article,
["various"] = no_link_no_article,
["numerous"] = no_link_no_article,
["multiple"] = no_link_no_article,
["many"] = no_link_no_article,
["other"] = no_link_no_article,
}
--[==[ var:
In this table, the key qualifiers should be treated the same as the value qualifiers for categorization purposes. This
is overridden by `placetype_data` and `qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
["abandoned"] = {"FORMER"},
["ancient"] = {"ANCIENT", "FORMER"},
["former"] = {"FORMER"},
["extinct"] = {"FORMER"},
["historic"] = {"FORMER"},
["historical"] = {"FORMER"},
["medieval"] = {"ANCIENT", "FORMER"},
["mediaeval"] = {"ANCIENT", "FORMER"},
["ruined"] = {"ANCIENT", "FORMER"},
["traditional"] = {"FORMER"},
}
--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
["fictional"] = "fictional location",
["legendary"] = "mythological location",
["mythical"] = "mythological location",
["mythological"] = "mythological location",
-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
["claimed"] = "claimed political division",
}
--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
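
One way such a reverse mapping could be derived is sketched below. `build_reverse_map()` and `sample` are hypothetical
names used only for illustration; the module's actual reverse table may be built differently.

```lua
-- Hypothetical excerpt of a forward map from placetype to capital category.
local sample = {
	["canton"] = "cantonal capitals",
	["oblast"] = "oblast capitals",
}

-- Invert a one-to-one map, refusing to silently overwrite a duplicate value.
local function build_reverse_map(forward)
	local reverse = {}
	for placetype, capital_cat in pairs(forward) do
		if reverse[capital_cat] then
			error(("Duplicate capital category '%s'"):format(capital_cat))
		end
		reverse[capital_cat] = placetype
	end
	return reverse
end

local capital_cat_to_placetype = build_reverse_map(sample)
print(capital_cat_to_placetype["cantonal capitals"])
```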
]==]
export.placetype_to_capital_cat = {
["autonomous community"] = "autonomous community capitals",
["canton"] = "cantonal capitals",
["comarca"] = "comarca capitals",
["ประเทศ"] = "national capitals",
-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "county capitals",
["department"] = "departmental capitals",
["อำเภอ"] = "district capitals",
["division"] = "division capitals",
["emirate"] = "emirate capitals",
["governorate"] = "governorate capitals",
["hromada"] = "hromada capitals",
["krai"] = "krai capitals",
["metropolitan city"] = "metropolitan city capitals",
["เทศบาล"] = "municipal capitals",
["oblast"] = "oblast capitals",
["okrug"] = "okrug capitals",
["prefecture"] = "prefectural capitals",
["จังหวัด"] = "provincial capitals",
["raion"] = "raion capitals",
["regency"] = "regency capitals",
["ภูมิภาค"] = "regional capitals",
["regional unit"] = "regional unit capitals",
["republic"] = "republic capitals",
["รัฐ"] = "state capitals",
["ดินแดน"] = "territorial capitals",
["voivodeship"] = "voivodeship capitals",
}
--[==[ var:
This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple
ways that placenames can come to be preceded by "the":
# Listed here.
# Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code
just below the map.
# The placetype of the placename has `holonym_use_the = true` in its placetype_data.
# A regex in placename_the_re matches the placename.
Note that "the" is added only before the first holonym in a place description.
]==]
export.placename_article = {
-- This should only contain info that can't be inferred from [[Module:place/locations]].
["archipelago"] = {
["Cyclades"] = "the",
["Dodecanese"] = "the",
},
["ประเทศ"] = {
["Holy Roman Empire"] = "the",
},
["จักรวรรดิ"] = {
["Holy Roman Empire"] = "the",
},
["เกาะ"] = {
["North Island"] = "the",
["South Island"] = "the",
},
["ภูมิภาค"] = {
["Balkans"] = "the",
["Russian Far East"] = "the",
["Caribbean"] = "the",
["Caucasus"] = "the",
["Middle East"] = "the",
["New Territories"] = "the",
["North Caucasus"] = "the",
["South Caucasus"] = "the",
["West Bank"] = "the",
["Gaza Strip"] = "the",
},
["valley"] = {
["San Fernando Valley"] = "the",
},
}
--[==[ var:
Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all
holonyms, otherwise only the regexes for the holonym's placetype apply.
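
A minimal sketch of how such a lookup might work, assuming plain Lua patterns and `string.find`; the helper `needs_the()`
and the table `sample_the_re` are hypothetical, and the real code may use Unicode-aware matching instead.

```lua
-- Hypothetical excerpt in the same shape as the table below.
local sample_the_re = {
	["*"] = {"^Isle of ", " Islands$"},
	["bay"] = {"^Bay of "},
}

-- Return true if any pattern for the holonym's placetype, or any "*" pattern, matches the placename.
local function needs_the(placetype, placename)
	for _, key in ipairs({placetype, "*"}) do
		for _, pattern in ipairs(sample_the_re[key] or {}) do
			if placename:find(pattern) then
				return true
			end
		end
	end
	return false
end

print(needs_the("bay", "Bay of Biscay"))
print(needs_the("country", "France"))
```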
]==]
export.placename_the_re = {
-- We don't need entries for peninsulas, seas, oceans, gulfs or rivers
-- because they have holonym_use_the = true.
["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "},
["bay"] = {"^Bay of "},
["ทะเลสาบ"] = {"^Lake of "},
["ประเทศ"] = {"^Republic of ", " Republic$"},
["republic"] = {"^Republic of ", " Republic$"},
["ภูมิภาค"] = {" [Rr]egion$"},
["แม่น้ำ"] = {" River$"},
["local government area"] = {"^Shire of "},
["เทศมณฑล"] = {"^Shire of "},
["Indian reservation"] = {" Reservation", " Nation"},
["tribal jurisdictional area"] = {" Reservation", " Nation"},
}
--[==[ var:
If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of
holonyms for categorization (but not display) purposes.
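
The expansion could look roughly like the sketch below. `implied_holonyms()` and `sample_implications` are hypothetical
names, and representing a holonym as a table with `placetype` and `placename` fields is an assumption made for the
example.

```lua
-- Hypothetical excerpt in the same shape as the table below.
local sample_implications = {
	["region"] = {
		["Siberia"] = {"country/Russia", "continent/Asia"},
	},
}

-- Return the extra holonyms implied by `holonyms`, to be used for categorization only.
local function implied_holonyms(holonyms)
	local extra = {}
	for _, holonym in ipairs(holonyms) do
		local by_placetype = sample_implications[holonym.placetype]
		local implied = by_placetype and by_placetype[holonym.placename]
		if implied then
			for _, spec in ipairs(implied) do
				-- Split "placetype/placename" at the first slash.
				local placetype, placename = spec:match("^(.-)/(.*)$")
				table.insert(extra, {placetype = placetype, placename = placename})
			end
		end
	end
	return extra
end

for _, holonym in ipairs(implied_holonyms({{placetype = "region", placename = "Siberia"}})) do
	print(holonym.placetype, holonym.placename)
end
```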
]==]
export.cat_implications = {
["ภูมิภาค"] = {
["Eastern Europe"] = {"continent/Europe"},
["Central Europe"] = {"continent/Europe"},
["Western Europe"] = {"continent/Europe"},
["South Europe"] = {"continent/Europe"},
["Southern Europe"] = {"continent/Europe"},
["Northern Europe"] = {"continent/Europe"},
["Northeast Europe"] = {"continent/Europe"},
["Northeastern Europe"] = {"continent/Europe"},
["Southeast Europe"] = {"continent/Europe"},
["Southeastern Europe"] = {"continent/Europe"},
["North Caucasus"] = {"continent/Europe"},
["South Caucasus"] = {"continent/Asia"},
["South Asia"] = {"continent/Asia"},
["Southern Asia"] = {"continent/Asia"},
["East Asia"] = {"continent/Asia"},
["Eastern Asia"] = {"continent/Asia"},
["Central Asia"] = {"continent/Asia"},
["West Asia"] = {"continent/Asia"},
["Western Asia"] = {"continent/Asia"},
["Southeast Asia"] = {"continent/Asia"},
["North Asia"] = {"continent/Asia"},
["Northern Asia"] = {"continent/Asia"},
["Anatolia"] = {"continent/Asia"},
["Asia Minor"] = {"continent/Asia"},
["Mesopotamia"] = {"continent/Asia"},
["North Africa"] = {"continent/Africa"},
["Central Africa"] = {"continent/Africa"},
["West Africa"] = {"continent/Africa"},
["East Africa"] = {"continent/Africa"},
["Southern Africa"] = {"continent/Africa"},
["Central America"] = {"continent/Central America"},
["Caribbean"] = {"continent/North America"},
["Polynesia"] = {"continent/Oceania"},
["Micronesia"] = {"continent/Oceania"},
["Melanesia"] = {"continent/Oceania"},
["Siberia"] = {"country/Russia", "continent/Asia"},
["Russian Far East"] = {"country/Russia", "continent/Asia"},
["South Wales"] = {"constituent country/Wales", "continent/Europe"},
["Balkans"] = {"continent/Europe"},
["West Bank"] = {"country/Palestine", "continent/Asia"},
["Gaza"] = {"country/Palestine", "continent/Asia"},
["Gaza Strip"] = {"country/Palestine", "continent/Asia"},
}
}
------------------------------------------------------------------------------------------
-- Category and display handlers --
------------------------------------------------------------------------------------------
local function city_type_cat_handler(data)
local entry_placetype = data.entry_placetype
local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities")
if not generic_before_non_cities then
internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" ..
" setting", entry_placetype)
end
local plural_entry_placetype = export.pluralize_placetype(entry_placetype)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and not spec.is_city then
-- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both
-- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.)
local cap_plural_entry_placetype = ucfirst(plural_entry_placetype)
local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th
if container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th
end
end
return retcats
end
end
local function capital_city_cat_handler(data, non_city)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
-- The first time we're called we want to return something; otherwise we will be called for later-mentioned
-- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in
-- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital
-- category/categories we add below.
local retcats
if not non_city and place_desc.holonyms then
for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do
local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename
retcats = city_type_cat_handler {
entry_placetype = "นคร",
holonym_placetype = h_placetype,
holonym_placename = h_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if retcats then
break
end
end
end
if not retcats then
retcats = {}
end
-- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we
-- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State
-- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory'
-- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's
-- an entry for 'autonomous community').
local capital_cat = export.placetype_to_capital_cat[holonym_placetype]
if not capital_cat then
capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")]
end
if capital_cat then
capital_cat = ucfirst(capital_cat)
local inserted_specific_variant_cat = false
if holonym_index then
-- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern
-- where we use :also to specify that a given city is the capital at multiple surrounding levels.
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index
for h_index = holonym_index, #place_desc.holonyms do
if place_desc.holonyms[h_index].placetype then
matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location {
holonym_placetype = place_desc.holonyms[h_index].placetype,
holonym_placename = place_desc.holonyms[h_index].unlinked_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if matching_group then
matching_holonym_index = h_index
break
end
end
end
if matching_holonym_index == holonym_index then
if matching_container_trail[1] and not matching_spec.no_container_cat then
for _, container in ipairs(matching_container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
elseif matching_holonym_index then
-- Check to make sure that the holonym placetype we were called on is listed among the
-- divtypes of the location we found.
local function insert_specific_variant_if_possible(key, spec)
return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
local plural_holonym_placetype = export.pluralize_placetype(pt)
local saw_matching_div
if spec.divs then
local divs = spec.divs
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) ~= "table" then
div = {type = div}
end
if plural_holonym_placetype == div.type then
saw_matching_div = true
break
end
end
end
if saw_matching_div then
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
return true
end
return false
end)
end
if insert_specific_variant_if_possible(matching_key, matching_spec) then
inserted_specific_variant_cat = true
elseif not matching_spec.no_container_cat then
for _, containers in ipairs(matching_container_trail) do
local saw_no_container_cat = false
for _, container in ipairs(containers) do
if insert_specific_variant_if_possible(container.key, container.spec) then
inserted_specific_variant_cat = true
break
end
saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
end
if inserted_specific_variant_cat or saw_no_container_cat then
break
end
end
end
end
else
-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
end
if not inserted_specific_variant_cat then
insert(retcats, capital_cat)
end
else
-- We didn't recognize the holonym placetype; just put in 'Capital cities'.
insert(retcats, "เมืองหลวง")
end
return retcats
end
--[=[
This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used
in two ways:
# To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and
[[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym.
# To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym
description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this
case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments,
administrative regions, and for the entire country, and for example we only want to categorize a demonym into
[[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym
invocation only adds the most specific holonym category and not the category of any containing polity (hence if we
add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]).
This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston`
as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and
[[:Category:en:สถานที่ในthe United States]]. The city handler tries to deal with the possibility of multiple cities
having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]],
[[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing
polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`).
Otherwise, if some other state or country is mentioned, the handler returns nothing; if no state or country is mentioned
at all, it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly even
if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because the function
`augment_holonyms_with_container` adds the US as a holonym when Ohio is encountered.
The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding
language code).
]=]
local function generic_place_cat_handler(data)
local from_demonym = data.from_demonym
local retcats = {}
local function insert_retkey(key, spec)
if from_demonym then
insert(retcats, key)
else
insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec)))
end
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
if not spec.no_generic_place_cat then
-- This applies to continents and continental regions.
insert_retkey(key, spec)
end
-- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in
-- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when
-- no_container_cat is set (e.g. for 'United Kingdom').
if not spec.no_container_cat then
for _, container_set in ipairs(container_trail) do
local stop_adding_containers = false
for _, container in ipairs(container_set) do
if not container.spec.no_generic_place_cat then
insert_retkey(container.key, container.spec)
end
if container.spec.no_container_cat then
stop_adding_containers = true
end
end
if stop_adding_containers then
break
end
end
end
return retcats
end
end
--[==[
Special category handler run for all placetypes that checks for specified division placetypes of known locations and
categorizes appropriately.
]==]
function export.political_division_cat_handler(data)
if data.from_demonym then
return
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
local sgdiv = export.maybe_singularize_placetype(div.type) or div.type
local prep = div.prep or "ของ"
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
if not export.placetype_data[sgdiv] then
internal_error("Placetype %s associated with known location key %s and data %s not found in " ..
"`placetype_data`", sgdiv, key, spec)
end
if sgdiv == data.entry_placetype then
local retcats = {}
for _, pt_cat in ipairs(cat_as) do
if type(pt_cat) == "string" then
pt_cat = {type = pt_cat}
end
local pt_prep = pt_cat.prep or prep
insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th
end
return retcats
end
end
end
end
end
--[==[
This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any
foreign-language terms that are translations of the state of Georgia. We look at the page title (or its overridden value
in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the
modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the
entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the
country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized
into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to
make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is
intended for Newark, New Jersey).
]==]
function export.get_bare_categories(args, overall_place_spec)
local bare_cats = {}
local place_descs = overall_place_spec.descs
local possible_placetypes_by_place_desc = {}
for i, place_desc in ipairs(place_descs) do
possible_placetypes_by_place_desc[i] = {}
for _, placetype in ipairs(place_desc.placetypes) do
if not export.placetype_is_ignorable(placetype) then
local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true})
for _, equiv in ipairs(equivs) do
insert(possible_placetypes_by_place_desc[i], equiv.placetype)
end
end
end
end
local function check_term(term)
-- Treat Wikipedia links like local ones.
term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
term = export.remove_links_and_html(term)
term = term:gsub("^the ", "")
for i, place_desc in ipairs(place_descs) do
-- Iterate over all matching locations in case there are multiple, as with Delhi defined as
-- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}.
for group, key, spec, container_trail in export.iterate_matching_holonym_location {
holonym_placetype = possible_placetypes_by_place_desc[i],
holonym_placename = term,
place_desc = place_desc,
} do
insert(bare_cats, key)
end
end
end
-- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)?
-- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There
-- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the
-- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The
-- advantage of checking when the language isn't English is we catch those places that fail to give an English
-- translation but where the translation happens to be the same as the other-language spelling. However, I don't
-- know how often this situation occurs.
check_term(args.pagename or mw.title.getCurrentTitle().subpageText)
for _, t in ipairs(args.t) do
check_term(t)
end
local function check_termobj_list(terms)
for _, term in ipairs(terms) do
if term.eq then
check_term(term.eq)
end
if term.alt or term.term then
check_term(term.alt or term.term)
end
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
local arg = extra_info_terms.arg
if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then
check_termobj_list(extra_info_terms.terms)
end
end
for _, directive in ipairs(overall_place_spec.directives) do
check_termobj_list(directive.terms)
end
return bare_cats
end
--[==[
This is used to augment the holonyms associated with a place description with the containing polities. For example,
given the following:
`# {{tl|place|en|subprefecture|pref/Hokkaido}}.`
We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]].
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
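
A toy sketch of the right-to-left augmentation follows. It is illustrative only: `toy_containers` and `toy_augment` are hypothetical names, the real code resolves containers through `find_matching_holonym_location()` rather than a flat table, and the check for conflicting countries is omitted here.

```lua
-- Hypothetical lookup from "placetype/placename" to its chain of containers.
local toy_containers = {
	["pref/Hokkaido"] = {{placetype = "country", placename = "Japan"}},
}

-- Return a copy of `holonyms` with container holonyms inserted right after the
-- holonym they were derived from, working from right to left.
local function toy_augment(holonyms)
	for i = #holonyms, 1, -1 do
		local source = holonyms[i]
		local containers = toy_containers[source.placetype .. "/" .. source.placename]
		if containers then
			local augmented = {}
			for j = 1, i do
				table.insert(augmented, holonyms[j])
			end
			for _, container in ipairs(containers) do
				table.insert(augmented, {
					placetype = container.placetype,
					placename = container.placename,
					-- Remember which holonym this one was derived from.
					augmented_from = source,
				})
			end
			for j = i + 1, #holonyms do
				table.insert(augmented, holonyms[j])
			end
			-- Later (leftward) iterations see the augmented list.
			holonyms = augmented
		end
	end
	return holonyms
end

local result = toy_augment({{placetype = "pref", placename = "Hokkaido"}})
```

Here `result` holds the original `pref/Hokkaido` holonym followed by an added `country/Japan` holonym that records its source, roughly mirroring the role of `augmented_from_holonym` in the real implementation.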
]==]
function export.augment_holonyms_with_container(place_descs)
for _, place_desc in ipairs(place_descs) do
if place_desc.holonyms then
-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
-- modifier, which causes category processing to restart at that point after generating categories for a
-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do
local holonym = place_desc.holonyms[i]
if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then
local group, key, spec, container_trail = export.find_matching_holonym_location {
holonym_placetype = holonym.placetype,
holonym_placename = holonym.unlinked_placename,
holonym_index = i,
place_desc = place_desc,
}
if group and container_trail[1] and not spec.no_auto_augment_container then
local augmented_holonyms = {}
for j = 1, i do
insert(augmented_holonyms, place_desc.holonyms[j])
end
for _, containers in ipairs(container_trail) do
local any_no_auto_augment_container = false
for _, container in ipairs(containers) do
any_no_auto_augment_container = any_no_auto_augment_container or
container.spec.no_auto_augment_container
local containing_type = container.spec.placetype
if type(containing_type) == "table" then
-- If the containing type is a list, use the first element as the canonical variant.
containing_type = containing_type[1]
end
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
-- Don't side-effect holonyms while processing them.
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to
-- set display_placename.
placetype = containing_type,
-- placename_to_key() for the group should correctly handle both full and elliptical
-- placenames, but the full placename seems less likely to be ambiguous. FIXME: We
-- should just store the key directly and use it when available to avoid having to
-- convert key to placename and back to key.
unlinked_placename = full_container_placename,
-- Indicate that this is an augmented holonym, and was derived from the specified
-- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms
-- derived from holonyms that are different from the holonym we're searching for but
-- of the same placetype. This is to correctly handle a situation like
-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
-- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match
-- in find_matching_holonym_location() because of the mismatch between augmented
-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
-- `Ardèche`) against any known location. To deal with this, we mark augmented
-- holonyms as being augmented due to a source holonym, and when processing a given
-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
-- The restriction to the same placetype is so that `Birmingham` still gets
-- correctly disambiguated to Birmingham, England in the example given above near
-- the top of this function, using the augmented holonym `c/United Kingdom` added by
-- the specified `cc/England` (whose placetype `constituent country` differs from
-- the placetype `city` of Birmingham).
augmented_from_holonym = holonym,
}
insert(augmented_holonyms, new_holonym)
-- But it is safe to modify other parts of the place_desc.
export.key_holonym_into_place_desc(place_desc, new_holonym)
end
if any_no_auto_augment_container then
break
end
end
for j = i + 1, #place_desc.holonyms do
insert(augmented_holonyms, place_desc.holonyms[j])
end
place_desc.holonyms = augmented_holonyms
end
end
end
end
end
end
-- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions. We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
-- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
-- categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
-- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
-- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
-- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
-- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
-- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and
-- note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
-- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
-- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
-- categorization.
local function district_neighborhood_cat_handler(data)
local function get_plural_entry_placetype(location_spec, container_trail)
if data.entry_placetype == "suburb" then
return "Suburbs"
else
-- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling
if uses_british_spelling == nil and container_trail then
for _, container_set in ipairs(container_trail) do
local must_outer_break = false
for _, container in ipairs(container_set) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
end
end
-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
-- etc.)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and spec.is_city then
return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
end
-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
local has_neighborhoods
local entry_placetype = data.entry_placetype
if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
has_neighborhoods = true
else
-- Otherwise, make sure the current holonym is city-like.
has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
return export.get_placetype_prop(pt, "has_neighborhoods")
end, {continue_on_nil_only = true})
end
if has_neighborhoods then
-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". We need
-- to start with the current holonym, which is especially important for neighborhoods and suburbs that
-- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously
-- we skipped the first/current holonym.)
for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc,
data.holonym_index) do
local other_holonym_data = {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = data.place_desc,
}
local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data)
if group and not spec.is_former_place then
return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") ..
export.get_prefixed_key(key, spec)}
end
end
end
end
function export.check_already_seen_string(holonym_placename, already_seen_strings)
local canon_placename = ulower(m_links.remove_links(holonym_placename))
if type(already_seen_strings) ~= "table" then
already_seen_strings = {already_seen_strings}
end
for _, already_seen_string in ipairs(already_seen_strings) do
if canon_placename:find(already_seen_string) then
return true
end
end
return false
end
-- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display
-- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already.
-- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or
-- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym
-- placename, ignoring case and links. If the prefix isn't already present, we create a link that
-- uses the raw form as the link destination but the prefixed form as the display form, unless the
-- holonym already has a link in it, in which case we just add the prefix.
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return prefix .. " " .. holonym_placename
end
return prefix .. " [[" .. holonym_placename .. "]]"
end
-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return holonym_placename .. " " .. suffix
end
if include_suffix_in_link then
return "[[" .. holonym_placename .. " " .. suffix .. "]]"
else
return "[[" .. holonym_placename .. "]] " .. suffix
end
end
-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.new_york_boroughs[unlinked_placename] then
-- Hack: don't display "borough" after the names of NYC boroughs
return holonym_placename
end
return suffix_display_handler("borough", holonym_placename)
end
local function county_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
m_locations.northern_ireland_counties["County " .. unlinked_placename .. ", Northern Ireland"] then
return prefix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County".
if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County".
if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- FIXME, we need the same for US counties but need to key off the country, not the specific county.
-- Others are displayed as-is.
return holonym_placename
end
-- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture".
-- Others are displayed as e.g. "[[Fthiotida]] prefecture".
local function prefecture_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture"
return suffix_display_handler(suffix, holonym_placename)
end
-- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized
-- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is.
local function province_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if
m_locations.iran_provinces[unlinked_placename .. ", Iran"] or
m_locations.laos_provinces[unlinked_placename .. ", Laos"] or
m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or
m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or
m_locations.thailand_provinces[unlinked_placename .. ", ไทย"] or
m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or
m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then
return suffix_display_handler("จังหวัด", holonym_placename)
end
return holonym_placename
end
-- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is.
local function state_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then
return suffix_display_handler("รัฐ", holonym_placename)
end
return holonym_placename
end
-- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]].
local function voivodesip_display_handler(holonym_placetype, holonym_placename)
return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link")
end
------------------------------------------------------------------------------------------
-- Placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are
placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value
is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form
`สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are
used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the
specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories
like [[:Category:States and territories of Australia]]).
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the
placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are
wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed
directly in the placetype data; everything else is handled through category handlers, either per-placetype or special
(such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate
categories is described at the top of [[Module:place]].
There are several recognized property keys, of various types:
1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
placetypes (which may occur in entry placetypes, esp. new-format ones such as `two <<islands>>`, and may occur in
categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
`<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign. For example, if `cercle` specifies
`"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified.
*# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it
will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as
`<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given.
*# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is
called on the string, which will correctly pluralize most strings, including those with links in them. (If there
are multiple links, the display form of the last link is pluralized.)
*# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as
an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with
the qualifiers `former`, `ancient`, `historical` and such.
* `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of
the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the
value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays
as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if
this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs
especially with multiword placetypes where something other than the last word is pluralized. An example is
`town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. This example uses
`link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian
Bokmål word, and template calls aren't currently permitted in link strings), along with
`plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to
the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only
placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of
`category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it,
spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the
value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which
just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a
separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which
clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories
where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides
`category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides
`category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. If given, this overrides `category_link` for
this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol>
<li>`link = true`</li>
<li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li>
<li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li>
</ol> This has the effect of making the entry placetype `neighborhood` display as just
`<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like
`Neighborhoods in Illinois, USA` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`.
* `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
* `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
2. There is currently one fallback-related property key recognized:
* `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories
get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets
`preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi`
(whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex
example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that
checks for cities or city-like entities (e.g. boroughs) occurring as holonyms and categorizes the toponym under
[[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods in POLDIV]] (for
the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if
`c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these
categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be
categorized as e.g. [[:Category:Geographic and cultural areas of England]].
3. There is currently one property to control irregular plurals of placetypes:
* `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in
[[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`,
`-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent;
for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized
as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value
even when the default pluralization algorithm works correctly, if the default singularization algorithm won't
correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`).
4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those
categories:
* `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by
a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated
alike); (b) to determine the parent category of bare placetype categories (e.g. [[:Category:Villages]] for placetype
`village`); (c) to determine whether to add a parent category `political divisions of specific countries` to
qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are:
*# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire.
*# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement.
*# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a
settlement, such as wards and barangays.
*# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an
unincorporated community, farm or neighborhood.
*# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital
any more.
*# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc.
*# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university,
metro station, park or the like.
*# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary
greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`,
etc. qualifier has no effect on the category of these placetypes.
*# `generic place`: a place that isn't further qualified into any specific subtype.
* `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`,
`ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of
`dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc.
qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class`
is used. When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those
in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers
(one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are
looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map
`medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes
`ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by
`get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default
category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where
`kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data`
for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is
used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data`
but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal
error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have
an entry for `greenhouse`), we just track the occurrence and end up not categorizing.
* `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the
placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for
placetype `named buildings!`). If not specified, the first parent category is determined by the value of `class`,
using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]].
* `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent`
just above).
* `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of
`bare_category_parent` if it is a string.
* `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or
`ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that
always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is
a list of internal qualifiers (one or both of `ANCIENT` and `FORMER`), just as for `former_qualifiers`, and the
implementation is the same.
* `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the
placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which
category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`,
`neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like
`Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like
`Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized
city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers
iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or
more categories are returned; see below for more specifics. (Note that countries for which e.g. a `district` is a
political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but
by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the
resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The
return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the
holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype
with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields:
** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an
entry in `placetype_data` but may not be the original placetype given by the user);
** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed;
** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME:
we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms);
** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]];
** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or
{{tl|demonym-noun}}, instead of being triggered by {{tl|place}}.
* `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the
`district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `;
see the section just above on `cat_handler`.
5. The following preposition-related property keys are recognized:
* `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`.
* `generic_before_non_cities`: If specified, the appropriate category description handler in
[[Module:category tree/topic cat/data/Places]] will recognize categories of the form
` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This
is used to generate descriptions for categories added by category handlers and by explicit category specs in the
placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify
a value for `class` so that the category tree code can determine whether it's a political or non-political division.
* `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities.
6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype:
* `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values
are:
*# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly
follows an entry placetype);
*# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple);
*# `"suf"` (the holonym will display as `Holonym placetype`);
*# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized).
* `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym.
Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly
using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype
`administrative region` specifies `suffix = "ภูมิภาค"`.
* `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym.
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the
holonym. If `affix` is given together with `suffix` or `prefix` for a single placetype, `suffix` or `prefix` takes
precedence.
* `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix
requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies
`affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies
`no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified,
without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word.
* `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym).
Its return value is a string specifying the display form of the holonym.
7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms
of the specified placetype.
* `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype.
* `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry
placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article
`"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins
with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent
placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype
specified.
* `holonym_use_the`: Use `"the"` before holonyms of this placetype.
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which
must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a
fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the
fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either
directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this
placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back
to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in
`placetype_data` or an internal error occurs.
]==]
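-- Illustrative sketch only; nothing in this module calls it. It shows one way the `former_type`/`class` lookup
-- described in the comment above could be written. The function names `example_resolve_former_class` and
-- `example_former_category` are hypothetical, `data` stands in for `placetype_data`, `qualifier_map` stands in for
-- `former_qualifiers`, and category specs are simplified to plain strings. The real logic in [[Module:place]] may
-- differ in detail.
local function example_resolve_former_class(data, placetype)
	-- Follow `fallback` links until a `former_type` or `class` is found.
	local seen = {}
	local current = placetype
	while current do
		if seen[current] then
			error("Internal error: fallback loop involving placetype '" .. current .. "'")
		end
		seen[current] = true
		local props = data[current]
		if not props then
			if current == placetype then
				-- Unrecognized placetype (e.g. `greenhouse`): the caller tracks it and does not categorize.
				return nil
			end
			error("Internal error: fallback '" .. current .. "' does not exist in the placetype data")
		end
		local former_class = props.former_type or props.class
		if former_class then
			return former_class
		end
		current = props.fallback
	end
	error("Internal error: placetype '" .. placetype .. "' has no `former_type` or `class`")
end

local function example_former_category(data, qualifier_map, qualifier, placetype)
	-- Map e.g. `medieval` to {"ANCIENT", "FORMER"}, then try `ANCIENT <class>` and `FORMER <class>` in that order.
	local internal_qualifiers = qualifier_map[qualifier]
	local former_class = internal_qualifiers and example_resolve_former_class(data, placetype)
	if not former_class then
		return nil
	end
	for _, internal_qualifier in ipairs(internal_qualifiers) do
		local props = data[internal_qualifier .. " " .. former_class]
		if props and props.default then
			return props.default[1]
		end
	end
	return nil
end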
export.placetype_data = {
--[=[
If you need to sort the following, do this (using Vim):
1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line
entries.
2. Make sure the table uses tabs everywhere for indent, and not spaces.
3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
:'a,.s/\n/\\n/g
:s/\\n\(\t\[\)/\r\1/g
The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while
the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to
a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one
command.)
4. Execute the following to sort:
:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station"
before "hill" and "county borough" before "county" because the space after e.g. "hill station" sorts before the
quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by
` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern).
5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
:'a,.s/\\n/\r/g
Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but
to insert a newline on the right side of a replacement you must use \r.
]=]
["*"] = {
link = false,
cat_handler = generic_place_cat_handler,
},
["administrative atoll"] = {
-- Maldives
link = "+w:administrative divisions of the Maldives",
preposition = "ของ",
class = "subpolity",
},
["administrative capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["administrative center"] = {
link = "w",
fallback = "เมืองหลวงที่ไม่ใช่นคร",
},
["administrative centre"] = {
link = "w",
fallback = "administrative center",
},
["administrative county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["administrative district"] = {
link = "w",
fallback = "อำเภอ",
},
["administrative headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["administrative region"] = {
link = true,
preposition = "ของ",
suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
fallback = "ภูมิภาค",
class = "subpolity",
},
["administrative seat"] = {
link = "w",
fallback = "administrative centre",
},
["administrative territory"] = {
link = "separately",
preposition = "ของ",
suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
fallback = "ดินแดน",
class = "subpolity",
},
["administrative unit"] = {
-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
link = "w",
class = "subpolity",
},
["administrative village"] = {
link = "w",
preposition = "ของ",
has_neighborhoods = true,
class = "settlement",
},
["aimag"] = {
-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
link = "w",
fallback = "prefecture",
},
["airport"] = {
link = true,
class = "man-made structure",
default = {true},
},
["alliance"] = {
link = true,
fallback = "confederation",
},
["archipelago"] = {
link = true,
fallback = "เกาะ",
},
["area"] = {
link = true,
preposition = "ของ",
fallback = "geographic and cultural area",
-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
class = "subpolity",
former_type = "geographic region",
cat_handler = district_neighborhood_cat_handler,
},
["arm"] = {
link = true,
preposition = "ของ",
class = "natural feature",
default = {"ทะเล"},
},
["arrondissement"] = {
link = true,
preposition = "ของ",
-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity",
has_neighborhoods = true,
},
["associated province"] = {
link = "separately",
fallback = "จังหวัด",
},
["atoll"] = {
-- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to
-- conditionalize `class` on the country. See also `administrative atoll`.
link = true,
class = "natural feature",
bare_category_parent = "เกาะ",
default = {true},
},
["autonomous city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
has_neighborhoods = true,
},
["autonomous community"] = {
-- Spain; refers to regional entities, not village-like entities, as might be expected from "community"
link = true,
preposition = "ของ",
class = "subpolity",
},
["autonomous island"] = {
-- Comoros; seems similar to an administrative atoll of the Maldives.
link = "+w:autonomous islands of Comoros",
preposition = "ของ",
class = "subpolity",
},
["autonomous oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "oblast",
class = "subpolity",
},
["autonomous okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "okrug",
class = "subpolity",
},
["autonomous prefecture"] = {
link = true,
fallback = "prefecture",
},
["autonomous province"] = {
link = "w",
fallback = "จังหวัด",
},
["autonomous region"] = {
link = "w",
preposition = "ของ",
fallback = "administrative region",
-- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region"
-- if the user writes 'ar:Suf/Tibet'.
affix = "autonomous region",
},
["autonomous republic"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territorial unit"] = {
-- Moldova; only two of them, one for Gagauzia and one for Transnistria.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territory"] = {
link = "w",
fallback = "dependent territory",
},
["bailiwick"] = {
-- Jersey, etc.
link = true,
fallback = "องค์การทางการเมือง",
},
["barangay"] = {
-- Philippines
link = true,
class = "settlement",
-- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use
-- some of the properties of a neighborhood.
fallback = "neighborhood",
},
["barrio"] = {
-- Spanish-speaking countries; Philippines
link = true,
-- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city.
-- `class` will need to conditionalize on the country to be completely correct.
fallback = "neighborhood",
},
["basin"] = {
link = true,
fallback = "ทะเลสาบ",
},
["bay"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["beach"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"water"},
default = {true},
},
["beach resort"] = {
link = "w",
fallback = "resort town",
},
["bishopric"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["bodies of water!"] = {
-- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to
-- straighten out the type vs. name vs. related-to issue.
category_link = "[[body of water|bodies of water]]",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"},
},
["borough"] = {
link = true,
preposition = "ของ",
display_handler = borough_display_handler,
has_neighborhoods = true,
-- "former borough" could be a former settlement or a former part of a city but seems more likely to
-- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this
-- properly.
class = "subpolity",
-- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger.
},
["borough seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["branch"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["bridge"] = {
link = true,
class = "man-made structure",
default = {"Named bridges"},
},
["building"] = {
link = true,
class = "man-made structure",
default = {"Named buildings"},
},
["built-up area"] = {
link = "w",
fallback = "area",
},
["burgh"] = {
link = true,
fallback = "borough",
},
["business park"] = {
link = true,
fallback = "park",
},
["caliphate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["canton"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["cape"] = {
link = true,
fallback = "headland",
},
["capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["เมืองหลวง"] = {
link = true,
category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
bare_category_parent = "นคร",
cat_handler = capital_city_cat_handler,
default = {true},
-- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}}
-- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't
-- match against the placetype 'city' of Melbourne.
fallback = "นคร",
},
["caplc"] = {
link = "[[capital]] and [[large]]st [[city]]",
plural_link = false,
fallback = "เมืองหลวง",
},
["captaincy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["caravan city"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"ANCIENT", "FORMER"},
},
["castle"] = {
link = true,
fallback = "building",
},
["cathedral city"] = {
link = true,
fallback = "นคร",
},
["cattle station"] = {
-- Australia
link = true,
fallback = "farm",
},
["census area"] = {
link = true,
affix_type = "Suf",
has_neighborhoods = true,
class = "non-admin settlement",
},
["census-designated place"] = {
-- United States
link = true,
class = "non-admin settlement",
},
["census division"] = {
-- Canada
link = "w",
preposition = "ของ",
class = "subpolity",
},
["census town"] = {
link = "w",
fallback = "เมือง",
},
["central business district"] = {
link = true,
fallback = "neighborhood",
},
["cercle"] = {
-- Mali
link = "+w:cercles of Mali",
preposition = "ของ",
class = "subpolity",
},
["ceremonial county"] = {
link = true,
fallback = "เทศมณฑล",
},
["chain of islands"] = {
link = "[[chain]] of [[island]]s",
plural = "chains of islands",
plural_link = "[[chain]]s of [[island]]s",
fallback = "เกาะ",
},
["channel"] = {
link = true,
fallback = "strait",
},
["charter community"] = {
-- Northwest Territories, Canada
link = "w",
fallback = "village",
},
["นคร"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "Cities in +++", "Countries in +++", "National capitals"},
default = {"City-states", "นคร", "ประเทศ", "National capitals"},
},
["civil parish"] = {
-- Mostly England; similar to municipalities
link = true,
preposition = "ของ",
affix_type = "suf",
has_neighborhoods = true,
class = "subpolity",
},
["claimed political division"] = {
link = "[[claim]]ed [[political]] [[division]]",
class = "subpolity",
default = {true},
},
["co-capital"] = {
link = "[[co-]][[capital]]",
fallback = "เมืองหลวง",
},
["coal city"] = {
link = "+w:coal town",
fallback = "นคร",
},
["coal town"] = {
link = "w",
fallback = "เมือง",
},
["collectivity"] = {
link = "w",
preposition = "ของ",
-- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities)
class = "subpolity",
},
["colony"] = {
link = true,
fallback = "dependent territory",
},
["comarca"] = {
-- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of
-- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it
-- sits between municipalities and provinces, something like a county or district.
link = true,
preposition = "ของ",
class = "subpolity",
},
["commandery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["commonwealth"] = {
link = true,
preposition = "ของ",
-- No default; applies specifically to Puerto Rico
class = "subpolity",
},
["commune"] = {
link = true,
fallback = "เทศบาล",
},
["community"] = {
link = true,
category_link = "[[community|communities]] of all sizes",
fallback = "village",
},
["community development block"] = {
-- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be
-- neighborhoods so I'm not setting `has_neighborhoods` for now
link = "w",
affix_type = "suf",
no_affix_strings = "block",
class = "subpolity",
},
["comune"] = {
-- Italy, Switzerland
link = true,
fallback = "เทศบาล",
},
["condominium"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["confederacy"] = {
link = true,
fallback = "confederation",
},
["confederation"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["constituency"] = {
-- currently we have them as political divisions of Namibia but many countries have them
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent country"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent part"] = {
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["constituent republic"] = {
-- Of Russia, Yugoslavia, etc.
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["counties and county-level cities!"] = {
-- This is used when grouping counties and county-level cities under prefecture-level cities in China.
category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
class = "subpolity",
},
["continent"] = {
link = true,
category_link = false, -- can't occur as a bare category
class = "natural feature",
default = {"Continents and continental regions"},
},
["continental region"] = {
link = "separately",
category_link = false, -- can't occur as a bare category
class = "geographic region",
fallback = "continent",
},
["continents and continental regions!"] = {
category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
class = "geographic region",
},
["council area"] = {
link = true,
-- in Scotland; similar to a county
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["ประเทศ"] = {
link = true,
class = "polity", -- do not translate `class`
["continent/*"] = {true, "ประเทศ"},
default = {true},
},
["country-like entities!"] = {
category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
class = "polity", -- do not translate `class`
},
["เทศมณฑล"] = {
link = true,
preposition = "ของ",
display_handler = county_display_handler,
class = "subpolity",
},
["county borough"] = {
link = true,
-- in Wales; similar to a county
preposition = "ของ",
affix_type = "suf",
fallback = "borough",
class = "subpolity",
},
["county seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["county town"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
fallback = "เมือง",
has_neighborhoods = true,
class = "capital",
},
["county-administered city"] = {
-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w",
fallback = "นคร",
has_neighborhoods = true,
class = "settlement",
},
["county-controlled city"] = {
-- Taiwan
link = "w",
fallback = "county-administered city",
},
["county-level city"] = {
-- PR China
link = "w",
fallback = "prefecture-level city",
},
["crater lake"] = {
link = true,
fallback = "ทะเลสาบ",
},
["creek"] = {
link = true,
fallback = "stream",
},
["Crown colony"] = {
link = "+crown colony",
fallback = "crown colony",
},
["crown colony"] = {
link = true,
fallback = "colony",
},
["Crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["cultural area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["cultural region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["delegation"] = {
-- Tunisia
link = "+w:delegations of Tunisia",
preposition = "ของ",
class = "subpolity",
},
["department"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["departmental capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dependency"] = {
link = true,
fallback = "dependent territory",
},
["dependent territory"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
former_type = "dependent territory",
bare_category_parent = "political divisions",
["country/*"] = {true},
default = {true},
},
["desert"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems"},
default = {true},
},
["deserted mediaeval village"] = {
link = "w",
fallback = "deserted medieval village",
},
["deserted medieval village"] = {
link = "w",
fallback = "ANCIENT settlement",
},
["direct-administered municipality"] = {
-- China
link = "+w:direct-administered municipalities of China",
fallback = "เทศบาล",
},
["direct-controlled municipality"] = {
-- several countries
link = "w",
fallback = "เทศบาล",
},
["distributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["อำเภอ"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to
-- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class
-- is "settlement" or "subpolity".
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
-- No default. Countries for which districts are political divisions will get entries.
},
["districts and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Portugal.
category_link = "[[district]]s and [[autonomous region]]s",
class = "subpolity",
},
["districts and autonomous territorial units!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Moldova.
category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s",
class = "subpolity",
},
["district capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["district headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["district municipality"] = {
-- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in
-- South Africa, district municipalities group local municipalities and hence won't have neighborhoods.
link = "w",
preposition = "ของ",
affix_type = "suf",
no_affix_strings = {"อำเภอ", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["division"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["division capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dome"] = {
link = true,
fallback = "ภูเขา",
},
["dormant volcano"] = {
link = true,
fallback = "volcano",
},
["duchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["emirate"] = {
link = true,
preposition = "ของ",
-- FIXME: Can be subpolities (of the United Arab Emirates).
fallback = "องค์การทางการเมือง",
},
["จักรวรรดิ"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["enclave"] = {
link = true,
preposition = "ของ",
-- Enclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["entity"] = {
-- Bosnia and Herzegovina
link = "+w:entities of Bosnia and Herzegovina",
preposition = "ของ",
class = "subpolity",
},
["escarpment"] = {
link = true,
fallback = "ภูเขา",
},
["ethnographic region"] = {
-- used in Lithuania
link = "+w:ethnographic regions of Lithuania",
fallback = "geographic and cultural area",
},
["exclave"] = {
link = true,
preposition = "ของ",
-- Exclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["external territory"] = {
link = "separately",
fallback = "dependent territory",
},
["farm"] = {
link = true,
class = "non-admin settlement",
default = {"Farms and ranches"},
},
["farms and ranches!"] = {
category_link = "[[farm]]s and [[ranch]]es",
class = "non-admin settlement",
},
["federal city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
},
["federal district"] = {
link = true,
preposition = "ของ",
-- Might have neighborhoods as federal districts are often cities (e.g. Mexico City)
has_neighborhoods = true,
class = "settlement",
},
["federal subject"] = {
-- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais,
-- autonomous okrugs and autonomous oblasts).
link = "w",
preposition = "ของ",
class = "subpolity",
},
["federal territory"] = {
link = "w",
fallback = "ดินแดน",
},
["fictional location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["First Nations reserve"] = {
-- Canada
link = "[[First Nations]] [[w:Indian reserve|reserve]]",
-- Wikipedia uses "Indian reserve"; presumably that is the legal term
fallback = "Indian reserve",
class = "subpolity",
},
["fjord"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["footpath"] = {
link = true,
fallback = "road",
},
["forest"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems", "forestry"},
default = {true},
},
["fort"] = {
link = true,
fallback = "building",
},
["fortress"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- fortresses -> fortresse, so put an entry here to ensure we singularize correctly.
plural = "fortresses",
fallback = "building",
},
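--[==[
Illustrative sketch only, not part of this module's logic. The comment on `fortress` above says an explicit
`plural` field is needed because automatic singularization gets the word wrong. The snippet below shows one way
such an override could work. The names `overrides`, `register_plural`, `naive_singularize` and `singularize` are
hypothetical, and the "strip one trailing s" rule is an assumption standing in for the real algorithm, which
lives elsewhere and is not shown here.

```lua
-- Hypothetical lookup: plural form -> singular form, filled from explicit `plural` fields.
local overrides = {}

local function register_plural(singular, plural)
	overrides[plural] = singular
end

-- Assumed naive rule: drop a single trailing "s".
local function naive_singularize(word)
	-- Parentheses discard gsub's second return value (the match count).
	return (word:gsub("s$", ""))
end

-- Consult the explicit overrides first, then fall back to the naive rule.
local function singularize(word)
	return overrides[word] or naive_singularize(word)
end

register_plural("fortress", "fortresses")
print(naive_singularize("fortresses")) -- fortresse
print(singularize("fortresses"))       -- fortress
print(singularize("forts"))            -- fort
```

The same pattern would cover `mountain pass` and `pass`, whose entries carry explicit `plural` fields for the
same reason.
]==]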
["frazione"] = {
link = "w",
fallback = "hamlet",
},
["freeway"] = {
link = true,
fallback = "road",
},
["French prefecture"] = {
link = "[[w:prefectures in France|prefecture]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["geographic and cultural area"] = {
link = "+w:cultural area",
-- `generic_before_non_cities` is used when generating the category description of categories of the format
-- `Geographic and cultural areas of PLACE`. `preposition` is used when generating {{place}} description and
-- categories for any placetype that falls back to `geographic and cultural area`.
generic_before_non_cities = "ของ",
preposition = "ของ",
class = "geographic region",
bare_category_parent = "สถานที่",
["country/*"] = {true},
["constituent country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["geographic area"] = {
link = "+w:geographic region",
fallback = "geographic and cultural area",
},
["geographic region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geopolitical zone"] = {
-- Nigeria
link = true,
preposition = "ของ",
class = "subpolity",
},
["gewog"] = {
-- Bhutan
link = true,
preposition = "ของ",
class = "subpolity",
},
["ghost town"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
bare_category_parent = "former settlements",
cat_handler = city_type_cat_handler,
default = {true},
},
["glen"] = {
link = true,
fallback = "valley",
},
["governorate"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["greater administrative region"] = {
-- China (former division)
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["gromada"] = {
-- Poland (former division)
link = "w",
preposition = "ของ",
affix_type = "Pref",
class = "subpolity",
inherently_former = {"FORMER"},
},
["group of islands"] = {
link = "[[group]] of [[island]]s",
plural = "groups of islands",
plural_link = "[[group]]s of [[island]]s",
fallback = "island group",
},
["gulf"] = {
link = true,
preposition = "ของ",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["hamlet"] = {
link = true,
fallback = "village",
},
["harbor city"] = {
link = "separately",
fallback = "นคร",
},
["harbor town"] = {
link = "separately",
fallback = "เมือง",
},
["harbour city"] = {
link = "separately",
fallback = "นคร",
},
["harbour town"] = {
link = "separately",
fallback = "เมือง",
},
["headland"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["headquarters"] = {
link = "w",
fallback = "administrative centre",
},
["heath"] = {
link = true,
fallback = "moor",
},
["hemisphere"] = {
link = true,
entry_placetype_use_the = true,
fallback = "continental region",
},
["highway"] = {
link = true,
fallback = "road",
},
["hill"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["hill station"] = {
link = "w",
fallback = "เมือง",
},
["hill town"] = {
link = "w",
fallback = "เมือง",
},
["historic region"] = {
-- provided only for the link
link = "+w:historical region",
fallback = "FORMER geographic region",
},
["historical county"] = {
-- needed for historical counties of England/etc.
link = "+w:historic county",
fallback = "FORMER subpolity",
},
["historical region"] = {
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["home rule city"] = {
link = "w",
fallback = "นคร",
},
["home rule municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["hot spring"] = {
link = true,
fallback = "spring",
},
["house"] = {
link = true,
fallback = "building",
},
["housing estate"] = {
-- not the same as a housing project (i.e. public housing)
link = true,
-- not exactly the case but approximately
fallback = "neighborhood",
},
["hromada"] = {
-- Ukraine
link = "w",
disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'",
disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["inactive volcano"] = {
link = "w",
fallback = "dormant volcano",
},
["independent city"] = {
link = true,
fallback = "นคร",
},
["independent town"] = {
link = "+independent city",
fallback = "เมือง",
},
["Indian reservation"] = {
link = "w",
-- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations
-- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts
-- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is,
-- so this must still be the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["Indian reserve"] = {
link = "w",
-- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that
-- is still the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["inland sea"] = {
-- note, we also have 'inland' as a qualifier
link = true,
fallback = "ทะเล",
},
["inner city area"] = {
link = "[[inner city]] [[area]]",
fallback = "neighborhood",
},
["เกาะ"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["island country"] = {
-- FIXME: The following should map to both 'island' and 'country'.
link = "w",
fallback = "ประเทศ",
},
["island group"] = {
link = "separately",
fallback = "เกาะ",
},
["island municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["islet"] = {
link = "w",
fallback = "เกาะ",
},
["Israeli settlement"] = {
link = "w",
class = "settlement",
default = {true},
},
["judicial capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["khanate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["kibbutz"] = {
link = true,
plural = "kibbutzim",
class = "non-admin settlement",
default = {true},
},
["kingdom"] = {
link = true,
fallback = "monarchy",
},
["krai"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ทะเลสาบ"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["ธรณีสัณฐาน!"] = {
category_link = "[[ธรณีสัณฐาน]]",
bare_category_parent = "สถานที่",
addl_bare_category_parents = {"โลก"},
},
["largest city"] = {
link = "[[large]]st [[city]]",
entry_placetype_use_the = true,
fallback = "นคร",
has_neighborhoods = true,
},
["league"] = {
link = true,
fallback = "confederation",
},
["legislative capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["library"] = {
link = true,
fallback = "building",
},
["lieutenancy area"] = {
-- used in the United Kingdom; per Wikipedia:
-- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does
-- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of
-- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate
-- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on
-- those used for lieutenancy and local government between 1974 and 1996. The lieutenancy areas of Northern
-- Ireland correspond to the six counties and two former county boroughs.[3]
link = "w",
fallback = "ceremonial county",
},
["local authority district"] = {
link = "w",
fallback = "local government district",
},
["local government area"] = {
-- Australia
link = "w",
preposition = "ของ",
class = "subpolity",
},
["local council"] = {
-- Malta; similar to municipalities
link = "+w:local councils of Malta",
preposition = "ของ",
fallback = "เทศบาล",
},
["local government district"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local government district with borough status"] = {
link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]",
plural = "local government districts with borough status",
plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local urban district"] = {
link = "w",
fallback = "unincorporated community",
},
["locality"] = {
link = "+w:locality (settlement)",
-- not necessarily true, but usually is the case
fallback = "village",
},
["London borough"] = {
link = "w",
preposition = "ของ",
affix_type = "pref",
affix = "borough",
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["macroregion"] = {
link = true,
fallback = "ภูมิภาค",
},
["man-made structures!"] = {
category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s",
bare_category_parent = "สถานที่",
},
["manor"] = {
-- FIXME: or is this more like a farm?
link = true,
fallback = "building",
},
["marginal sea"] = {
link = true,
preposition = "ของ",
fallback = "ทะเล",
},
["market city"] = {
link = "+market town",
fallback = "นคร",
},
["market town"] = {
link = true,
fallback = "เมือง",
},
["massif"] = {
link = true,
fallback = "ภูเขา",
},
["megacity"] = {
link = true,
fallback = "นคร",
},
["metro station"] = {
link = true,
class = "man-made structure",
},
["metropolitan borough"] = {
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"borough", "นคร"},
fallback = "local government district",
has_neighborhoods = true,
},
["metropolitan city"] = {
-- These exist e.g. in Italy and are more like municipalities or even provinces than cities.
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"metropolitan", "นคร"},
class = "subpolity",
},
["metropolitan county"] = {
link = true,
fallback = "เทศมณฑล",
},
["metropolitan municipality"] = {
-- In South Africa, metropolitan municipalities group local municipalities and are like districts, between
-- provinces and municipalities.
-- In Turkey, metropolitan municipalities are province-level.
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"metropolitan", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["microdistrict"] = {
-- residential complex in post-Soviet states
link = true,
fallback = "neighborhood",
},
["micronations!"] = {
-- FIXME, merge with microstate
category_link = "[[micronation]]s",
bare_category_parent = "ประเทศ",
},
["microstate"] = {
link = true,
fallback = "ประเทศ",
},
["military base"] = {
link = "w",
class = "settlement", -- or "man-made structure"?
default = {true},
},
["minster town"] = {
-- England
link = "separately",
fallback = "เมือง",
},
["monarchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["moor"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"},
default = {true},
},
["moorland"] = {
link = true,
fallback = "moor",
},
["motorway"] = {
link = true,
fallback = "road",
},
["ภูเขา"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["mountain indigenous district"] = {
-- Taiwan
link = "+w:district (Taiwan)",
fallback = "อำเภอ",
},
["mountain indigenous township"] = {
-- Taiwan
link = "+w:township (Taiwan)",
fallback = "township",
},
["mountain pass"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "mountain passes",
class = "natural feature",
addl_bare_category_parents = {"ภูเขา"},
default = {true},
},
["เทือกเขา"] = {
link = true,
fallback = "ภูเขา",
},
["mountainous region"] = {
link = "separately",
fallback = "ภูมิภาค",
},
["mukim"] = {
-- Malaysia, Brunei, Indonesia, Singapore
link = true,
preposition = "ของ",
class = "subpolity",
},
["municipal district"] = {
link = "w",
-- meaning varies depending on the country; for now, assume no neighborhoods.
-- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms.
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "เทศบาล",
},
["เทศบาล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true,
class = "subpolity",
},
["municipality with city status"] = {
link = "[[municipality]] with [[w:city status|city status]]",
plural = "municipalities with city status",
plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]",
fallback = "เทศบาล",
},
["museum"] = {
link = true,
fallback = "building",
},
["mythological location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["named bridges!"] = {
category_link = "notable [[bridge]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"bridges"},
},
["named buildings!"] = {
category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"buildings"},
},
["named roads!"] = {
category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"roads"},
},
["national capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["national park"] = {
link = true,
fallback = "park",
},
["natural features!"] = {
category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s",
bare_category_parent = "สถานที่",
},
["neighborhood"] = {
-- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which
-- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the
-- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the
-- category name. It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods".
-- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also
-- categorize as neighbo(u)rhoods.)
link = true,
-- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]].
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
-- The following text is suitable for the top-level description of a neighborhood as well as categories of the
-- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form
-- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]".
category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions",
-- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`,
-- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings
-- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in
-- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY`
-- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but
-- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.)
preposition = "ของ",
class = "non-admin settlement",
cat_handler = district_neighborhood_cat_handler,
},
["neighbourhood"] = {
link = true,
category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions",
fallback = "neighborhood",
},
["new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
preposition = "ใน",
class = "subpolity", --?
},
["new town"] = {
link = true,
fallback = "เมือง",
},
["เมืองหลวงที่ไม่ใช่นคร"] = {
link = "[[เมืองหลวง]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
cat_handler = function(data)
return capital_city_cat_handler(data, "non-city")
end,
-- FIXME, do we need the following?
default = {true},
},
["non-metropolitan county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["non-metropolitan district"] = {
link = "w",
fallback = "local government district",
},
["non-sovereign kingdom"] = {
-- especially in Africa and Asia
link = "+w:non-sovereign monarchy",
generic_before_non_cities = "ใน",
class = "subpolity",
["country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["non-sovereign monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["oblasts and autonomous republics!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Ukraine.
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s",
class = "subpolity",
},
["มหาสมุทร"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"ทะเล", "bodies of water"},
default = {true},
},
["okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["overseas collectivity"] = {
link = "w",
fallback = "collectivity",
},
["overseas department"] = {
link = "w",
fallback = "department",
},
["overseas territory"] = {
link = "w",
fallback = "dependent territory",
},
["parish"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["parish municipality"] = {
-- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them.
link = "+w:parish municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true,
},
["parish seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
class = "capital",
has_neighborhoods = true,
},
["park"] = {
link = true,
class = "man-made structure",
default = {true},
},
["pass"] = {
link = "+mountain pass",
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "passes",
fallback = "mountain pass",
},
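--[==[
Illustrative sketch only, not part of this module's logic. Many entries in this table, such as `pass` above,
carry no `class` of their own and instead name another placetype in `fallback`. The snippet below shows one
plausible way to follow such a chain until an entry with a `class` is reached. The function `resolve_class` and
the miniature `placetypes` table are hypothetical; the real lookup code is elsewhere and may behave differently.
Fallbacks that carry a qualifier, such as "FORMER geographic region", are not modelled.

```lua
-- Hypothetical miniature of the table; only `class` and `fallback` are modelled.
local placetypes = {
	["mountain pass"] = { class = "natural feature" },
	["pass"] = { fallback = "mountain pass" },
	["moor"] = { class = "natural feature" },
	["heath"] = { fallback = "moor" },
	["moorland"] = { fallback = "moor" },
}

-- Follow `fallback` links until an entry with `class` is found.
-- Returns the resolved key and its class, or nil if the chain is broken or cyclic.
local function resolve_class(key)
	local seen = {}
	while key and not seen[key] do
		seen[key] = true
		local entry = placetypes[key]
		if not entry then
			return nil
		end
		if entry.class then
			return key, entry.class
		end
		key = entry.fallback
	end
	return nil
end

print(resolve_class("pass"))
print(resolve_class("heath"))
```

The `seen` set guards against accidental cycles, which matters in a table this large where fallbacks are
maintained by hand.
]==]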
["path"] = {
link = true,
fallback = "road",
},
["peak"] = {
link = true,
fallback = "ภูเขา",
},
["peninsula"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["periphery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["สถานที่!"] = {
generic_before_non_cities = "ใน",
generic_before_cities = "ใน",
class = "generic place",
category_link = "[[place]]s of all sorts",
-- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and
-- language-specific variants such as [[Category:en:Places]]. The actual text for a language-specific variant is
-- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of"
-- portion is automatically generated by the appropriate handler in
-- [[Module:category tree/topic cat/data/Places]].
category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s",
bare_category_parent = "ชื่อ (หัวข้อ)",
},
["planned community"] = {
-- Include this so we don't categorize 'planned community' into villages, as 'community' does.
link = true,
class = "settlement",
has_neighborhoods = true,
},
["plateau"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
-- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category
},
["Polish colony"] = {
link = "[[w:colony (Poland)|colony]]",
affix_type = "suf",
affix = "colony",
fallback = "village",
has_neighborhoods = true,
},
["political divisions!"] = {
category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s",
bare_category_parent = "สถานที่",
},
["องค์การทางการเมือง"] = {
link = true,
category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]",
class = "polity", -- do not translate the class value
bare_category_parent = "สถานที่",
default = {true},
},
["populated place"] = {
link = "+w:populated place",
-- not necessarily true, but usually is the case
fallback = "village",
},
["port"] = {
link = true,
class = "man-made structure",
default = {true},
},
["port city"] = {
-- FIXME: should categorize into "Ports" as well as "นคร"
link = true,
fallback = "นคร",
},
["port town"] = {
-- FIXME: should categorize into "Ports" as well as "เมือง"
link = "w",
fallback = "เมือง",
},
["prefecture"] = {
-- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France.
-- May need `has_neighborhoods` to be a function.
link = true,
preposition = "ของ",
display_handler = prefecture_display_handler,
class = "subpolity",
},
["prefecture-level city"] = {
-- China; they are huge entities with a central city; not cities themselves.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["preserved county"] = {
-- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more
-- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22.
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["primary area"] = {
-- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden
link = "+w:sv:primärområde",
fallback = "neighborhood",
},
["principality"] = {
link = true,
fallback = "monarchy",
},
["promontory"] = {
link = true,
fallback = "headland",
},
["protectorate"] = {
link = true,
fallback = "dependent territory",
},
["จังหวัด"] = {
link = true,
preposition = "ของ",
display_handler = province_display_handler,
class = "subpolity",
},
["provinces and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case China.
category_link = "[[province]]s and [[autonomous region]]s",
class = "subpolity",
},
["provinces and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Canada and Pakistan.
category_link = "[[province]]s and [[territory|territories]]",
class = "subpolity",
},
["provincial capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["raion"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ranch"] = {
link = true,
fallback = "farm",
},
["range"] = {
-- FIXME: Where is this used? Is it a mountain range?
link = true,
holonym_use_the = true,
class = "natural feature",
},
["regency"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["ภูมิภาค"] = {
link = true,
preposition = "ของ",
-- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area'
fallback = "geographic and cultural area",
-- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region
class = "geographic region",
},
["regional capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["regional county municipality"] = {
-- Quebec
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"เทศบาล", "เทศมณฑล"},
fallback = "เทศบาล",
},
["regional district"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "อำเภอ",
},
["regional municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
},
["regional unit"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["registration county"] = {
-- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical
-- purposes (registration of births, deaths and marriages, and for the output of census information).
link = "w",
fallback = "เทศมณฑล",
},
["republic"] = {
-- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case.
link = true,
fallback = "constituent republic",
},
["research base"] = {
link = "+w:research station",
fallback = "research station",
},
["research station"] = {
link = "w",
class = "non-admin settlement", -- or "man-made structure"?
default = {true},
},
["reservoir"] = {
link = true,
fallback = "ทะเลสาบ",
},
["residential area"] = {
link = "separately",
fallback = "neighborhood",
},
["resort city"] = {
link = "w",
fallback = "นคร",
},
["resort town"] = {
link = "w",
fallback = "เมือง",
},
["แม่น้ำ"] = {
link = true,
generic_before_non_cities = "ใน",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
cat_handler = city_type_cat_handler,
["continent/*"] = {true},
default = {true},
},
["river island"] = {
link = "w",
fallback = "เกาะ",
},
["road"] = {
link = true,
class = "man-made structure",
default = {"Named roads"},
},
["Roman province"] = {
-- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire'
link = "w",
default = {"Provinces of the Roman Empire"},
class = "subpolity",
},
["royal borough"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"royal", "borough"},
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["royal burgh"] = {
link = true,
fallback = "borough",
},
["royal capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["rural committee"] = {
-- Hong Kong; a group of villages
link = "w",
affix_type = "Suf",
has_neighborhoods = true,
class = "settlement",
},
["rural community"] = {
-- New Brunswick
link = "+w:list of municipalities in New_Brunswick#Rural communities",
fallback = "เทศบาล",
},
["rural hromada"] = {
link = "[[rural]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["rural municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["rural township"] = {
-- Taiwan
link = "+w:rural township (Taiwan)",
fallback = "township",
},
["sanctuary"] = {
link = true,
fallback = "temple",
},
["satrapy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["ทะเล"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["seaport"] = {
link = true,
fallback = "port",
},
["seat"] = {
link = true,
fallback = "administrative centre",
},
["self-administered area"] = {
-- Myanmar (groups self-administered divisions and zones)
link = "+w:self-administered zone",
preposition = "ของ",
class = "subpolity",
},
["self-administered division"] = {
-- Myanmar (only one of them: Wa Self-Administered Division)
link = "w",
fallback = "self-administered area",
},
["self-administered zone"] = {
-- Myanmar (five of them)
link = "w",
fallback = "self-administered area",
},
["separatist state"] = {
link = "separately",
fallback = "unrecognized country",
},
["การตั้งถิ่นฐาน"] = {
link = true,
category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s",
bare_category_parent = "สถานที่",
-- not necessarily true, but usually is the case
fallback = "village",
},
["settlement hromada"] = {
link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["sheading"] = {
-- Isle of Man
link = true,
fallback = "อำเภอ",
},
["sheep station"] = {
-- Australia
link = true,
fallback = "farm",
},
["shire"] = {
link = true,
fallback = "เทศมณฑล",
},
["shire county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["shire town"] = {
link = true,
fallback = "county seat",
},
["ski resort city"] = {
link = "[[ski resort]] [[city]]",
fallback = "นคร",
},
["ski resort town"] = {
link = "[[ski resort]] [[town]]",
fallback = "เมือง",
},
["spa city"] = {
link = "+w:spa town",
fallback = "นคร",
},
["spa town"] = {
link = "w",
fallback = "เมือง",
},
["space station"] = {
link = true,
fallback = "research station",
},
["special administrative region"] = {
-- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a
-- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia
-- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special
-- administrative areas"
link = "+w:special administrative regions of China",
preposition = "ของ",
class = "subpolity",
has_neighborhoods = true, --?
-- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves
-- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing
suffix = "",
},
["special collectivity"] = {
link = "w",
fallback = "collectivity",
},
["special municipality"] = {
-- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands
link = "w",
fallback = "เทศบาล",
},
["special ward"] = {
-- Tokyo
link = true,
fallback = "เทศบาล",
},
["spit"] = {
link = true,
fallback = "peninsula",
},
["spring"] = {
link = true,
class = "natural feature",
default = {true},
},
["star"] = {
link = true,
class = "natural feature",
default = {true},
},
["รัฐ"] = {
link = true,
preposition = "ของ",
class = "subpolity",
-- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign
-- entity. The latter appears more common (e.g. in various "ancient states" of East Asia).
former_type = "องค์การทางการเมือง",
},
["states and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Australia.
category_link = "[[state]]s and [[territory|territories]]",
class = "subpolity",
},
["states and union territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case India.
category_link = "[[state]]s and [[union territory|union territories]]",
class = "subpolity",
},
["state capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["state park"] = {
link = true,
fallback = "park",
},
["state-level new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
fallback = "new area",
},
["statistical region"] = {
-- Slovenia
link = true,
fallback = "administrative region",
},
["statutory city"] = {
link = "w",
fallback = "นคร",
},
["statutory town"] = {
link = "w",
fallback = "เมือง",
},
["strait"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["stream"] = {
link = true,
fallback = "แม่น้ำ",
},
["street"] = {
link = true,
fallback = "road",
},
["strip"] = {
link = true,
fallback = "geographic region",
},
["strip of land"] = {
link = "[[strip]] of [[land]]",
plural = "strips of land",
plural_link = "[[strip]]s of [[land]]",
fallback = "geographic region",
},
["sub-metropolitan city"] = {
link = "+w:List of cities in Nepal#Sub-metropolitan cities",
fallback = "นคร",
},
["sub-prefectural city"] = {
link = "w",
fallback = "subprovincial city",
},
["ตำบล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true, --?
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler
class = "subpolity",
default = {true},
},
["subdivision"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- FIXME: subdivisions can be neighborhood-like or larger; need a handler
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
},
["submerged ghost town"] = {
-- FIXME: Consider just having "submerged" as a qualifier.
link = "[[submerged]] [[ghost town]]",
fallback = "ghost town",
},
["subnational kingdom"] = {
link = "+w:subnational monarchy",
fallback = "non-sovereign kingdom",
},
["subnational monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["subprefecture"] = {
link = true,
affix_type = "suf",
preposition = "ของ",
class = "subpolity",
},
["subprovince"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["subprovincial city"] = {
link = "w",
-- China; special status given to certain prefecture-level cities
fallback = "prefecture-level city",
},
["subprovincial district"] = {
link = "w",
-- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts
preposition = "ของ",
class = "subpolity",
},
["subregion"] = {
link = true,
fallback = "geographic region",
},
["suburb"] = {
link = true,
-- The following text is suitable for the top-level description of a suburb as well as categories of the form
-- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago',
-- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]".
category_link = "[[suburb]]s of [[city|cities]]",
category_link_before_city = "[[suburb]]s",
-- See comments under "neighborhood" for the following three settings. They are used by
-- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories
-- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a
-- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.)
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
preposition = "ของ",
has_neighborhoods = true, --?
class = "non-admin settlement", --?
cat_handler = district_neighborhood_cat_handler,
},
["suburban area"] = {
link = "w",
fallback = "suburb",
},
["subway station"] = {
link = "w",
fallback = "metro station",
},
["sum"] = {
-- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia),
-- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion).
link = "+w:sum (administrative division)",
-- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler
-- which we don't want to be active (FIXME: If the display handler were active, that would be a bug).
fallback = "division",
},
["supercontinent"] = {
link = true,
fallback = "continent",
},
["tehsil"] = {
link = true,
affix_type = "suf",
no_affix_strings = {"tehsil", "tahsil"},
class = "subpolity",
},
["temple"] = {
link = true,
fallback = "building",
},
["territorial authority"] = {
link = "w",
fallback = "อำเภอ",
},
["ดินแดน"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["theme"] = {
link = "+w:theme (Byzantine district)",
preposition = "ของ",
class = "subpolity",
},
["เมือง"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["town with bystatus"] = {
-- can't use templates in links currently
link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]",
plural = "towns with bystatus",
plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]",
fallback = "เมือง",
},
["township"] = {
link = true,
has_neighborhoods = true,
class = "settlement", --?
default = {true},
},
["township municipality"] = {
-- Quebec
link = "+w:township municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["traditional county"] = {
link = true,
fallback = "เทศมณฑล",
},
["traditional region"] = {
-- FIXME: Verify this works. Same for 'historic(al) region'.
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["trail"] = {
link = true,
fallback = "road",
},
["treaty port"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"FORMER"},
},
["tributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["underground station"] = {
link = "w",
fallback = "metro station",
},
["unincorporated area"] = {
link = "w",
-- I don't know if this fallback makes sense everywhere.
fallback = "unincorporated community",
},
["unincorporated community"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
},
["unincorporated territory"] = {
link = "w",
fallback = "ดินแดน",
},
["union territory"] = {
-- India
link = true,
preposition = "ของ",
entry_placetype_indefinite_article = "a",
class = "subpolity",
},
["unitary authority"] = {
-- UK, New Zealand
link = true,
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["unitary district"] = {
link = "w",
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["united township municipality"] = {
-- Quebec
link = "+w:united township municipality (Quebec)",
entry_placetype_indefinite_article = "a",
fallback = "township municipality",
has_neighborhoods = true, --?
},
["university"] = {
link = true,
entry_placetype_indefinite_article = "a",
class = "man-made structure",
default = {true},
},
["unrecognised country"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized and nearly unrecognized countries!"] = {
category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}",
bare_category_parent = "country-like entities",
},
["unrecognized country"] = {
link = "w",
class = "polity", -- do not translate class
default = {"Unrecognized and nearly unrecognized countries"},
},
["unrecognised state"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized state"] = {
link = "w",
fallback = "unrecognized country",
},
["urban area"] = {
link = "separately",
fallback = "neighborhood",
},
["urban hromada"] = {
link = "[[urban]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["urban service area"] = {
-- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger
-- cities and treated as equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]].)
link = "w",
fallback = "นคร",
},
["urban township"] = {
link = "w",
fallback = "township",
},
["urban-type settlement"] = {
-- Appears to be a particular type of small urban settlement in post-Soviet states
-- that had an administrative function.
link = "w",
fallback = "เมือง",
},
["valley"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "water"},
default = {true},
},
["viceroyalty"] = {
-- in essence, a type of colony
link = true,
fallback = "dependent territory",
},
["village"] = {
link = true,
generic_before_non_cities = "ใน",
category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s",
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["village development committee"] = {
-- former administrative structure in Nepal; also exists in India but not as a formal unit
link = "+w:village development committee (Nepal)",
inherently_former = {"FORMER"},
fallback = "village",
},
["village municipality"] = {
-- Quebec
link = "+w:village municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["voivodeship"] = {
-- Poland
link = true,
display_handler = voivodeship_display_handler,
preposition = "ของ",
class = "subpolity",
},
["volcano"] = {
link = true,
plural = "volcanoes",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true, "ภูเขา"},
},
["ward"] = {
link = true,
class = "settlement",
-- Wards are formal administrative divisions of a city but have some properties of neighborhoods.
fallback = "neighborhood",
},
["watercourse"] = {
link = true,
fallback = "channel",
},
["Welsh community"] = {
-- Wales
link = "[[w:community (Wales)|community]]",
preposition = "ของ",
affix_type = "suf",
affix = "community",
has_neighborhoods = true,
class = "settlement",
},
["zone"] = {
-- administrative division of Ethiopia, Qatar, Nepal, India
link = "+w:zone#Place names",
preposition = "ของ",
class = "subpolity",
},
----------------------------------------------------------------------------------------------
-- Categories for former places --
----------------------------------------------------------------------------------------------
["ANCIENT capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
-- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still
-- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category.
default = {"Ancient settlements", "Former capitals"},
},
["ANCIENT non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "ANCIENT settlement",
},
["ANCIENT settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Ancient settlements"},
},
["ancient settlements!"] = {
category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]",
bare_category_parent = "former settlements",
},
["FORMER capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
default = {"Former capitals"},
},
["former capitals!"] = {
category_link = "former [[capital]] [[city|cities]] and [[town]]s",
bare_category_parent = "การตั้งถิ่นฐาน",
},
["former counties and county-level cities!"] = {
-- For categorizing former counties and county-level cities of China
category_link = "no-longer-existing [[county|counties]] and [[county-level city|county-level cities]]",
bare_category_breadcrumb = "counties and county-level cities",
bare_category_parent = "former political divisions",
},
["FORMER county"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER county-level city"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["former countries and country-like entities!"] = {
category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist",
bare_category_breadcrumb = "countries and country-like entities",
bare_category_parent = "former polities",
},
["FORMER country"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former countries and country-like entities"},
},
["former dependent territories!"] = {
category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist",
bare_category_breadcrumb = "dependent territories",
bare_category_parent = "former political divisions",
},
["FORMER dependent territory"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former dependent territories"},
},
["former districts!"] = {
-- For categorizing former districts of China
category_link = "no-longer-existing [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "former political divisions",
},
["FORMER district"] = {
-- For categorizing former districts of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER geographic region"] = {
link = false,
fallback = "geographic and cultural area",
},
["FORMER man-made structure"] = {
link = false,
class = "man-made structure",
default = {"Former man-made structures"},
},
["former man-made structures!"] = {
category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist",
bare_category_breadcrumb = "man-made structures",
bare_category_parent = "former places",
},
["former municipalities!"] = {
-- For categorizing former municipalities of the Netherlands
category_link = "no-longer-existing [[municipality|municipalities]]",
bare_category_breadcrumb = "เทศบาล",
bare_category_parent = "former political divisions",
},
["FORMER municipality"] = {
-- For categorizing former municipalities of the Netherlands
link = false,
fallback = "FORMER subpolity",
},
["FORMER natural feature"] = {
link = false,
class = "natural feature",
default = {"Former natural features"},
},
["former natural features!"] = {
category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist",
bare_category_breadcrumb = "natural features",
bare_category_parent = "former places",
},
["FORMER non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "FORMER settlement",
},
["former places!"] = {
category_link = "[[place]]s of all sorts that no longer exist",
bare_category_breadcrumb = "former",
bare_category_parent = "สถานที่",
},
["former political divisions!"] = {
category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former places",
},
["former polities!"] = {
category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former places",
},
["FORMER polity"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former polities"},
},
["former prefectures!"] = {
-- For categorizing former prefectures of China
category_link = "no-longer-existing [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "former political divisions",
},
["FORMER prefecture"] = {
-- For categorizing former prefectures of China
link = false,
fallback = "FORMER subpolity",
},
["former provinces!"] = {
-- For categorizing former provinces of China, etc.
category_link = "no-longer-existing [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "former political divisions",
},
["FORMER province"] = {
-- For categorizing ancient/historical/former provinces of the Roman Empire
link = false,
fallback = "FORMER subpolity",
},
["former region"] = {
-- A former region is considered a former political division, but not a 'historical/traditional/etc.' region.
link = "separately",
preposition = "ของ",
inherently_former = {"FORMER"},
class = "subpolity",
},
["FORMER settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Former settlements"},
},
["former settlements!"] = {
category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former political divisions",
},
["FORMER subpolity"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former political divisions"},
},
----------------------------------------------------------------------------------------------
-- form-of categories --
----------------------------------------------------------------------------------------------
---------- Abbreviations ----------
["abbreviations of counties!"] = {
-- For categorizing abbreviations of counties of e.g. England
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]",
bare_category_breadcrumb = "เทศมณฑล",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of places",
},
["abbreviations of departments!"] = {
-- For categorizing abbreviations of departments of e.g. France
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s",
bare_category_breadcrumb = "departments",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of districts!"] = {
-- For categorizing abbreviations of districts of e.g. ???
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of divisions!"] = {
-- For categorizing abbreviations of divisions of e.g. Bangladesh
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s",
bare_category_breadcrumb = "divisions",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of former countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of former places",
},
["abbreviations of former places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}},
},
["abbreviations of places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "สถานที่",
},
["abbreviations of political divisions!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "abbreviations of places",
},
["abbreviations of prefectures!"] = {
-- For categorizing abbreviations of prefectures of e.g. Japan
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces!"] = {
-- For categorizing abbreviations of provinces of e.g. Canada
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]",
bare_category_breadcrumb = "provinces and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of regions!"] = {
-- For categorizing abbreviations of regions of e.g. Italy
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s",
bare_category_breadcrumb = "ภูมิภาค",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states!"] = {
-- For categorizing abbreviations of states of e.g. the United States
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]",
bare_category_breadcrumb = "states and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and union territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]",
bare_category_breadcrumb = "states and union territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]",
bare_category_breadcrumb = "ดินแดน",
bare_category_parent = "abbreviations of political divisions",
},
["ABBREVIATION_OF country"] = {
link = false,
default = {"Abbreviations of countries"},
},
["ABBREVIATION_OF county"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF department"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF district"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF division"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF FORMER country"] = {
link = false,
default = {"Abbreviations of former countries"},
},
["ABBREVIATION_OF FORMER place"] = {
link = false,
default = {"Abbreviations of former places"},
},
["ABBREVIATION_OF place"] = {
link = false,
default = {"Abbreviations of places"},
},
["ABBREVIATION_OF prefecture"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF province"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF region"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF state"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF subpolity"] = {
link = false,
default = {"Abbreviations of political divisions"},
},
["ABBREVIATION_OF territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF union territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
---------- Archaic forms ----------
["archaic forms of places!"] = {
full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "archaic forms",
bare_category_parent = "สถานที่",
},
["ARCHAIC_FORM_OF place"] = {
link = false,
default = {"Archaic forms of places"},
},
---------- Clippings ----------
["clippings of places!"] = {
full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "clippings",
bare_category_parent = "สถานที่",
},
["CLIPPING_OF place"] = {
link = false,
default = {"Clippings of places"},
},
---------- Dated forms ----------
["dated forms of places!"] = {
full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "dated forms",
bare_category_parent = "สถานที่",
},
["DATED_FORM_OF place"] = {
link = false,
default = {"Dated forms of places"},
},
---------- Derogatory names ----------
["derogatory names for cities!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["derogatory names for continents!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for continents"},
},
["derogatory names for countries!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for countries"},
},
["derogatory names for places!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s",
bare_category_breadcrumb = "derogatory names",
bare_category_parent = "nicknames for places",
},
["derogatory names for states!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for states"},
},
["DEROGATORY_NAME_FOR capital"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR city"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR continent"] = {
link = false,
default = {"Derogatory names for continents"},
},
["DEROGATORY_NAME_FOR country"] = {
link = false,
default = {"Derogatory names for countries"},
},
["DEROGATORY_NAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR place"] = {
link = false,
default = {"Derogatory names for places"},
},
["DEROGATORY_NAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR state"] = {
link = false,
default = {"Derogatory names for states"},
},
["DEROGATORY_NAME_FOR town"] = {
link = false,
default = {"Derogatory names for cities"},
},
---------- Ellipses ----------
["ellipses of places!"] = {
full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s",
bare_category_breadcrumb = "ellipses",
bare_category_parent = "สถานที่",
},
["ELLIPSIS_OF place"] = {
link = false,
default = {"Ellipses of places"},
},
---------- Former long-form names ----------
["former long-form names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former long-form names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}},
},
["former long-form names of places!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form",
bare_category_parent = "former names of places",
},
["FORMER_LONG_FORM_OF country"] = {
link = false,
default = {"Former long-form names of countries"},
},
["FORMER_LONG_FORM_OF place"] = {
link = false,
default = {"Former long-form names of places"},
},
---------- Former names ----------
["former names of capitals!"] = {
full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name",
bare_category_breadcrumb = "capitals",
bare_category_parent = "former names of settlements",
},
["former names of countries!"] = {
full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former names of places",
},
["former names of places!"] = {
full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name",
bare_category_breadcrumb = "former names",
bare_category_parent = "สถานที่",
},
["former names of political divisions!"] = {
full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) that generally still exist but under a different name",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former names of places",
},
["former names of polities!"] = {
full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former names of places",
},
["former names of settlements!"] = {
full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former names of political divisions",
},
["FORMER_NAME_OF capital"] = {
link = false,
default = {"Former names of capitals"},
},
["FORMER_NAME_OF country"] = {
link = false,
default = {"Former names of countries"},
},
["FORMER_NAME_OF place"] = {
link = false,
default = {"Former names of places"},
},
["FORMER_NAME_OF polity"] = {
link = false,
default = {"Former names of polities"},
},
["FORMER_NAME_OF region"] = {
link = false,
fallback = "FORMER_NAME_OF subpolity",
},
["FORMER_NAME_OF settlement"] = {
link = false,
default = {"Former names of settlements"},
},
["FORMER_NAME_OF subpolity"] = {
link = false,
default = {"Former names of political divisions"},
},
---------- Former nicknames ----------
["former nicknames for cities!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})",
bare_category_breadcrumb = "นคร",
bare_category_parent = "former nicknames for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["former nicknames for places!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "former",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}},
},
["FORMER_NICKNAME_FOR capital"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR city"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR place"] = {
link = false,
default = {"Former nicknames for places"},
},
["FORMER_NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR town"] = {
link = false,
default = {"Former nicknames for cities"},
},
---------- Former official names ----------
["former official names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former official names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "official"}},
},
["former official names of places!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "former names of places",
},
["FORMER_OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Former official names of countries"},
},
["FORMER_OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Former official names of places"},
},
---------- Long-form names ----------
["long-form names of countries!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "long-form names of places",
},
["long-form names of places!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form names",
bare_category_parent = "สถานที่",
},
["LONG_FORM_OF country"] = {
link = false,
default = {"Long-form names of countries"},
},
["LONG_FORM_OF place"] = {
link = false,
default = {"Long-form names of places"},
},
---------- Nicknames ----------
["nicknames for cities!"] = {
full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"นคร"},
},
["nicknames for continents!"] = {
full_category_link = "[[nickname]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ทวีป"},
},
["nicknames for countries!"] = {
full_category_link = "[[nickname]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ประเทศ"},
},
["nicknames for places!"] = {
full_category_link = "[[nickname]]s for [[place]]s",
bare_category_breadcrumb = "สถานที่",
bare_category_parent = "nicknames",
addl_bare_category_parents = {"สถานที่"},
},
["nicknames for states!"] = {
-- For categorizing nicknames for states of e.g. the United States
full_category_link = "[[nickname]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"รัฐ"},
},
["NICKNAME_FOR capital"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR city"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR continent"] = {
link = false,
default = {"Nicknames for continents"},
},
["NICKNAME_FOR country"] = {
link = false,
default = {"Nicknames for countries"},
},
["NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR place"] = {
link = false,
default = {"Nicknames for places"},
},
["NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR state"] = {
link = false,
default = {"Nicknames for states"},
},
["NICKNAME_FOR town"] = {
link = false,
default = {"Nicknames for cities"},
},
---------- Obsolete forms ----------
["obsolete forms of places!"] = {
full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "obsolete forms",
bare_category_parent = "สถานที่",
},
["OBSOLETE_FORM_OF place"] = {
link = false,
default = {"Obsolete forms of places"},
},
---------- Official names ----------
["official names of countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of places",
},
["official names of former countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of former places",
},
["official names of former places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "official names",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "official names of places", sort = "former"}},
},
["official names of places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official names",
bare_category_parent = "สถานที่",
},
["OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Official names of countries"},
},
["OFFICIAL_NAME_OF FORMER country"] = {
link = false,
default = {"Official names of former countries"},
},
["OFFICIAL_NAME_OF FORMER place"] = {
link = false,
default = {"Official names of former places"},
},
["OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Official names of places"},
},
---------- Official nicknames ----------
["official nicknames for places!"] = {
full_category_link = "[[official]] [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for places",
},
["official nicknames for states!"] = {
-- For categorizing official nicknames for states of e.g. the United States
full_category_link = "[[official]] [[nickname]]s for [[state]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for states",
addl_bare_category_parents = {"รัฐ"},
},
["OFFICIAL_NICKNAME_FOR place"] = {
link = false,
default = {"Official nicknames for places"},
},
["OFFICIAL_NICKNAME_FOR state"] = {
link = false,
default = {"Official nicknames for states"},
},
}
export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
if spec.plural then
export.plural_placetype_to_singular[spec.plural] = sg_placetype
end
end
return export
ppnv7vnj763b7rrbevfdvo0t7mfqorv
5720699
5720689
2026-04-21T01:47:14Z
OctraBot
3198
5720699
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true for testing
local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split
--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must
currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
the process correctly handles placetypes with qualifiers (including qualifiers that act similarly to "type-raising"
operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
containers of the known location being considered. This is done to prevent overcategorizing when either there are two
known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).
Both of these functions are invoked repeatedly, probably several times on the same inputs, and are therefore
candidates for memoization to speed up the operation of {{tl|place}}.
]==]
------------------------------------------------------------------------------------------
-- Basic utilities --
------------------------------------------------------------------------------------------
--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
return export.force_cat or m_locations.force_cat
end
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
function export.remove_links_and_html(text)
text = m_links.remove_links(text)
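-- Parenthesize so that only the string is returned, not gsub()'s second return value (the substitution count).
return (text:gsub("<.->", ""))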
end
--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the design is to invoke the regular
singularization algorithm in [[Module:en-utilities]], which reverses the y -> ies change after consonants and the 'es'
addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural
'passes'; FIXME: consider changing this for words ending in '-sses'). If the generated singular is the same as the
passed-in value, nil is returned. NOTE: In this copy of the module the call into [[Module:en-utilities]] is commented
out, so only plurals listed in `placetype_data` are singularized and every other input returns nil.
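The table lookup described above can be sketched in isolation as follows. This is a minimal sketch rather than the
module's own code: the two-entry `placetype_data` and the name `maybe_singularize` are hypothetical, and the
[[Module:en-utilities]] fallback is deliberately left out.

```lua
-- Hypothetical sample data; the real `placetype_data` is far larger.
local placetype_data = {
	kibbutz = {plural = "kibbutzim"},
	city = {plural = "cities"},
}

-- Build the reverse (plural -> singular) map from the `plural` fields.
local plural_to_singular = {}
for sg_placetype, spec in pairs(placetype_data) do
	if spec.plural then
		plural_to_singular[spec.plural] = sg_placetype
	end
end

-- Return the singular for a registered plural, or nil otherwise.
local function maybe_singularize(placetype)
	if not placetype then
		return nil
	end
	return plural_to_singular[placetype]
end
```

Under these assumptions, `maybe_singularize("kibbutzim")` yields `"kibbutz"`, while a singular or unknown input yields
nil.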
]==]
function export.maybe_singularize_placetype(placetype)
if not placetype then
return nil
end
if export.plural_placetype_to_singular[placetype] then
return export.plural_placetype_to_singular[placetype]
end
local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
if retval == placetype then
return nil
end
return retval
end
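-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct. NOTE: In this copy of the module the pluralize() call is commented out, so a placetype without an
-- explicit `plural` value is returned unchanged.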
function export.pluralize_placetype(placetype, do_ucfirst)
local ptdata = export.placetype_data[placetype]
if ptdata and ptdata.plural then
placetype = ptdata.plural
else
placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
end
if do_ucfirst then
return ucfirst(placetype)
else
return placetype
end
end
--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype
under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not
match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match
that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same
as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed
from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version
of the plural passed-in `placetype`.
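The three match types can be illustrated with a small standalone sketch. The tables and the local function name below
are hypothetical stand-ins for `placetype_data`, the plural map and this function; only the lookup order follows the
description above.

```lua
-- Hypothetical sample data.
local placetype_data = {
	["city"] = {plural = "cities"},
	["nicknames for cities!"] = {},
}
local plural_to_singular = {cities = "city"}

local function get_placetype_data(placetype, from_category)
	-- 1. Direct match on the key as given.
	local ptdata = placetype_data[placetype]
	if ptdata then
		return placetype, ptdata, "direct"
	end
	-- 2. Category-only match: the key with `!` appended.
	if from_category then
		ptdata = placetype_data[placetype .. "!"]
		if ptdata then
			return placetype .. "!", ptdata, "direct-category"
		end
	end
	-- 3. Match on the singularized form of a plural.
	local sg_placetype = plural_to_singular[placetype]
	if sg_placetype and placetype_data[sg_placetype] then
		return sg_placetype, placetype_data[sg_placetype], "plural"
	end
	return nil
end
```

Note that `"nicknames for cities"` only matches when `from_category` is set; without it the lookup falls through to
the plural check and returns nil.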
]==]
function export.get_placetype_data(placetype, from_category)
local ptdata = export.placetype_data[placetype]
if ptdata then
return placetype, ptdata, "direct"
end
if from_category then
ptdata = export.placetype_data[placetype .. "!"]
if ptdata then
return placetype .. "!", ptdata, "direct-category"
end
end
local sg_placetype = export.maybe_singularize_placetype(placetype)
if sg_placetype then
ptdata = export.placetype_data[sg_placetype]
if ptdata then
return sg_placetype, ptdata, "plural"
end
end
return nil
end
--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end
function export.resolve_placetype_aliases(placetype)
return export.placetype_aliases[placetype] or placetype
end
--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
-- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in
-- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice.
placetype = export.resolve_placetype_aliases(placetype)
if export.placetype_data[placetype] then
return export.placetype_data[placetype][key]
else
return nil
end
end
--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from
left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization
of qualifiers does not happen if `no_canon_qualifiers` is specified.
For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
{nil, nil, "small beachside unincorporated community"},
{nil, "small", "beachside unincorporated community"},
{"small", "[[beachfront]]", "unincorporated community"},
{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
On the other hand, if given `"small former haunted community"`, the return value will be
{ {
{nil, nil, "small former haunted community"},
{nil, "small", "former haunted community"},
{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.
Finally, if given `"former adr"`, the return value will be
{ {
{nil, nil, "former adr"},
{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
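The three examples above can be reproduced with the following standalone sketch. It is an illustration only: the
alias and qualifier tables are hypothetical miniatures of `placetype_aliases` and `placetype_qualifiers` (with
`false` meaning "recognized but not linked" and `true` meaning "link as-is"), and the `no_canon_qualifiers` option
and table-valued qualifier entries are omitted.

```lua
-- Hypothetical miniature tables.
local placetype_aliases = {adr = "administrative region"}
local placetype_qualifiers = {
	small = false,
	former = false,
	beachside = "[[beachfront]]",
	unincorporated = true,
}

local function resolve(placetype)
	return placetype_aliases[placetype] or placetype
end

local function split_qualifiers(placetype)
	local splits = {{nil, nil, resolve(placetype)}}
	local prev_qualifier = nil
	while true do
		-- Split at the first space: candidate qualifier on the left, remainder on the right.
		local qualifier, reduced = placetype:match("^(.-) (.*)$")
		local canon = qualifier and placetype_qualifiers[qualifier]
		if canon == nil then
			break -- no space left, or the leftmost word is not a recognized qualifier
		end
		local new_qualifier = qualifier
		if canon == true then
			new_qualifier = "[[" .. qualifier .. "]]"
		elseif canon then
			new_qualifier = canon
		end
		table.insert(splits, {prev_qualifier, new_qualifier, resolve(reduced)})
		prev_qualifier = prev_qualifier and (prev_qualifier .. " " .. new_qualifier) or new_qualifier
		placetype = reduced
	end
	return splits
end
```

Because each split may hold nil in its first or second slot, callers should read the three fields by index (as with
`unpack(split, 1, 3)`) rather than rely on the length operator.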
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
local prev_qualifier = nil
while true do
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if canon == nil then
break
end
local new_qualifier = qualifier
if type(canon) == "table" then
canon = canon.link
end
if not no_canon_qualifiers and canon ~= false then
if canon == true then
new_qualifier = "[[" .. qualifier .. "]]"
else
new_qualifier = canon
end
end
insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
placetype = reduced_placetype
else
break
end
end
return splits
end
--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the
placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list
of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e. we successively chop off
qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words
not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used
to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is
an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first
entry if it exists in `placetype_data`.
'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c)
"type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of
directives, which act somewhat similarly to `former`, and allows interaction between more than one of these
simultaneously (e.g. official names of former places, which have their own categorization).
If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be
getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of
`iterate_matching_holonym_location()`.
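As a rough illustration of the memoization idea, here is a generic single-argument wrapper. It is a sketch under
stated assumptions, not something the module currently does: `memoize` is a hypothetical helper, it caches on one
string key only, and it does not cache nil results. Applying it to this function would additionally require folding
the `props` flags into the cache key.

```lua
-- Hypothetical helper: cache the result of `fn` per argument value.
local function memoize(fn)
	local cache = {}
	return function(key)
		local hit = cache[key]
		if hit == nil then
			hit = fn(key)
			cache[key] = hit
		end
		return hit
	end
end

-- Usage: the wrapped function runs once per distinct input.
local calls = 0
local cached_upper = memoize(function(s)
	calls = calls + 1
	return s:upper()
end)
```

A cached table result would be shared between callers, so any caller that mutates the returned list would need to
copy it first.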
For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="แม่น้ำ"}
```
and the return value will be
{ {
{qualifier = "left", placetype="tributary"},
{qualifier = "left", placetype="แม่น้ำ"},
}}
The algorithm first checks for the full placetype `left tributary` as a recognized placetype in `placetype_data` and
doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any
fallbacks directly after it). It then splits off the recognized qualifier `left` to form the ''reduced placetype''
`tributary`, which is entered into the list because it is found in `placetype_data`. Then, because `tributary` has a
fallback `แม่น้ำ` ("river"), which exists in `placetype_data`, the fallback is entered next.
Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or
municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
{qualifier = "small [[rural]]", placetype="frazione"},
{qualifier = "small [[rural]]", placetype="hamlet"},
{qualifier = "small [[rural]]", placetype="village"},
}}
Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers,
they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around
`rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both
fallbacks end up being returned.
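The fallback walk in the `frazione` example can be sketched on its own as follows. This is a simplified illustration
with hypothetical data and names (`with_fallbacks`, `MAX_CHAIN`); unlike the real function it ignores qualifiers and
form-of directives, and it assumes the starting placetype is present in the table.

```lua
-- Hypothetical miniature of `placetype_data`, with only the `fallback` field.
local placetype_data = {
	frazione = {fallback = "hamlet"},
	hamlet = {fallback = "village"},
	village = {},
}

local MAX_CHAIN = 10 -- hypothetical guard against a cyclic fallback chain

-- Return the placetype followed by each successive fallback.
local function with_fallbacks(placetype)
	local chain = {}
	while placetype do
		local spec = placetype_data[placetype]
		if not spec then
			error("fallback '" .. placetype .. "' is not in placetype_data")
		end
		chain[#chain + 1] = placetype
		if #chain > MAX_CHAIN then
			error("apparent loop in fallback chain: " .. table.concat(chain, " -> "))
		end
		placetype = spec.fallback
	end
	return chain
end
```

The length guard mirrors the idea of the loop check in the real code: a chain that keeps growing is treated as a
data error rather than followed forever.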
`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the
placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are
returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example,
`"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When
`no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"`
with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under
[[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].)
As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`,
because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes
rarely occur with exact match category specs anyway.
`no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an
equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in
[[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't
have qualifiers and so it doesn't make sense to try and look for them.
`from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked.
`form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked
placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the
appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a
placetype, e.g. `FORMER_NAME_OF place`, if nothing else matches.
`no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`.
`register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g.
known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It
causes the non-former version of the specified placetype to be included in the returned equivalents along with the
former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now;
fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.]
]==]
function export.get_placetype_equivs(placetype, props)
local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former
local form_of_directive
if props then
no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former =
props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category,
props.register_former_as_non_former
form_of_directive = props.form_of_directive
end
local equivs = {}
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is
-- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If
-- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by
-- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version
-- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as
-- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into
-- `equivs` the prefixed placetype. This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}},
-- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up
-- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]].
local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix)
local function insert_equiv(pt)
if form_of_prefix then
-- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have
-- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end
-- up processing because `island country` falls back to `country`), and that entry in turn is defined
-- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of
-- handling this is by calling ourselves recursively.
insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt)
else
insert(equivs, {qualifier=qualifier, placetype=pt})
end
end
-- Insert the placetype, along with any fallbacks.
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if ptdata then
insert_equiv(canon_placetype)
if no_fallback then
return
end
local first_placetype = #equivs + 1
local prev_placetype = nil
while true do
local pt_value = export.placetype_data[canon_placetype]
if not pt_value then
internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
canon_placetype, prev_placetype)
end
if pt_value.fallback then
insert_equiv(pt_value.fallback)
local last_placetype = #equivs
if last_placetype - first_placetype >= 10 then
local fallback_loop = {}
for i = first_placetype, last_placetype do
insert(fallback_loop, equivs[i].placetype)
end
internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
end
prev_placetype = canon_placetype
canon_placetype = pt_value.fallback
else
break
end
end
end
end
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
-- than in each invocation of `process_and_insert_placetype()`.
local function process_and_insert_placetype(qualifier, reduced_placetype)
if form_of_directive then
-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. All of this is handled by
-- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of
-- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for
-- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.)
insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive)
if not no_fallback then
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype)
local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or
export.get_placetype_prop(pt, "class") end
)
if not directive_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " ..
'located but directive-specific type property %s missing, and so is "class"; ' ..
"placetypes searched are %s", reduced_placetype, form_of_directive,
form_of_directive .. "_type", reduced_placetype_equivs)
else
-- This should be allowed, as we allow unrecognized placetypes in general.
end
elseif directive_type ~= "!" then
insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive)
end
end
else
insert_placetype_and_fallbacks(qualifier, reduced_placetype)
end
end
-- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left
-- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers).
local splits
if no_split_qualifiers then
splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
else
splits = export.split_qualifiers_from_placetype(placetype)
end
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3)
-- If a special "former" qualifier like `former` or `historical` isn't present, and
-- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for
-- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing
-- placetypes, and handle accordingly.
local unlinked_this_qualifier
if this_qualifier and this_qualifier:find("%[") then
unlinked_this_qualifier = export.remove_links_and_html(this_qualifier)
else
unlinked_this_qualifier = this_qualifier
end
local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil
if not former_qualifiers and not no_check_for_inherently_former then
former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype,
function(pt) return export.get_placetype_prop(pt, "inherently_former") end,
{no_check_for_inherently_former = true})
end
-- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal
-- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
-- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped
-- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval`
-- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes
-- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. If the placetype following the
-- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a
-- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like
-- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and
-- don't enter anything into `equivs`.
if former_qualifiers then
-- FIXME: Should we respect `no_fallback` here? My instinct says no.
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, {
no_check_for_inherently_former = true
})
local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, "former_type") or
export.get_placetype_prop(pt, "class") end
)
if not former_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s, placetype data located but `former_type` missing; " ..
"placetypes searched are %s", reduced_placetype, reduced_placetype_equivs)
else
-- Enable error when we've verified there aren't any examples.
track("bad-former-placetype")
track("bad-former-placetype/" .. reduced_placetype)
--process_error("For placetype '%s', unrecognized placetype following 'former'-type " ..
-- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs))
end
elseif former_type ~= "!" then
-- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible
-- for (e.g.) former provinces of the Roman empire to be categorized specially.
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype)
end
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. former_type)
end
-- HACK! See explanation above for `register_former_as_non_former`.
if register_former_as_non_former then
process_and_insert_placetype(prev_qualifier, reduced_placetype)
end
-- If we're processing a form-of directive, after doing everything else we do
-- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup.
if form_of_directive and not no_fallback then
for _, former_qualifier in ipairs(former_qualifiers) do
insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier ..
" place")
end
end
-- Don't continue processing equivs. The reason is probably the same as the `break` below for
-- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and
-- non-former equivs will otherwise take precedence.
break
end
end
-- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs
-- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping.
if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then
insert(equivs, {
qualifier=prev_qualifier,
placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier]
})
-- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the
-- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the
-- latter ends up generating the category because the category for 'mythological location' is set as
-- the default value, which is used only when no non-default category can be found.
break
end
-- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined
-- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype.
-- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts
-- the full placetype into `equivs`.
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
process_and_insert_placetype(qualifier, reduced_placetype)
-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
if no_fallback then
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
if canon_placetype then
break
end
end
end
-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
-- [[Category:en:Former names of places]] in an invocation like
-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
end
return equivs
end
function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
for _, equiv in ipairs(equivs) do
local retval = fun(equiv.placetype)
if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
return retval, equiv
end
end
return nil, nil
end
--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a
non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the
equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or
non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil},
the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value.
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
if not placetype then
return fun(nil), nil
end
return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
props and props.continue_on_nil_only)
end
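--[=[
Illustrative sketch only (standalone; `first_match`, `equivs`, `props` and `fun` are hypothetical names, not part of this
module). It mirrors the loop in `get_equiv_placetype_prop_from_equivs` to show how `continue_on_nil_only` changes which
equivalent placetype stops the iteration when `fun` returns `false`:

	local function first_match(equivs, fun, continue_on_nil_only)
		for _, equiv in ipairs(equivs) do
			local retval = fun(equiv.placetype)
			if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
				return retval, equiv
			end
		end
		return nil, nil
	end

	local equivs = {{placetype = "a"}, {placetype = "b"}}
	local props = {a = false, b = "x"}
	local function fun(pt) return props[pt] end

	first_match(equivs, fun)       -- skips the `false` for "a"; returns "x", equivs[2]
	first_match(equivs, fun, true) -- stops at the first non-nil value; returns false, equivs[1]

An explicit `false` therefore acts as a "stop here, nothing applies" marker only when `continue_on_nil_only` is set.
]=]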
--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
`"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
the beginning) has an entry in `placetype_data` that specifies the indefinite article using
`entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, the fallback would be to use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for
words beginning with a vowel and `"a"` otherwise; that call is commented out in the code below, so the empty string is
used instead.
If `ucfirst` is true, the first letter of the article is made upper-case.
]==]
function export.get_placetype_article(placetype, ucfirst)
local art
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if type(canon) == "table" then
art = canon.article
end
end
if art == false then
return art
end
if art == nil then
local placetype_use_the = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end)
if placetype_use_the then
art = "the"
else
art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article")
if not art then
art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] ""
end
end
end
if ucfirst then
art = m_strutils.ucfirst(art)
end
return art
end
--[==[
Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories
(e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified.
]==]
function export.get_placetype_entry_preposition(placetype)
local pt_prep = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "preposition") end
)
return pt_prep or "ใน"
end
--[==[
Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's
`holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding
to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's
`holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the
end of the value's list.
]==]
function export.key_holonym_into_place_desc(place_desc, holonym)
if not holonym.placetype then
return
end
-- Key in equivalent placetypes, so that e.g. `cities/San Francisco` gets keyed under `city`; but don't do
-- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms
-- of different types just because they have the same fallback.
local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true})
local unlinked_placename = holonym.unlinked_placename
for _, equiv in ipairs(equiv_placetypes) do
local placetype = equiv.placetype
if not place_desc.holonyms_by_placetype then
place_desc.holonyms_by_placetype = {}
end
if not place_desc.holonyms_by_placetype[placetype] then
place_desc.holonyms_by_placetype[placetype] = {unlinked_placename}
else
insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename)
end
end
end
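--[=[
Illustrative sketch only (standalone; the placetype key "country" and the helper `key_in` are assumptions made for the
example, since the real keys come from `get_placetype_equivs`). It shows the shape that `holonyms_by_placetype` takes
after two holonyms of the same placetype are keyed in:

	local place_desc = {}
	local function key_in(placetype, placename)
		place_desc.holonyms_by_placetype = place_desc.holonyms_by_placetype or {}
		local list = place_desc.holonyms_by_placetype[placetype]
		if list then
			table.insert(list, placename)
		else
			place_desc.holonyms_by_placetype[placetype] = {placename}
		end
	end

	key_in("country", "Italy")
	key_in("country", "France")
	-- place_desc.holonyms_by_placetype is now {country = {"Italy", "France"}}
]=]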
--[=[
Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the
placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This
will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to
whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype
data structure for the placetype, and `from_category` indicates that we are generating the description of a category
(otherwise we are generating the display form of an entry placetype).
]=]
local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror)
if not from_category and ptdata.disallow_in_entries then
if noerror then
return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]"
else
process_error("Placetype %s is not meant to be specified directly: " .. ptdata.disallow_in_entries, sg_placetype)
end
end
if link == nil then
internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype)
elseif link == true then
if orig_placetype then
return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
else
return ("[[%s]]"):format(sg_placetype)
end
elseif link == false then
process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype)
elseif link == "w" then
return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
elseif link == "separately" then
if orig_placetype then
local sg_words = split(sg_placetype, " ")
local orig_words = split(orig_placetype, " ")
if #sg_words ~= #orig_words then
internal_error("Can't construct 'separately' link for plural placetype %s as singular placetype %s " ..
"has different number of words", orig_placetype, sg_placetype)
else
for i = 1, #sg_words do
if sg_words[i] == orig_words[i] then
sg_words[i] = ("[[%s]]"):format(sg_words[i])
else
sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
end
end
return concat(sg_words, " ")
end
else
return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
end
elseif link:find("^%+") then
link = link:sub(2) -- discard initial +
return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype)
elseif not orig_placetype then
return link
else
return --[[require(en_utilities_module).pluralize(link)]] link
end
end
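--[=[
Illustrative sketch only (standalone; `link_separately` is a hypothetical helper that uses plain `string` and `table`
functions rather than this module's `split`/`concat` locals). It reproduces the word-by-word linking done above for
`link == "separately"`:

	local function link_separately(sg_placetype, orig_placetype)
		if not orig_placetype then
			-- Singular: link every word to itself.
			return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
		end
		local sg_words, orig_words = {}, {}
		for w in sg_placetype:gmatch("[^ ]+") do sg_words[#sg_words + 1] = w end
		for w in orig_placetype:gmatch("[^ ]+") do orig_words[#orig_words + 1] = w end
		assert(#sg_words == #orig_words, "word counts differ")
		for i = 1, #sg_words do
			if sg_words[i] == orig_words[i] then
				sg_words[i] = ("[[%s]]"):format(sg_words[i])
			else
				-- Plural word: link to the singular, display the plural.
				sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
			end
		end
		return table.concat(sg_words, " ")
	end

	link_separately("capital city")                   -- "[[capital]] [[city]]"
	link_separately("capital city", "capital cities") -- "[[capital]] [[city|cities]]"
]=]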
--[==[
Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the
plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying
as plural but linked to the singular form of the placetype). Otherwise, return nil. If we're generating the description
of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like
[[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or
`"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description
for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with
special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the
"full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is
prefixed with `"names of "`. `noerror` says not to throw an error when encountering entry placetypes that would be
disallowed.
]==]
function export.get_placetype_display_form(placetype, category_type, return_full, noerror)
local from_category = not not category_type
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if canon_placetype then
local raw_link
local function is_linked_string(str)
return type(str) == "string" and str:find("%[%[")
end
if category_type then
local fetched_full
local function fetch_maybe_full(prop)
local retval = ptdata["full_" .. prop]
if retval ~= nil then
if return_full then
return retval, true
else
internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval)
end
end
return ptdata[prop], false
end
local function maybe_prefix(str)
if return_full and not fetched_full then
return "names of " .. str
else
return str
end
end
-- Careful with `false` as possible value.
if category_type == "top-level" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
elseif category_type == "noncity" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
elseif category_type == "city" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
else
internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
category_type)
end
if type(raw_link) == "string" then
return maybe_prefix(raw_link), ptdata
elseif raw_link ~= nil then
return raw_link, ptdata
end
raw_link, fetched_full = fetch_maybe_full("category_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
if ptmatch == "plural" then
raw_link, fetched_full = fetch_maybe_full("plural_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
end
if raw_link == nil then
raw_link, fetched_full = fetch_maybe_full("link")
end
if raw_link == false then
return raw_link, ptdata
end
return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
else
if ptmatch == "plural" then
raw_link = ptdata.plural_link
if raw_link == false then
process_error("Placetype %s cannot appear plural", placetype)
end
if is_linked_string(raw_link) then
return raw_link, ptdata
end
end
if raw_link == nil then
raw_link = ptdata.link
end
return make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata
end
end
return nil
end
local function resolve_unlinked_placename_display_aliases(placetype, placename)
local equiv_placetypes = export.get_placetype_equivs(placetype)
for i, equiv in ipairs(equiv_placetypes) do
equiv_placetypes[i] = equiv.placetype
end
local all_display_aliases_found = {}
local all_others_found = {}
for group, key, spec in m_locations.iterate_matching_location {
placetypes = equiv_placetypes,
placename = placename,
alias_resolution = "display",
} do
if spec.alias_of and spec.display then
insert(all_display_aliases_found, {group, key, spec, spec.display_as_full})
else
insert(all_others_found, {group, key, spec})
end
end
if not all_display_aliases_found[1] then
return placename
elseif all_display_aliases_found[2] then
internal_error("Found multiple matching display aliases for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
elseif all_others_found[1] then
internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
else
local group, key, spec, as_full = unpack(all_display_aliases_found[1])
local full, elliptical = m_locations.key_to_placename(group, key)
return as_full and full or elliptical
end
end
--[==[
If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged.
Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`,
`country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as
`United States`.
'''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they
should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed.
For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to
`Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political
connotations. Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!)
to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two
terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to
`North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly
display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve
alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and
consistency.
]==]
function export.resolve_placename_display_aliases(placetype, placename)
-- If the placename is a link, apply the alias inside the link.
-- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will
-- be empty.
local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
if link then
if linktext ~= "" then
local alias = resolve_unlinked_placename_display_aliases(placetype, linktext)
return "[[" .. link .. "|" .. alias .. "]]"
else
local alias = resolve_unlinked_placename_display_aliases(placetype, link)
return "[[" .. alias .. "]]"
end
else
return resolve_unlinked_placename_display_aliases(placetype, placename)
end
end
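--[=[
Illustrative sketch only (standalone; it uses plain `string.match` where the code above uses `rmatch`, which is assumed
to behave the same way for these ASCII inputs). It shows what the link pattern captures for piped links, unpiped links
and bare placenames:

	local pattern = "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$"

	local link, linktext = string.match("[[United States of America|USA]]", pattern)
	-- link == "United States of America", linktext == "USA"
	link, linktext = string.match("[[USA]]", pattern)
	-- link == "USA", linktext == "" (unpiped link: empty link text)
	link, linktext = string.match("USA", pattern)
	-- link == nil, linktext == nil (not a link, so the placename is resolved as-is)
]=]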
--[==[
Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key.
]==]
function export.get_prefixed_key(key, spec)
if spec.the then
return "the " .. key
else
return key
end
end
-- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary.
export.iterate_matching_location = m_locations.iterate_matching_location
--[=[
Iterator that iterates over holonyms in `place_desc`. If `first_holonym_index` is given, start iterating at the
specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If
`first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is
specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified
by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the
holonym index and holonym structure, similar to `ipairs()`.
]=]
function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
local stop_at_also = not not first_holonym_index
return function(place_desc, index)
while true do
index = index + 1
local this_holonym = place_desc.holonyms[index]
-- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also`
-- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym
-- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with
-- `:also`.
if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then
return nil
end
-- If not placetype, we're processing raw text, which we normally want to skip.
if include_raw_text_holonyms or this_holonym.placetype then
return index, this_holonym
end
end
end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end
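--[=[
Illustrative usage sketch only (the `place_desc` below is a hand-built stand-in with just the fields this iterator
reads; real place descriptions carry more). Starting at index 2, iteration skips the raw-text holonym and stops before
the later holonym marked with `:also`:

	local place_desc = {holonyms = {
		{placetype = "city", unlinked_placename = "Chicago"},
		{placetype = "state", unlinked_placename = "Illinois", continue_cat_loop = true},
		{}, -- raw text: no `placetype`
		{placetype = "country", unlinked_placename = "United States"},
		{placetype = "state", unlinked_placename = "Indiana", continue_cat_loop = true},
	}}

	for index, holonym in export.get_holonyms_to_check(place_desc, 2) do
		-- Visits index 2 (Illinois; `:also` on the starting holonym is ignored) and index 4 (United States),
		-- then stops before index 5 because that holonym has `continue_cat_loop` set.
	end

Passing `true` as the third argument would additionally yield the raw-text holonym at index 3.
]=]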
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all
such known locations, returning for each location the corresponding key, spec and group as well as the trail of
ancestral containers. Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between
the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data`
are:
* `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with
`iterate_matching_location()`.
* `holonym_placename`: The placename of the holonym.
* `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the
holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms
following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none
exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.)
* `place_desc`: Description of the place; used for the holonyms, to check for container mismatches.
Returns four values: the location group, the canonical key by which the location is known, the spec object describing
the location and the trail of ancestral containers for the location. The first three values are the same as for
`iterate_matching_location`.
]==]
function export.iterate_matching_holonym_location(data)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
local matching_location_iterator = m_locations.iterate_matching_location {
placetypes = holonym_placetype,
placename = holonym_placename,
}
return function()
while true do
local group, key, spec = matching_location_iterator()
if not group then
return nil
end
local container_trail = {}
-- For each level of container, check that there are no mismatches (i.e. another location of the same
-- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container
-- at that level. For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city
-- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of
-- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark,
-- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New
-- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough,
-- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If
-- there are no mismatches at any level we assume we're dealing with the right known location.
--
-- If at a given level there are multiple containing locations, we count a match if any holonym matches any
-- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any
-- containing location.
local containers_mismatch = false
for containers in m_locations.iterate_containers(group, key, spec) do
insert(container_trail, containers)
local match_at_level = false
local mismatch_at_level = false
for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc,
holonym_index and holonym_index + 1 or nil) do
local other_source_holonym = other_holonym.augmented_from_holonym
if other_source_holonym and other_source_holonym.placetype == holonym_placetype and
other_source_holonym.unlinked_placename ~= holonym_placename then
-- Ignore holonyms added during the augmentation process for other holonyms of the same
-- placetype as the placetype of the holonym we're considering. See comment in
-- augment_holonyms_with_container() for why we do this.
-- continue; grrr, no 'continue' in Lua
else
local holonym_matches_at_level = false
local holonym_exists_with_same_placetype = false
for _, container in ipairs(containers) do
if not container.spec.no_check_holonym_mismatch then
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
local placetypes = container.spec.placetype
if type(placetypes) ~= "table" then
placetypes = {placetypes}
end
local placetype_equivs = {}
for _, pt in ipairs(placetypes) do
m_table.extend(placetype_equivs, export.get_placetype_equivs(pt))
end
local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype and
(other_holonym.unlinked_placename == full_container_placename or
other_holonym.unlinked_placename == elliptical_container_placename)
end
)
if this_holonym_matches then
holonym_matches_at_level = true
break
end
local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype
end
)
if this_holonym_exists_with_same_placetype then
-- We seem to have a mismatch at this level. But before we decide conclusively that this
-- is the case, check to see whether the putative mismatch is an alias and matches when
-- we resolve the alias.
for oh_group, oh_key, oh_spec, oh_container_trail in
export.iterate_matching_holonym_location {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = place_desc,
} do
local oh_full_placename, oh_elliptical_placename =
m_locations.key_to_placename(oh_group, oh_key)
if oh_full_placename == full_container_placename or
oh_elliptical_placename == elliptical_container_placename then
-- Alias matched when resolved.
this_holonym_matches = true
break
end
end
if this_holonym_matches then
-- Alias matched above when resolved.
holonym_matches_at_level = true
break
else
-- Not an alias, or doesn't match when resolved. We have a true mismatch.
holonym_exists_with_same_placetype = true
end
end
end
end
if holonym_matches_at_level then
match_at_level = true
break
end
if holonym_exists_with_same_placetype then
mismatch_at_level = true
end
end
end
if not match_at_level and mismatch_at_level then
containers_mismatch = true
break
end
end
if not containers_mismatch then
return group, key, spec, container_trail
end
end
end
end
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the
corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
local all_found = {}
for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
insert(all_found, {group, key, spec, container_trail})
end
if not all_found[1] then
return nil
elseif all_found[2] then
local holonym_placetype = data.holonym_placetype
if type(holonym_placetype) == "table" then
holonym_placetype = concat(holonym_placetype, ",")
end
local found_keys = {}
for _, found in ipairs(all_found) do
local _, key, _, _ = unpack(found)
insert(found_keys, key)
end
error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys)))
else
return unpack(all_found[1])
end
end
------------------------------------------------------------------------------------------
-- Placename and placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their
canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which
applies to categorization and other processes but not to display.
The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ",
"จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g.
"census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype.
Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be
interpreted as "department", "อำเภอ" or "division").
]==]
export.placetype_aliases = {
["acomm"] = "autonomous community",
["adr"] = "administrative region",
["adterr"] = "administrative territory", -- Pakistan
["aobl"] = "autonomous oblast",
["aokr"] = "autonomous okrug",
["ap"] = "autonomous province",
["apref"] = "autonomous prefecture",
["aprov"] = "autonomous province",
["ar"] = "autonomous region",
["arch"] = "archipelago",
["arep"] = "autonomous republic",
["aterr"] = "autonomous territory",
["atu"] = "autonomous territorial unit",
["bor"] = "borough",
["c"] = "ประเทศ",
["can"] = "canton",
["carea"] = "council area",
["cc"] = "constituent country",
["cdblock"] = "community development block",
["cdep"] = "Crown dependency",
["CDP"] = "census-designated place",
["cdp"] = "census-designated place",
["clcity"] = "county-level city",
["co"] = "เทศมณฑล",
["cobor"] = "county borough",
["colcity"] = "county-level city",
["coll"] = "collectivity",
["comm"] = "community",
["cont"] = "ทวีป",
["contr"] = "continental region",
["contregion"] = "continental region",
["cpar"] = "civil parish",
["damun"] = "direct-administered municipality",
["dep"] = "dependency",
["department capital"] = "departmental capital",
["dept"] = "department",
["depterr"] = "dependent territory",
["dist"] = "อำเภอ",
["distmun"] = "district municipality",
["div"] = "division",
["emp"] = "จักรวรรดิ",
["fpref"] = "French prefecture",
["gov"] = "governorate",
["govnat"] = "governorate",
["home-rule city"] = "home rule city",
["home-rule municipality"] = "home rule municipality",
["inner-city area"] = "inner city area",
["ires"] = "Indian reservation",
["isl"] = "เกาะ",
["lbor"] = "London borough",
["lga"] = "local government area",
["lgarea"] = "local government area",
["lgd"] = "local government district",
["lgdist"] = "local government district",
["metbor"] = "metropolitan borough",
["metcity"] = "มหานคร",
["metmun"] = "metropolitan municipality",
["mtn"] = "ภูเขา",
["mun"] = "เทศบาล",
["mundist"] = "municipal district",
["nonmetropolitan county"] = "non-metropolitan county",
["obl"] = "oblast",
["okr"] = "okrug",
["p"] = "จังหวัด",
["par"] = "parish",
["parmun"] = "parish municipality",
["pen"] = "peninsula",
["plcity"] = "prefecture-level city",
["plcolony"] = "Polish colony",
["pref"] = "prefecture",
["prefcity"] = "prefecture-level city",
["preflcity"] = "prefecture-level city",
["prov"] = "จังหวัด",
["r"] = "ภูมิภาค",
["range"] = "เทือกเขา",
["rcm"] = "regional county municipality",
["rcomun"] = "regional county municipality",
["rdist"] = "regional district",
["rep"] = "republic",
["rhrom"] = "rural hromada",
["riv"] = "แม่น้ำ",
["rmun"] = "regional municipality",
["robor"] = "royal borough",
["romp"] = "Roman province",
["runit"] = "regional unit",
["rurmun"] = "rural municipality",
["s"] = "รัฐ",
["sar"] = "special administrative region",
["shrom"] = "settlement hromada",
["spref"] = "subprefecture",
["sprefcity"] = "sub-prefectural city",
["sprovcity"] = "subprovincial city",
["submet city"] = "sub-metropolitan city",
["submetropolitan city"] = "sub-metropolitan city",
["sub-prefecture-level city"] = "sub-prefectural city",
["sub-provincial city"] = "subprovincial city",
["sub-provincial district"] = "subprovincial district",
["terr"] = "ดินแดน",
["terrauth"] = "territorial authority",
["twp"] = "township",
["twpmun"] = "township municipality",
["uauth"] = "unitary authority",
["ucomm"] = "unincorporated community",
["udist"] = "unitary district",
["uhrom"] = "urban hromada",
["uterr"] = "union territory",
["utwpmun"] = "united township municipality",
["val"] = "valley",
["vdc"] = "village development committee",
["vil"] = "village",
["voi"] = "voivodeship",
["wcomm"] = "Welsh community",
}
local no_link_def_article = {link = false, article = "the"}
local no_link_no_article = {link = false, article = false}
--[==[ var:
These qualifiers can be prepended onto any placetype and will be handled correctly. For example, the placetype
`large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the
value in the following table is a string, the qualifier will display according to the string. If the value is `true`,
the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be
linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain
those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating
`inland sea` as equivalent to `sea`.
]==]
export.placetype_qualifiers = {
-- generic qualifiers
["huge"] = false,
["tiny"] = false,
["large"] = false,
["big"] = false,
["mid-size"] = false,
["mid-sized"] = false,
["small"] = false,
["sizable"] = false,
["important"] = false,
["long"] = false,
["short"] = false,
["major"] = false,
["minor"] = false,
["high"] = false,
["tall"] = false,
["low"] = false,
["left"] = false, -- left tributary
["right"] = false, -- right tributary
["modern"] = false, -- for use in opposition to "ancient" in another definition
-- "former" qualifiers
["abandoned"] = true,
["ancient"] = true,
["deserted"] = true,
["extinct"] = true,
["former"] = false,
["historic"] = "historical",
["historical"] = true,
["medieval"] = true,
["mediaeval"] = true,
["ruined"] = true,
["traditional"] = true,
-- sea qualifiers
["coastal"] = true,
["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]]
["maritime"] = true,
["overseas"] = true,
["seaside"] = true,
["beachfront"] = true,
["beachside"] = true,
["riverside"] = true,
-- lake qualifiers
["freshwater"] = true,
["saltwater"] = true,
["endorheic"] = true,
["oxbow"] = true,
["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link
["tidal"] = true,
-- land qualifiers
["hilltop"] = true,
["hilly"] = true,
["insular"] = true,
["peninsular"] = true,
["chalk"] = true,
["karst"] = true,
["limestone"] = true,
["mountainous"] = true,
["mountaintop"] = true,
["alpine"] = true,
["volcanic"] = true, -- for an island
-- political status qualifiers
["autonomous"] = true,
["incorporated"] = true,
["special"] = true,
["unincorporated"] = true,
["coterminous"] = true,
-- monetary status/etc. qualifiers
["fashionable"] = true,
["wealthy"] = true,
["affluent"] = true,
["declining"] = true,
-- city vs. rural qualifiers
["urban"] = true,
["suburban"] = true,
["exurban"] = true,
["outlying"] = true,
["remote"] = true,
["rural"] = true,
["outback"] = true,
["inner"] = false,
["inner-city"] = true,
["central"] = false,
["outer"] = false,
-- land use qualifiers
["residential"] = true,
["agricultural"] = true,
["business"] = true,
["commercial"] = true,
["industrial"] = true,
-- business use qualifiers
["railroad"] = true,
["railway"] = true,
["farming"] = true,
["fishing"] = true,
["mining"] = true,
["logging"] = true,
["cattle"] = true,
-- tourism use qualifiers
["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
-- religious qualifiers
["holy"] = true,
["sacred"] = true,
["religious"] = true,
["secular"] = true,
-- qualifiers for nonexistent places
["claimed"] = false,
["fictional"] = true,
["legendary"] = true,
["mythical"] = true,
["mythological"] = true,
-- directional qualifiers
["northern"] = false,
["southern"] = false,
["eastern"] = false,
["western"] = false,
["north"] = false,
["south"] = false,
["east"] = false,
["west"] = false,
["northeastern"] = false,
["southeastern"] = false,
["northwestern"] = false,
["southwestern"] = false,
["northeast"] = false,
["southeast"] = false,
["northwest"] = false,
["southwest"] = false,
-- seasonal qualifiers
["summer"] = true, -- e.g. for 'summer capital'
["winter"] = true,
-- legal status qualifiers
-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
["official"] = true,
["unofficial"] = true,
["de facto"] = true, -- 'de facto capital'
["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
["de jure"] = true, -- 'de jure capital'
["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
-- misc. qualifiers
["planned"] = true,
["chartered"] = true,
["landlocked"] = true,
["uninhabited"] = true,
-- superlative qualifiers
["first"] = no_link_def_article,
["second"] = no_link_def_article, -- for "second largest" etc.
["third"] = no_link_def_article,
["fourth"] = no_link_def_article,
["last"] = no_link_def_article,
["only"] = no_link_def_article,
["sole"] = no_link_def_article,
["main"] = no_link_def_article,
["largest"] = no_link_def_article,
["biggest"] = no_link_def_article,
["smallest"] = no_link_def_article,
["shortest"] = no_link_def_article,
["longest"] = no_link_def_article,
["tallest"] = no_link_def_article,
["highest"] = no_link_def_article,
["lowest"] = no_link_def_article,
["leftmost"] = no_link_def_article,
["rightmost"] = no_link_def_article,
["innermost"] = no_link_def_article,
["outermost"] = no_link_def_article,
["northernmost"] = no_link_def_article,
["southernmost"] = no_link_def_article,
["westernmost"] = no_link_def_article,
["easternmost"] = no_link_def_article,
["northwesternmost"] = no_link_def_article,
["southwesternmost"] = no_link_def_article,
["northeasternmost"] = no_link_def_article,
["southeasternmost"] = no_link_def_article,
-- several/various
["several"] = no_link_no_article,
["various"] = no_link_no_article,
["numerous"] = no_link_no_article,
["multiple"] = no_link_no_article,
["many"] = no_link_no_article,
["other"] = no_link_no_article,
}
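-- Illustrative sketch only; it is commented out and not used by the module. It shows one way the three scalar value
-- kinds documented above for `placetype_qualifiers` (string, `true`, `false`) could be turned into display text. The
-- helper name `display_qualifier` is hypothetical, and the table-valued entries (`no_link_def_article` etc.) are
-- simply shown unlinked here, which is an assumption about their handling.
--[===[
local function display_qualifier(qualifier, value)
	if type(value) == "string" then
		-- Explicit display string, e.g. "[[de facto]]" for `de-facto`.
		return value
	elseif value == true then
		-- Link to the qualifier's own entry.
		return "[[" .. qualifier .. "]]"
	end
	-- `false` (or, in this sketch, a table): display as-is without a link.
	return qualifier
end

assert(display_qualifier("historic", "historical") == "historical")
assert(display_qualifier("coastal", true) == "[[coastal]]")
assert(display_qualifier("large", false) == "large")
assert(display_qualifier("largest", {link = false, article = "the"}) == "largest")
]===]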
--[==[ var:
In this table, the key qualifiers should be treated the same as the value qualifiers for categorization purposes. This
is overridden by `placetype_data` and `qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
["abandoned"] = {"FORMER"},
["ancient"] = {"ANCIENT", "FORMER"},
["former"] = {"FORMER"},
["extinct"] = {"FORMER"},
["historic"] = {"FORMER"},
["historical"] = {"FORMER"},
["medieval"] = {"ANCIENT", "FORMER"},
["mediaeval"] = {"ANCIENT", "FORMER"},
["ruined"] = {"ANCIENT", "FORMER"},
["traditional"] = {"FORMER"},
}
--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
["fictional"] = "fictional location",
["legendary"] = "mythological location",
["mythical"] = "mythological location",
["mythological"] = "mythological location",
-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
["claimed"] = "claimed political division",
}
--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
["autonomous community"] = "autonomous community capitals",
["canton"] = "cantonal capitals",
["comarca"] = "comarca capitals",
["ประเทศ"] = "เมืองหลวงของประเทศ",
-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "เมืองหลวงของเทศมณฑล",
["department"] = "departmental capitals",
["อำเภอ"] = "เมืองหลวงของอำเภอ",
["division"] = "division capitals",
["emirate"] = "emirate capitals",
["governorate"] = "governorate capitals",
["hromada"] = "hromada capitals",
["krai"] = "krai capitals",
["มหานคร"] = "เมืองหลวงของมหานคร",
["เทศบาล"] = "เมืองหลวงของเทศบาล",
["oblast"] = "oblast capitals",
["okrug"] = "okrug capitals",
["prefecture"] = "prefectural capitals",
["จังหวัด"] = "เมืองหลวงของจังหวัด",
["raion"] = "raion capitals",
["regency"] = "regency capitals",
["ภูมิภาค"] = "เมืองหลวงของภูมิภาค",
["regional unit"] = "regional unit capitals",
["republic"] = "republic capitals",
["รัฐ"] = "เมืองหลวงของรัฐ",
["ดินแดน"] = "เมืองหลวงของดินแดน",
["voivodeship"] = "voivodeship capitals",
}
--[==[ var:
This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple
ways that placenames can come to be preceded by "the":
# Listed here.
# Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code
just below the map.
# The placetype of the placename has `holonym_use_the = true` in its placetype_data.
# A regex in placename_the_re matches the placename.
Note that "the" is added only before the first holonym in a place description.
]==]
export.placename_article = {
-- This should only contain info that can't be inferred from [[Module:place/locations]].
["archipelago"] = {
["Cyclades"] = "the",
["Dodecanese"] = "the",
},
["ประเทศ"] = {
["Holy Roman Empire"] = "the",
},
["จักรวรรดิ"] = {
["Holy Roman Empire"] = "the",
},
["เกาะ"] = {
["North Island"] = "the",
["South Island"] = "the",
},
["ภูมิภาค"] = {
["Balkans"] = "the",
["Russian Far East"] = "the",
["Caribbean"] = "the",
["Caucasus"] = "the",
["Middle East"] = "the",
["New Territories"] = "the",
["North Caucasus"] = "the",
["South Caucasus"] = "the",
["West Bank"] = "the",
["Gaza Strip"] = "the",
},
["valley"] = {
["San Fernando Valley"] = "the",
},
}
--[==[ var:
Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all
holonyms, otherwise only the regexes for the holonym's placetype apply.
]==]
export.placename_the_re = {
-- We don't need entries for peninsulas, seas, oceans, gulfs or rivers
-- because they have holonym_use_the = true.
["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "},
["bay"] = {"^Bay of "},
["ทะเลสาบ"] = {"^Lake of "},
["ประเทศ"] = {"^Republic of ", " Republic$"},
["republic"] = {"^Republic of ", " Republic$"},
["ภูมิภาค"] = {" [Rr]egion$"},
["แม่น้ำ"] = {" River$"},
["local government area"] = {"^Shire of "},
["เทศมณฑล"] = {"^Shire of "},
["Indian reservation"] = {" Reservation", " Nation"},
["tribal jurisdictional area"] = {" Reservation", " Nation"},
}
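-- Illustrative sketch only; commented out and not used by the module. It shows how a table shaped like
-- `placename_the_re` could be consulted: the patterns under "*" are tried for every holonym, followed by those for the
-- holonym's own placetype. The helper name `matches_the_re` and the sample data are hypothetical, and plain
-- `string.find` is assumed for matching.
--[===[
local sample_the_re = {
	["*"] = {"^Isle of ", " Islands$"},
	["bay"] = {"^Bay of "},
}

local function matches_the_re(the_re, placetype, placename)
	for _, key in ipairs {"*", placetype} do
		for _, pattern in ipairs(the_re[key] or {}) do
			if placename:find(pattern) then
				return true
			end
		end
	end
	return false
end

assert(matches_the_re(sample_the_re, "country", "Isle of Man"))
assert(matches_the_re(sample_the_re, "bay", "Bay of Biscay"))
assert(not matches_the_re(sample_the_re, "city", "Bay of Biscay"))
]===]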
--[==[ var:
If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of
holonyms for categorization (but not display) purposes.
]==]
export.cat_implications = {
["ภูมิภาค"] = {
["Eastern Europe"] = {"continent/Europe"},
["Central Europe"] = {"continent/Europe"},
["Western Europe"] = {"continent/Europe"},
["South Europe"] = {"continent/Europe"},
["Southern Europe"] = {"continent/Europe"},
["Northern Europe"] = {"continent/Europe"},
["Northeast Europe"] = {"continent/Europe"},
["Northeastern Europe"] = {"continent/Europe"},
["Southeast Europe"] = {"continent/Europe"},
["Southeastern Europe"] = {"continent/Europe"},
["North Caucasus"] = {"continent/Europe"},
["South Caucasus"] = {"continent/Asia"},
["South Asia"] = {"continent/Asia"},
["Southern Asia"] = {"continent/Asia"},
["East Asia"] = {"continent/Asia"},
["Eastern Asia"] = {"continent/Asia"},
["Central Asia"] = {"continent/Asia"},
["West Asia"] = {"continent/Asia"},
["Western Asia"] = {"continent/Asia"},
["Southeast Asia"] = {"continent/Asia"},
["North Asia"] = {"continent/Asia"},
["Northern Asia"] = {"continent/Asia"},
["Anatolia"] = {"continent/Asia"},
["Asia Minor"] = {"continent/Asia"},
["Mesopotamia"] = {"continent/Asia"},
["North Africa"] = {"continent/Africa"},
["Central Africa"] = {"continent/Africa"},
["West Africa"] = {"continent/Africa"},
["East Africa"] = {"continent/Africa"},
["Southern Africa"] = {"continent/Africa"},
["Central America"] = {"continent/Central America"},
["Caribbean"] = {"continent/North America"},
["Polynesia"] = {"continent/Oceania"},
["Micronesia"] = {"continent/Oceania"},
["Melanesia"] = {"continent/Oceania"},
["Siberia"] = {"country/Russia", "continent/Asia"},
["Russian Far East"] = {"country/Russia", "continent/Asia"},
["South Wales"] = {"constituent country/Wales", "continent/Europe"},
["Balkans"] = {"continent/Europe"},
["West Bank"] = {"country/Palestine", "continent/Asia"},
["Gaza"] = {"country/Palestine", "continent/Asia"},
["Gaza Strip"] = {"country/Palestine", "continent/Asia"},
}
}
------------------------------------------------------------------------------------------
-- Category and display handlers --
------------------------------------------------------------------------------------------
local function city_type_cat_handler(data)
local entry_placetype = data.entry_placetype
local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities")
if not generic_before_non_cities then
internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" ..
" setting", entry_placetype)
end
local plural_entry_placetype = export.pluralize_placetype(entry_placetype)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and not spec.is_city then
-- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both
-- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.)
local cap_plural_entry_placetype = ucfirst(plural_entry_placetype)
local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th
if container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th
end
end
return retcats
end
end
local function capital_city_cat_handler(data, non_city)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
-- The first time we're called we want to return something; otherwise we will be called for later-mentioned
-- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in
-- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital
-- category/categories we add below.
local retcats
if not non_city and place_desc.holonyms then
for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do
local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename
retcats = city_type_cat_handler {
entry_placetype = "นคร",
holonym_placetype = h_placetype,
holonym_placename = h_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if retcats then
break
end
end
end
if not retcats then
retcats = {}
end
-- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we
-- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State
-- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory'
-- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's
-- an entry for 'autonomous community').
local capital_cat = export.placetype_to_capital_cat[holonym_placetype]
if not capital_cat then
capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")]
end
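-- Illustrative note (commented out, not executed): the fallback lookup above strips everything up to the last
-- space, so only the final word of the placetype is looked up. Placetypes without a space are left unchanged.
--[===[
assert((("autonomous region"):gsub("^.* ", "")) == "region")
assert((("union territory"):gsub("^.* ", "")) == "territory")
assert((("state"):gsub("^.* ", "")) == "state")
]===]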
if capital_cat then
capital_cat = ucfirst(capital_cat)
local inserted_specific_variant_cat = false
if holonym_index then
-- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern
-- where we use :also to specify that a given city is the capital at multiple surrounding levels.
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index
for h_index = holonym_index, #place_desc.holonyms do
if place_desc.holonyms[h_index].placetype then
matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location {
holonym_placetype = place_desc.holonyms[h_index].placetype,
holonym_placename = place_desc.holonyms[h_index].unlinked_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if matching_group then
matching_holonym_index = h_index
break
end
end
end
if matching_holonym_index == holonym_index then
if matching_container_trail[1] and not matching_spec.no_container_cat then
for _, container in ipairs(matching_container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
elseif matching_holonym_index then
-- Check to make sure that the holonym placetype we were called on is listed among the
-- divtypes of the location we found.
local function insert_specific_variant_if_possible(key, spec)
return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
local plural_holonym_placetype = export.pluralize_placetype(pt)
local saw_matching_div
if spec.divs then
local divs = spec.divs
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) ~= "table" then
div = {type = div}
end
if plural_holonym_placetype == div.type then
saw_matching_div = true
break
end
end
end
if saw_matching_div then
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
return true
end
return false
end)
end
if insert_specific_variant_if_possible(matching_key, matching_spec) then
inserted_specific_variant_cat = true
elseif not matching_spec.no_container_cat then
for _, containers in ipairs(matching_container_trail) do
local saw_no_container_cat = false
for _, container in ipairs(containers) do
if insert_specific_variant_if_possible(container.key, container.spec) then
inserted_specific_variant_cat = true
break
end
saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
end
if inserted_specific_variant_cat or saw_no_container_cat then
break
end
end
end
end
else
-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
end
if not inserted_specific_variant_cat then
insert(retcats, capital_cat)
end
else
-- We didn't recognize the holonym placetype; just put in 'Capital cities'.
insert(retcats, "เมืองหลวง")
end
return retcats
end
--[=[
This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used
in two ways:
# To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and
[[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym.
# To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym
description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this
case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments,
administrative regions, and for the entire country, and for example we only want to categorize a demonym into
[[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym
invocation only adds the most specific holonym category and not the category of any containing polity (hence if we
add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]).
This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston`
as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and
[[:Category:en:สถานที่ในthe United States]]. The city handler tries to deal with the possibility of multiple cities
having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]],
[[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing
polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`).
Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the
mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions
Ohio and a holonym for a Columbus in a different country is encountered, because of the function
`augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered.
The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding
language code).
]=]
local function generic_place_cat_handler(data)
local from_demonym = data.from_demonym
local retcats = {}
local function insert_retkey(key, spec)
if from_demonym then
insert(retcats, key)
else
insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec)))
end
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
if not spec.no_generic_place_cat then
-- This applies to continents and continental regions.
insert_retkey(key, spec)
end
-- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in
-- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when
-- no_container_cat is set (e.g. for 'United Kingdom').
if not spec.no_container_cat then
for _, container_set in ipairs(container_trail) do
local stop_adding_containers = false
for _, container in ipairs(container_set) do
if not container.spec.no_generic_place_cat then
insert_retkey(container.key, container.spec)
end
if container.spec.no_container_cat then
stop_adding_containers = true
end
end
if stop_adding_containers then
break
end
end
end
return retcats
end
end
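-- Illustrative sketch only; commented out and not used by the module. It restates the container walk in
-- generic_place_cat_handler() on hypothetical data: keys are collected for the matched location and for each level of
-- its container trail, skipping entries flagged `no_generic_place_cat` and stopping after a level that contains a
-- `no_container_cat` entry. The helper name `collect_keys` and the sample trail are assumptions for illustration.
--[===[
local function collect_keys(key, spec, container_trail)
	local keys = {}
	if not spec.no_generic_place_cat then
		table.insert(keys, key)
	end
	if not spec.no_container_cat then
		for _, container_set in ipairs(container_trail) do
			local stop_adding_containers = false
			for _, container in ipairs(container_set) do
				if not container.spec.no_generic_place_cat then
					table.insert(keys, container.key)
				end
				if container.spec.no_container_cat then
					stop_adding_containers = true
				end
			end
			if stop_adding_containers then
				break
			end
		end
	end
	return keys
end

local keys = collect_keys("inner", {}, {
	{{key = "middle", spec = {}}},
	{{key = "outer", spec = {no_generic_place_cat = true, no_container_cat = true}}},
	{{key = "outermost", spec = {}}},
})
-- "outer" is skipped but still stops the walk, so "outermost" is never reached.
assert(#keys == 2 and keys[1] == "inner" and keys[2] == "middle")
]===]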
--[==[
Special category handler run for all placetypes that checks for specified division placetypes of known locations and
categorizes appropriately.
]==]
function export.political_division_cat_handler(data)
if data.from_demonym then
return
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
local sgdiv = export.maybe_singularize_placetype(div.type) or div.type
local prep = div.prep or "ของ"
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
if not export.placetype_data[sgdiv] then
internal_error("Placetype %s associated with known location key %s and data %s not found in " ..
"`placetype_data`", sgdiv, key, spec)
end
if sgdiv == data.entry_placetype then
local retcats = {}
for _, pt_cat in ipairs(cat_as) do
if type(pt_cat) == "string" then
pt_cat = {type = pt_cat}
end
local pt_prep = pt_cat.prep or prep
insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th
end
return retcats
end
end
end
end
end
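-- Illustrative sketch only; commented out and not used by the module. It mirrors how the handler above normalizes a
-- `divs` value, which may be a single string or a list mixing strings and tables, into a list of tables with a `type`
-- field. The helper name `normalize_divs` and the sample data are hypothetical.
--[===[
local function normalize_divs(divs)
	if type(divs) ~= "table" then
		divs = {divs}
	end
	local normalized = {}
	for _, div in ipairs(divs) do
		if type(div) == "string" then
			div = {type = div}
		end
		table.insert(normalized, div)
	end
	return normalized
end

local single = normalize_divs("prefectures")
assert(#single == 1 and single[1].type == "prefectures")
local mixed = normalize_divs {"states", {type = "territories", prep = "of"}}
assert(mixed[1].type == "states" and mixed[2].prep == "of")
]===]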
--[==[
This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any
foreign-language terms that are translations of the state of Georgia. We look at the page title (or its overridden value
in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the
modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the
entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the
country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized
into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to
make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is
intended for Newark, New Jersey).
]==]
function export.get_bare_categories(args, overall_place_spec)
local bare_cats = {}
local place_descs = overall_place_spec.descs
local possible_placetypes_by_place_desc = {}
for i, place_desc in ipairs(place_descs) do
possible_placetypes_by_place_desc[i] = {}
for _, placetype in ipairs(place_desc.placetypes) do
if not export.placetype_is_ignorable(placetype) then
local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true})
for _, equiv in ipairs(equivs) do
insert(possible_placetypes_by_place_desc[i], equiv.placetype)
end
end
end
end
local function check_term(term)
-- Treat Wikipedia links like local ones.
term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
term = export.remove_links_and_html(term)
term = term:gsub("^the ", "")
for i, place_desc in ipairs(place_descs) do
-- Iterate over all matching locations in case there are multiple, as with Delhi defined as
-- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}.
for group, key, spec, container_trail in export.iterate_matching_holonym_location {
holonym_placetype = possible_placetypes_by_place_desc[i],
holonym_placename = term,
place_desc = place_desc,
} do
insert(bare_cats, key)
end
end
end
-- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)?
-- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There
-- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the
-- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The
-- advantage of checking when the language isn't English is we catch those places that fail to give an English
-- translation but where the translation happens to be the same as the other-language spelling. However, I don't
-- know how often this situation occurs.
check_term(args.pagename or mw.title.getCurrentTitle().subpageText)
for _, t in ipairs(args.t) do
check_term(t)
end
local function check_termobj_list(terms)
for _, term in ipairs(terms) do
if term.eq then
check_term(term.eq)
end
if term.alt or term.term then
check_term(term.alt or term.term)
end
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
local arg = extra_info_terms.arg
if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then
check_termobj_list(extra_info_terms.terms)
end
end
for _, directive in ipairs(overall_place_spec.directives) do
check_termobj_list(directive.terms)
end
return bare_cats
end
--[==[
This is used to augment the holonyms associated with a place description with the containing polities. For example,
given the following:
`# {{tl|place|en|subprefecture|pref/Hokkaido}}.`
We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]].
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
for _, place_desc in ipairs(place_descs) do
if place_desc.holonyms then
-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
-- modifier, which causes category processing to restart at that point after generating categories for a
-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do
local holonym = place_desc.holonyms[i]
if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then
local group, key, spec, container_trail = export.find_matching_holonym_location {
holonym_placetype = holonym.placetype,
holonym_placename = holonym.unlinked_placename,
holonym_index = i,
place_desc = place_desc,
}
if group and container_trail[1] and not spec.no_auto_augment_container then
local augmented_holonyms = {}
for j = 1, i do
insert(augmented_holonyms, place_desc.holonyms[j])
end
for _, containers in ipairs(container_trail) do
local any_no_auto_augment_container = false
for _, container in ipairs(containers) do
any_no_auto_augment_container = any_no_auto_augment_container or
container.spec.no_auto_augment_container
local containing_type = container.spec.placetype
if type(containing_type) == "table" then
-- If the containing type is a list, use the first element as the canonical variant.
containing_type = containing_type[1]
end
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
-- Don't side-effect holonyms while processing them.
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to
-- set display_placename.
placetype = containing_type,
-- placename_to_key() for the group should correctly handle both full and elliptical
-- placenames, but the full placename seems less likely to be ambiguous. FIXME: We
-- should just store the key directly and use it when available to avoid having to
-- convert key to placename and back to key.
unlinked_placename = full_container_placename,
-- Indicate that this is an augmented holonym, and was derived from the specified
-- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms
-- derived from holonyms that are different from the holonym we're searching for but
-- of the same placetype. This is to correctly handle a situation like
-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
-- `Bouches-du-Rhône`, `Vaucluse` gets augmented correctly but `Gard` fails to match
-- in find_matching_holonym_location() because of the mismatch between augmented
-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
-- `Ardèche`) against any known location. To deal with this, we mark augmented
-- holonyms as being augmented due to a source holonym, and when processing a given
-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
-- The restriction to the same placetype is so that `Birmingham` still gets
-- correctly disambiguated to Birmingham, England in the example given above near
-- the top of this function, using the augmented holonym `c/United Kingdom` added by
-- the specified `cc/England` (whose placetype `constituent country` differs from
-- the placetype `city` of Birmingham).
augmented_from_holonym = holonym,
}
insert(augmented_holonyms, new_holonym)
-- But it is safe to modify other parts of the place_desc.
export.key_holonym_into_place_desc(place_desc, new_holonym)
end
if any_no_auto_augment_container then
break
end
end
for j = i + 1, #place_desc.holonyms do
insert(augmented_holonyms, place_desc.holonyms[j])
end
place_desc.holonyms = augmented_holonyms
end
end
end
end
end
end
-- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions. We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
-- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
-- categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
-- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
-- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
-- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
-- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
-- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" ("in"), not "ของ" ("of"), and
-- note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
-- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
-- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
-- categorization.
local function district_neighborhood_cat_handler(data)
local function get_plural_entry_placetype(location_spec, container_trail)
if data.entry_placetype == "suburb" then
return "Suburbs"
else
-- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling
if uses_british_spelling == nil and container_trail then
for _, container_set in ipairs(container_trail) do
local must_outer_break = false
for _, container in ipairs(container_set) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
end
end
-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
-- etc.)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and spec.is_city then
return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
end
-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
local has_neighborhoods
local entry_placetype = data.entry_placetype
if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
has_neighborhoods = true
else
-- Otherwise, make sure the current holonym is city-like.
has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
return export.get_placetype_prop(pt, "has_neighborhoods")
end, {continue_on_nil_only = true})
end
if has_neighborhoods then
-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". We need
-- to start with the current holonym, which is especially important for neighborhoods and suburbs that
-- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously
-- we skipped the first/current holonym.)
for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc,
data.holonym_index) do
local other_holonym_data = {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = data.place_desc,
}
local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data)
if group and not spec.is_former_place then
return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") ..
export.get_prefixed_key(key, spec)}
end
end
end
end
function export.check_already_seen_string(holonym_placename, already_seen_strings)
local canon_placename = ulower(m_links.remove_links(holonym_placename))
if type(already_seen_strings) ~= "table" then
already_seen_strings = {already_seen_strings}
end
for _, already_seen_string in ipairs(already_seen_strings) do
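-- Note: `find` is called without the `plain` flag, so each already-seen string is treated as a Lua pattern.
if canon_placename:find(already_seen_string) then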
return true
end
end
return false
end
-- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display
-- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already.
-- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or
-- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym
-- placename, ignoring case and links. If the prefix isn't already present, we create a link that
-- uses the raw form as the link destination but the prefixed form as the display form, unless the
-- holonym already has a link in it, in which case we just add the prefix.
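-- Illustrative calls (a sketch only, not taken from the callers in this module; "County" and "Cork" are example
-- values, and ALREADY_SEEN_STRINGS is omitted so the lowercased prefix is what gets checked):
--   prefix_display_handler("County", "Cork")        --> "County [[Cork]]"
--   prefix_display_handler("County", "[[Cork]]")    --> "County [[Cork]]"
--   prefix_display_handler("County", "County Cork") --> "County Cork" (prefix already present, returned unchanged)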
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return prefix .. " " .. holonym_placename
end
return prefix .. " [[" .. holonym_placename .. "]]"
end
-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
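-- Illustrative calls (a sketch only, with example values; ALREADY_SEEN_STRINGS is omitted). The last call passes a
-- truthy INCLUDE_SUFFIX_IN_LINK, which places the suffix inside the generated link:
--   suffix_display_handler("parish", "Orleans")                       --> "[[Orleans]] parish"
--   suffix_display_handler("parish", "[[Orleans]]")                   --> "[[Orleans]] parish"
--   suffix_display_handler("parish", "Orleans Parish")                --> "Orleans Parish" (suffix already present)
--   suffix_display_handler("Voivodeship", "Subcarpathian", nil, true) --> "[[Subcarpathian Voivodeship]]"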
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return holonym_placename .. " " .. suffix
end
if include_suffix_in_link then
return "[[" .. holonym_placename .. " " .. suffix .. "]]"
else
return "[[" .. holonym_placename .. "]] " .. suffix
end
end
-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.new_york_boroughs[unlinked_placename] then
-- Hack: don't display "borough" after the names of NYC boroughs
return holonym_placename
end
return suffix_display_handler("borough", holonym_placename)
end
local function county_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
m_locations.northern_ireland_counties["County " .. unlinked_placename .. ", Northern Ireland"] then
return prefix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County".
if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County".
if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- FIXME, we need the same for US counties but need to key off the country, not the specific county.
-- Others are displayed as-is.
return holonym_placename
end
-- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture".
-- Others are displayed as e.g. "[[Fthiotida]] prefecture".
local function prefecture_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture"
return suffix_display_handler(suffix, holonym_placename)
end
-- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized
-- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is.
local function province_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if
m_locations.iran_provinces[unlinked_placename .. ", Iran"] or
m_locations.laos_provinces[unlinked_placename .. ", Laos"] or
m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or
m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or
m_locations.thailand_provinces[unlinked_placename .. ", ไทย"] or
m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or
m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then
return suffix_display_handler("จังหวัด", holonym_placename)
end
return holonym_placename
end
-- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is.
local function state_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then
return suffix_display_handler("รัฐ", holonym_placename)
end
return holonym_placename
end
-- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]].
local function voivodesip_display_handler(holonym_placetype, holonym_placename)
return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link")
end
------------------------------------------------------------------------------------------
-- Placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are
placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value
is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form
`สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are
used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the
specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories
like [[:Category:States and territories of Australia]]).
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the
placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are
wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed
directly in the placetype data; everything else is handled through category handlers, either per-placetype or special
(such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate
categories is described at the top of [[Module:place]].
There are several recognized property keys, of various types:
1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
placetypes (which may occur in entry placetypes, esp. new-format ones such as `two <<islands>>`, and may occur in
categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
`<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign. For example, if `cercle` specifies
`"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified.
*# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it
will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as
`<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given.
*# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is
called on the string, which will correctly pluralize most strings, including those with links in them. (If there
are multiple links, the display form of the last link is pluralized.)
*# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as
an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with
the qualifiers `former`, `ancient`, `historical` and such.
* `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of
the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the
value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays
as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if
this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs
especially with multiword placetypes where something other than the last word is pluralized. An example is
`town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. This example uses
`link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian
Bokmål word, and template calls aren't currently permitted in link strings), along with
`plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to
the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only
placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of
`category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it,
spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the
value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which
just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a
separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which
clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories
where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides
`category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides
`category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. If given, this overrides `category_link` for
this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol>
<li>`link = true`</li>
<li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li>
<li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li>
</ol> This has the effect of making the entry placetype `neighborhood` display as just
`<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like
`Neighborhoods in Illinois, USA` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`.
* `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
* `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
2. There is currently one fallback-related property key recognized:
* `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories
get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets
`preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi`
(whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex
example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that
checks for cities or city-like entities (e.g. boroughs) occurring as holonyms and categorizes the toponym under
[[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods in POLDIV]] (for
the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if
`c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these
categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be
categorized as e.g. [[:Category:Geographic and cultural areas of England]].
3. There is currently one property to control irregular plurals of placetypes:
* `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in
[[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`,
`-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent;
for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized
as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value
even when the default pluralization algorithm works correctly, if the default singularization algorithm won't
correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`).
4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those
categories:
* `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by
a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated
alike); (b) to determine the parent category of bare placetype categories (e.g. [[:Category:Villages]] for placetype
`village`); (c) to determine whether to add a parent category `political divisions of specific countries` to
qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are:
*# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire.
*# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement.
*# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a
settlement, such as wards and barangays.
*# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an
unincorporated community, farm or neighborhood.
*# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital
any more.
*# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc.
*# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university,
metro station, park or the like.
*# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary
greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`,
etc. qualifier has no effect on the category of these placetypes.
*# `generic place`: a place that isn't further qualified into any specific subtype.
* `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`,
`ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of
`dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc.
qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class`
is used. When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those
in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers
(one or both of `ANCIENT` and `FORMER`, which are written in all-caps to distinguish them from user-specified
qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are
looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map
`medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes
`ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by
`get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default
category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where
`kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data`
for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is
used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data`
but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal
error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have
an entry for `greenhouse`), we just track the occurrence and end up not categorizing.
* `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the
placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for
placetype `named buildings!`). If not specified, the first parent category is determined by the value of `class`,
using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]].
* `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent`
just above).
* `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of
`bare_category_parent` if it is a string.
* `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or
`ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that
always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is
a list of internal qualifiers (one or both of `ANCIENT` and `FORMER`), just as for `former_qualifiers`, and the
implementation is the same.
* `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the
placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which
category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`,
`neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like
`Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like
`Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized
city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers
iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or
more categories are returned; see below for more specifics. (Note that countries for which e.g. a `district` is a
political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but
by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the
resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The
return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the
holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype
with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields:
** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an
entry in `placetype_data` but may not be the original placetype given by the user);
** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed;
** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME:
we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms);
** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]];
** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or
{{tl|demonym-noun}}, instead of being triggered by {{tl|place}}.
* `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the
`district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `;
see the section just above on `cat_handler`.
5. The following preposition-related property keys are recognized:
* `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`.
* `generic_before_non_cities`: If specified, the appropriate category description handler in
[[Module:category tree/topic cat/data/Places]] will recognize categories of the form
` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This
is used to generate descriptions for categories added by category handlers and by explicit category specs in the
placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify
a value for `class` so that the category tree code can determine whether it's a political or non-political division.
* `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities.
6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype:
* `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values
are:
*# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly
follows an entry placetype);
*# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple);
*# `"suf"` (the holonym will display as `Holonym placetype`);
*# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized).
* `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym.
Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly
using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype
`administrative region` specifies `suffix = "ภูมิภาค"`.
* `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym.
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the
holonym. If `affix` is given together with `suffix` or `prefix` for a single placetype, the more specific `suffix` or
`prefix` takes precedence.
* `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix
requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies
`affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies
`no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified,
without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word.
* `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym).
Its return value is a string specifying the display form of the holonym.
7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms
of the specified placetype.
* `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype.
* `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry
placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article
`"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins
with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent
placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype
specified.
* `holonym_use_the`: Use `"the"` before holonyms of this placetype.
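As a rough illustration of how `affix_type`, `suffix`/`affix` and `no_affix_strings` (item 6 above) might interact, here
is a minimal sketch. The helper names `contains_word` and `display_with_suffix` are hypothetical and are not part of this
module; only the `"suf"` and `"Suf"` cases are modelled, word boundaries are ASCII-only, and the real formatting code may
well differ.

```lua
-- Hypothetical helper: case-insensitive, whole-word search (ASCII word boundaries only).
local function contains_word(text, word)
	-- Escape pattern magic characters in the word before building the pattern.
	local escaped = word:lower():gsub("%p", "%%%0")
	return text:lower():find("%f[%w]" .. escaped .. "%f[%W]") ~= nil
end

-- Hypothetical helper: display a holonym with its placetype added as a suffix.
local function display_with_suffix(placetype, placename, spec)
	-- `no_affix_strings` defaults to the placetype itself and may be a string or a list.
	local no_affix = spec.no_affix_strings or placetype
	if type(no_affix) == "string" then
		no_affix = {no_affix}
	end
	for _, word in ipairs(no_affix) do
		if contains_word(placename, word) then
			return placename -- the holonym already names its own type; add nothing
		end
	end
	-- `suffix` takes precedence over `affix`; fall back to the placetype itself.
	local suffix = spec.suffix or spec.affix or placetype
	if spec.affix_type == "Suf" then
		-- Capitalize the first letter of each word.
		suffix = suffix:gsub("%f[%a]%a", string.upper)
	end
	return placename .. " " .. suffix
end
```

Under these assumptions, a spec of `{affix_type = "Suf", no_affix_strings = "okrug"}` turns `Nenets` into
`Nenets Autonomous Okrug` but leaves `Nenets Okrug` unchanged.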
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which
must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a
fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the
fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either
directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this
placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back
to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in
`placetype_data` or an internal error occurs.
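The fallback rules in the notes above can be sketched as follows. This is only an illustration under stated assumptions:
`sample_data` is a hypothetical miniature of `placetype_data` (the `class` value given for `village` is assumed), and
`resolve_property` is a hypothetical helper, not a function exported by this module.

```lua
-- Hypothetical miniature of `placetype_data`; only `fallback` and `class` are modelled.
local sample_data = {
	["frazione"] = {fallback = "hamlet"},
	["hamlet"] = {fallback = "village"},
	["village"] = {class = "settlement"},
}

-- Hypothetical helper: follow `fallback` links until a placetype specifying `prop` is found.
-- Returns the property value and the placetype it was found on, or nil if no placetype in
-- the chain specifies it.
local function resolve_property(data, placetype, prop)
	local seen = {}
	local current = placetype
	while current do
		if seen[current] then
			error(("Internal error: fallback loop at placetype '%s'"):format(current))
		end
		seen[current] = true
		local spec = data[current]
		if not spec then
			error(("Internal error: placetype '%s' not found in placetype data"):format(current))
		end
		if spec[prop] ~= nil then
			return spec[prop], current
		end
		current = spec.fallback
	end
	return nil
end
```

With this sample data, looking up `class` for `frazione` walks through `hamlet` and finds the value on `village`, while a
loop or a missing fallback target raises an error.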
]==]
export.placetype_data = {
--[=[
If you need to sort the following, do this (using Vim):
1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line
entries.
2. Make sure the table uses tabs everywhere for indent, and not spaces.
3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
:'a,.s/\n/\\n/g
:s/\\n\(\t\[\)/\r\1/g
The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while
the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to
a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one
command.)
4. Execute the following to sort:
:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station"
before "hill" and "county borough" before "county" because the space that follows e.g. "hill" in "hill station" sorts
before the quotation mark that follows "hill" on its own. The above command deals with this by extracting the key,
prepending it followed by ` @@@ `, sorting, and then removing the key again (the classic decorate-sort-undecorate pattern).
5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
:'a,.s/\\n/\r/g
Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but
to insert a newline on the right side of a replacement you must use \r.
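
For reference, the decorate-sort-undecorate idea from step 4 can also be sketched in Lua. This is a minimal sketch, not
something this module uses: `sort_entries` is a hypothetical helper that assumes each entry has already been collapsed
onto a single line beginning with a tab and a quoted key, as after step 3.

```lua
-- Hypothetical helper: sort single-line table entries by their quoted key.
local function sort_entries(lines)
	-- Decorate: pair each line with its lowercased key (the text between the first pair of quotes).
	local decorated = {}
	for i, line in ipairs(lines) do
		local key = line:match('^\t%["(.-)"') or line
		decorated[i] = {key = key:lower(), line = line}
	end
	-- Sort on the key alone, so that "hill" precedes "hill station".
	table.sort(decorated, function(a, b) return a.key < b.key end)
	-- Undecorate: keep only the original lines, now in key order.
	local sorted = {}
	for i, item in ipairs(decorated) do
		sorted[i] = item.line
	end
	return sorted
end
```

Because the comparison sees only the extracted key, the quotation mark and the rest of the line no longer influence the
order.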
]=]
["*"] = {
link = false,
cat_handler = generic_place_cat_handler,
},
["administrative atoll"] = {
-- Maldives
link = "+w:administrative divisions of the Maldives",
preposition = "ของ",
class = "subpolity",
},
["administrative capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["administrative center"] = {
link = "w",
fallback = "เมืองหลวงที่ไม่ใช่นคร",
},
["administrative centre"] = {
link = "w",
fallback = "administrative center",
},
["administrative county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["administrative district"] = {
link = "w",
fallback = "อำเภอ",
},
["administrative headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["administrative region"] = {
link = true,
preposition = "ของ",
suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
fallback = "ภูมิภาค",
class = "subpolity",
},
["administrative seat"] = {
link = "w",
fallback = "administrative centre",
},
["administrative territory"] = {
link = "separately",
preposition = "ของ",
suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
fallback = "ดินแดน",
class = "subpolity",
},
["administrative unit"] = {
-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
link = "w",
class = "subpolity",
},
["administrative village"] = {
link = "w",
preposition = "ของ",
has_neighborhoods = true,
class = "settlement",
},
["aimag"] = {
-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
link = "w",
fallback = "prefecture",
},
["airport"] = {
link = true,
class = "man-made structure",
default = {true},
},
["alliance"] = {
link = true,
fallback = "confederation",
},
["archipelago"] = {
link = true,
fallback = "เกาะ",
},
["area"] = {
link = true,
preposition = "ของ",
fallback = "geographic and cultural area",
-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
class = "subpolity",
former_type = "geographic region",
cat_handler = district_neighborhood_cat_handler,
},
["arm"] = {
link = true,
preposition = "ของ",
class = "natural feature",
default = {"ทะเล"},
},
["arrondissement"] = {
link = true,
preposition = "ของ",
-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity",
has_neighborhoods = true,
},
["associated province"] = {
link = "separately",
fallback = "จังหวัด",
},
["atoll"] = {
-- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to
-- conditionalize `class` on the country. See also `administrative atoll`.
link = true,
class = "natural feature",
bare_category_parent = "เกาะ",
default = {true},
},
["autonomous city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
has_neighborhoods = true,
},
["autonomous community"] = {
-- Spain; refers to regional entities, not village-like entities, as might be expected from "community"
link = true,
preposition = "ของ",
class = "subpolity",
},
["autonomous island"] = {
-- Comoros; seems similar to an administrative atoll of the Maldives.
link = "+w:autonomous islands of Comoros",
preposition = "ของ",
class = "subpolity",
},
["autonomous oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "oblast",
class = "subpolity",
},
["autonomous okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "okrug",
class = "subpolity",
},
["autonomous prefecture"] = {
link = true,
fallback = "prefecture",
},
["autonomous province"] = {
link = "w",
fallback = "จังหวัด",
},
["autonomous region"] = {
link = "w",
preposition = "ของ",
fallback = "administrative region",
-- "administrative region" sets a suffix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region"
-- if the user writes 'ar:Suf/Tibet'.
affix = "autonomous region",
},
["autonomous republic"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territorial unit"] = {
-- Moldova; only two of them, one for Gagauzia and one for Transnistria.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territory"] = {
link = "w",
fallback = "dependent territory",
},
["bailiwick"] = {
-- Jersey, etc.
link = true,
fallback = "องค์การทางการเมือง",
},
["barangay"] = {
-- Philippines
link = true,
class = "settlement",
-- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use
-- some of the properties of a neighborhood.
fallback = "neighborhood",
},
["barrio"] = {
-- Spanish-speaking countries; Philippines
link = true,
-- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city.
-- `class` will need to conditionalize on the country to be completely correct.
fallback = "neighborhood",
},
["basin"] = {
link = true,
fallback = "ทะเลสาบ",
},
["bay"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["beach"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"water"},
default = {true},
},
["beach resort"] = {
link = "w",
fallback = "resort town",
},
["bishopric"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["bodies of water!"] = {
-- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to
-- straighten out the type vs. name vs. related-to issue.
category_link = "[[body of water|bodies of water]]",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"},
},
["borough"] = {
link = true,
preposition = "ของ",
display_handler = borough_display_handler,
has_neighborhoods = true,
-- "former borough" could be a former settlement or a former part of a city but seems more likely to
-- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this
-- properly.
class = "subpolity",
-- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger.
},
["borough seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["branch"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["bridge"] = {
link = true,
class = "man-made structure",
default = {"Named bridges"},
},
["building"] = {
link = true,
class = "man-made structure",
default = {"Named buildings"},
},
["built-up area"] = {
link = "w",
fallback = "area",
},
["burgh"] = {
link = true,
fallback = "borough",
},
["business park"] = {
link = true,
fallback = "park",
},
["caliphate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["canton"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["cape"] = {
link = true,
fallback = "headland",
},
["capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["เมืองหลวง"] = {
link = true,
category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
bare_category_parent = "นคร",
cat_handler = capital_city_cat_handler,
default = {true},
-- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}}
-- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't
-- match against the placetype 'city' of Melbourne.
fallback = "นคร",
},
["caplc"] = {
link = "[[capital]] and [[large]]st [[city]]",
plural_link = false,
fallback = "เมืองหลวง",
},
["captaincy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["caravan city"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"ANCIENT", "FORMER"},
},
["castle"] = {
link = true,
fallback = "building",
},
["cathedral city"] = {
link = true,
fallback = "นคร",
},
["cattle station"] = {
-- Australia
link = true,
fallback = "farm",
},
["census area"] = {
link = true,
affix_type = "Suf",
has_neighborhoods = true,
class = "non-admin settlement",
},
["census-designated place"] = {
-- United States
link = true,
class = "non-admin settlement",
},
["census division"] = {
-- Canada
link = "w",
preposition = "ของ",
class = "subpolity",
},
["census town"] = {
link = "w",
fallback = "เมือง",
},
["central business district"] = {
link = true,
fallback = "neighborhood",
},
["cercle"] = {
-- Mali
link = "+w:cercles of Mali",
preposition = "ของ",
class = "subpolity",
},
["ceremonial county"] = {
link = true,
fallback = "เทศมณฑล",
},
["chain of islands"] = {
link = "[[chain]] of [[island]]s",
plural = "chains of islands",
plural_link = "[[chain]]s of [[island]]s",
fallback = "เกาะ",
},
["channel"] = {
link = true,
fallback = "strait",
},
["charter community"] = {
-- Northwest Territories, Canada
link = "w",
fallback = "village",
},
["นคร"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "นครใน+++", "ประเทศใน+++", "เมืองหลวงของ"},
default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"},
},
["civil parish"] = {
-- Mostly England; similar to municipalities
link = true,
preposition = "ของ",
affix_type = "suf",
has_neighborhoods = true,
class = "subpolity",
},
["claimed political division"] = {
link = "[[claim]]ed [[political]] [[division]]",
class = "subpolity",
default = {true},
},
["co-capital"] = {
link = "[[co-]][[capital]]",
fallback = "เมืองหลวง",
},
["coal city"] = {
link = "+w:coal town",
fallback = "นคร",
},
["coal town"] = {
link = "w",
fallback = "เมือง",
},
["collectivity"] = {
link = "w",
preposition = "ของ",
-- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities)
class = "subpolity",
},
["colony"] = {
link = true,
fallback = "dependent territory",
},
["comarca"] = {
-- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of
-- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it
-- sits between municipalities and provinces, something like a county or district.
link = true,
preposition = "ของ",
class = "subpolity",
},
["commandery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["commonwealth"] = {
link = true,
preposition = "ของ",
-- No default; applies specifically to Puerto Rico
class = "subpolity",
},
["commune"] = {
link = true,
fallback = "เทศบาล",
},
["community"] = {
link = true,
category_link = "[[community|communities]] of all sizes",
fallback = "village",
},
["community development block"] = {
-- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be
-- neighborhoods so I'm not setting `has_neighborhoods` for now
link = "w",
affix_type = "suf",
no_affix_strings = "block",
class = "subpolity",
},
["comune"] = {
-- Italy, Switzerland
link = true,
fallback = "เทศบาล",
},
["condominium"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["confederacy"] = {
link = true,
fallback = "confederation",
},
["confederation"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["constituency"] = {
-- currently we have them as political divisions of Namibia but many countries have them
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent country"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent part"] = {
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["constituent republic"] = {
-- Of Russia, Yugoslavia, etc.
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["counties and county-level cities!"] = {
-- This is used when grouping counties and county-level cities under prefecture-level cities in China.
category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
class = "subpolity",
},
["continent"] = {
link = true,
category_link = false, -- can't occur as a bare category
class = "natural feature",
default = {"Continents and continental regions"},
},
["continental region"] = {
link = "separately",
category_link = false, -- can't occur as a bare category
class = "geographic region",
fallback = "continent",
},
["continents and continental regions!"] = {
category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
class = "geographic region",
},
["council area"] = {
link = true,
-- in Scotland; similar to a county
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["ประเทศ"] = {
link = true,
class = "polity", -- do not translate `class` values
["continent/*"] = {true, "ประเทศ"},
default = {true},
},
["country-like entities!"] = {
category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
class = "polity", -- do not translate `class` values
},
["เทศมณฑล"] = {
link = true,
preposition = "ของ",
display_handler = county_display_handler,
class = "subpolity",
},
["county borough"] = {
link = true,
-- in Wales; similar to a county
preposition = "ของ",
affix_type = "suf",
fallback = "borough",
class = "subpolity",
},
["county seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["county town"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
fallback = "เมือง",
has_neighborhoods = true,
class = "capital",
},
["county-administered city"] = {
-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w",
fallback = "นคร",
has_neighborhoods = true,
class = "settlement",
},
["county-controlled city"] = {
-- Taiwan
link = "w",
fallback = "county-administered city",
},
["county-level city"] = {
-- PR China
link = "w",
fallback = "prefecture-level city",
},
["crater lake"] = {
link = true,
fallback = "ทะเลสาบ",
},
["creek"] = {
link = true,
fallback = "stream",
},
["Crown colony"] = {
link = "+crown colony",
fallback = "crown colony",
},
["crown colony"] = {
link = true,
fallback = "colony",
},
["Crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["cultural area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["cultural region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["delegation"] = {
-- Tunisia
link = "+w:delegations of Tunisia",
preposition = "ของ",
class = "subpolity",
},
["department"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["departmental capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dependency"] = {
link = true,
fallback = "dependent territory",
},
["dependent territory"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
former_type = "dependent territory",
bare_category_parent = "political divisions",
["country/*"] = {true},
default = {true},
},
["desert"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems"},
default = {true},
},
["deserted mediaeval village"] = {
link = "w",
fallback = "deserted medieval village",
},
["deserted medieval village"] = {
link = "w",
fallback = "ANCIENT settlement",
},
["direct-administered municipality"] = {
-- China
link = "+w:direct-administered municipalities of China",
fallback = "เทศบาล",
},
["direct-controlled municipality"] = {
-- several countries
link = "w",
fallback = "เทศบาล",
},
["distributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["อำเภอ"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to
-- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class
-- is "settlement" or "subpolity".
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
-- No default. Countries for which districts are political divisions will get entries.
},
["districts and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Portugal.
category_link = "[[district]]s and [[autonomous region]]s",
class = "subpolity",
},
["districts and autonomous territorial units!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Moldova.
category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s",
class = "subpolity",
},
["district capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["district headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["district municipality"] = {
-- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in
-- South Africa, district municipalities group local municipalities and hence won't have neighborhoods.
link = "w",
preposition = "ของ",
affix_type = "suf",
no_affix_strings = {"อำเภอ", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["division"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["division capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dome"] = {
link = true,
fallback = "ภูเขา",
},
["dormant volcano"] = {
link = true,
fallback = "volcano",
},
["duchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["emirate"] = {
link = true,
preposition = "ของ",
-- FIXME: Can be subpolities (of the United Arab Emirates).
fallback = "องค์การทางการเมือง",
},
["จักรวรรดิ"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["enclave"] = {
link = true,
preposition = "ของ",
-- Enclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["entity"] = {
-- Bosnia and Herzegovina
link = "+w:entities of Bosnia and Herzegovina",
preposition = "ของ",
class = "subpolity",
},
["escarpment"] = {
link = true,
fallback = "ภูเขา",
},
["ethnographic region"] = {
-- used in Lithuania
link = "+w:ethnographic regions of Lithuania",
fallback = "geographic and cultural area",
},
["exclave"] = {
link = true,
preposition = "ของ",
-- exclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["external territory"] = {
link = "separately",
fallback = "dependent territory",
},
["farm"] = {
link = true,
class = "non-admin settlement",
default = {"Farms and ranches"},
},
["farms and ranches!"] = {
category_link = "[[farm]]s and [[ranch]]es",
class = "non-admin settlement",
},
["federal city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
},
["federal district"] = {
link = true,
preposition = "ของ",
-- Might have neighborhoods as federal districts are often cities (e.g. Mexico City)
has_neighborhoods = true,
class = "settlement",
},
["federal subject"] = {
-- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais,
-- autonomous okrugs and autonomous oblasts).
link = "w",
preposition = "ของ",
class = "subpolity",
},
["federal territory"] = {
link = "w",
fallback = "ดินแดน",
},
["fictional location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["First Nations reserve"] = {
-- Canada
link = "[[First Nations]] [[w:Indian reserve|reserve]]",
-- Wikipedia uses "Indian reserve"; presumably that is the legal term
fallback = "Indian reserve",
class = "subpolity",
},
["fjord"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["footpath"] = {
link = true,
fallback = "road",
},
["forest"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems", "forestry"},
default = {true},
},
["fort"] = {
link = true,
fallback = "building",
},
["fortress"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- fortresses -> fortresse, so put an entry here to ensure we singularize correctly.
plural = "fortresses",
fallback = "building",
},
["frazione"] = {
link = "w",
fallback = "hamlet",
},
["freeway"] = {
link = true,
fallback = "road",
},
["French prefecture"] = {
link = "[[w:prefectures in France|prefecture]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["geographic and cultural area"] = {
link = "+w:cultural area",
-- `generic_before_non_cities` is used when generating the category description of categories of the format
-- `Geographic and cultural areas of PLACE`. `preposition` is used when generating {{place}} description and
-- categories for any placetype that falls back to `geographic and cultural area`.
generic_before_non_cities = "ของ",
preposition = "ของ",
class = "geographic region",
bare_category_parent = "สถานที่",
["country/*"] = {true},
["constituent country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["geographic area"] = {
link = "+w:geographic region",
fallback = "geographic and cultural area",
},
["geographic region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geopolitical zone"] = {
-- Nigeria
link = true,
preposition = "ของ",
class = "subpolity",
},
["gewog"] = {
-- Bhutan
link = true,
preposition = "ของ",
class = "subpolity",
},
["ghost town"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
bare_category_parent = "former settlements",
cat_handler = city_type_cat_handler,
default = {true},
},
["glen"] = {
link = true,
fallback = "valley",
},
["governorate"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["greater administrative region"] = {
-- China (former division)
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["gromada"] = {
-- Poland (former division)
link = "w",
preposition = "ของ",
affix_type = "Pref",
class = "subpolity",
inherently_former = {"FORMER"},
},
["group of islands"] = {
link = "[[group]] of [[island]]s",
plural = "groups of islands",
plural_link = "[[group]]s of [[island]]s",
fallback = "island group",
},
["gulf"] = {
link = true,
preposition = "ของ",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["hamlet"] = {
link = true,
fallback = "village",
},
["harbor city"] = {
link = "separately",
fallback = "นคร",
},
["harbor town"] = {
link = "separately",
fallback = "เมือง",
},
["harbour city"] = {
link = "separately",
fallback = "นคร",
},
["harbour town"] = {
link = "separately",
fallback = "เมือง",
},
["headland"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["headquarters"] = {
link = "w",
fallback = "administrative centre",
},
["heath"] = {
link = true,
fallback = "moor",
},
["hemisphere"] = {
link = true,
entry_placetype_use_the = true,
fallback = "continental region",
},
["highway"] = {
link = true,
fallback = "road",
},
["hill"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["hill station"] = {
link = "w",
fallback = "เมือง",
},
["hill town"] = {
link = "w",
fallback = "เมือง",
},
["historic region"] = {
-- provided only for the link
link = "+w:historical region",
fallback = "FORMER geographic region",
},
["historical county"] = {
-- needed for historical counties of England/etc.
link = "+w:historic county",
fallback = "FORMER subpolity",
},
["historical region"] = {
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["home rule city"] = {
link = "w",
fallback = "นคร",
},
["home rule municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["hot spring"] = {
link = true,
fallback = "spring",
},
["house"] = {
link = true,
fallback = "building",
},
["housing estate"] = {
-- not the same as a housing project (i.e. public housing)
link = true,
-- not exactly the case but approximately
fallback = "neighborhood",
},
["hromada"] = {
-- Ukraine
link = "w",
disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'",
disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["inactive volcano"] = {
link = "w",
fallback = "dormant volcano",
},
["independent city"] = {
link = true,
fallback = "นคร",
},
["independent town"] = {
link = "+independent city",
fallback = "เมือง",
},
["Indian reservation"] = {
link = "w",
-- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations
-- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts
-- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is,
-- so this must still be the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["Indian reserve"] = {
link = "w",
-- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that
-- is still the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["inland sea"] = {
-- note, we also have 'inland' as a qualifier
link = true,
fallback = "ทะเล",
},
["inner city area"] = {
link = "[[inner city]] [[area]]",
fallback = "neighborhood",
},
["เกาะ"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["island country"] = {
-- FIXME: The following should map to both 'island' and 'country'.
link = "w",
fallback = "ประเทศ",
},
["island group"] = {
link = "separately",
fallback = "เกาะ",
},
["island municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["islet"] = {
link = "w",
fallback = "เกาะ",
},
["Israeli settlement"] = {
link = "w",
class = "settlement",
default = {true},
},
["judicial capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["khanate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["kibbutz"] = {
link = true,
plural = "kibbutzim",
class = "non-admin settlement",
default = {true},
},
["kingdom"] = {
link = true,
fallback = "monarchy",
},
["krai"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ทะเลสาบ"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["ธรณีสัณฐาน!"] = {
category_link = "[[ธรณีสัณฐาน]]",
bare_category_parent = "สถานที่",
addl_bare_category_parents = {"โลก"},
},
["largest city"] = {
link = "[[large]]st [[city]]",
entry_placetype_use_the = true,
fallback = "นคร",
has_neighborhoods = true,
},
["league"] = {
link = true,
fallback = "confederation",
},
["legislative capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["library"] = {
link = true,
fallback = "building",
},
["lieutenancy area"] = {
-- used in the United Kingdom; per Wikipedia:
-- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does
-- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of
-- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate
-- entities. In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on
-- those used for lieutenancy and local government between 1974 and 1996. The lieutenancy areas of Northern
-- Ireland correspond to the six counties and two former county boroughs.
link = "w",
fallback = "ceremonial county",
},
["local authority district"] = {
link = "w",
fallback = "local government district",
},
["local government area"] = {
-- Australia
link = "w",
preposition = "ของ",
class = "subpolity",
},
["local council"] = {
-- Malta; similar to municipalities
link = "+w:local councils of Malta",
preposition = "ของ",
fallback = "เทศบาล",
},
["local government district"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local government district with borough status"] = {
link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]",
plural = "local government districts with borough status",
plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local urban district"] = {
link = "w",
fallback = "unincorporated community",
},
["locality"] = {
link = "+w:locality (settlement)",
-- not necessarily true, but usually is the case
fallback = "village",
},
["London borough"] = {
link = "w",
preposition = "ของ",
affix_type = "pref",
affix = "borough",
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["macroregion"] = {
link = true,
fallback = "ภูมิภาค",
},
["man-made structures!"] = {
category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s",
bare_category_parent = "สถานที่",
},
["manor"] = {
-- FIXME: or is this more like a farm?
link = true,
fallback = "building",
},
["marginal sea"] = {
link = true,
preposition = "ของ",
fallback = "ทะเล",
},
["market city"] = {
link = "+market town",
fallback = "นคร",
},
["market town"] = {
link = true,
fallback = "เมือง",
},
["massif"] = {
link = true,
fallback = "ภูเขา",
},
["megacity"] = {
link = true,
fallback = "นคร",
},
["metro station"] = {
link = true,
class = "man-made structure",
},
["metropolitan borough"] = {
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"borough", "นคร"},
fallback = "local government district",
has_neighborhoods = true,
},
["มหานคร"] = {
-- These exist e.g. in Italy and are more like municipalities or even provinces than cities.
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"มหานคร", "นคร"},
class = "subpolity",
},
["metropolitan county"] = {
link = true,
fallback = "เทศมณฑล",
},
["metropolitan municipality"] = {
-- In South Africa, metropolitan municipalities group local municipalities and are like districts, between
-- provinces and municipalities.
-- In Turkey, metropolitan municipalities are province-level.
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"metropolitan", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["microdistrict"] = {
-- residential complex in post-Soviet states
link = true,
fallback = "neighborhood",
},
["micronations!"] = {
-- FIXME, merge with microstate
category_link = "[[micronation]]s",
bare_category_parent = "ประเทศ",
},
["microstate"] = {
link = true,
fallback = "ประเทศ",
},
["military base"] = {
link = "w",
class = "settlement", -- or "man-made structure"?
default = {true},
},
["minster town"] = {
-- England
link = "separately",
fallback = "เมือง",
},
["monarchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["moor"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"},
default = {true},
},
["moorland"] = {
link = true,
fallback = "moor",
},
["motorway"] = {
link = true,
fallback = "road",
},
["ภูเขา"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["mountain indigenous district"] = {
-- Taiwan
link = "+w:district (Taiwan)",
fallback = "อำเภอ",
},
["mountain indigenous township"] = {
-- Taiwan
link = "+w:township (Taiwan)",
fallback = "township",
},
["mountain pass"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "mountain passes",
class = "natural feature",
addl_bare_category_parents = {"ภูเขา"},
default = {true},
},
["เทือกเขา"] = {
link = true,
fallback = "ภูเขา",
},
["mountainous region"] = {
link = "separately",
fallback = "ภูมิภาค",
},
["mukim"] = {
-- Malaysia, Brunei, Indonesia, Singapore
link = true,
preposition = "ของ",
class = "subpolity",
},
["municipal district"] = {
link = "w",
-- meaning varies depending on the country; for now, assume no neighborhoods.
-- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms.
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "เทศบาล",
},
["เทศบาล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true,
class = "subpolity",
},
["municipality with city status"] = {
link = "[[municipality]] with [[w:city status|city status]]",
plural = "municipalities with city status",
plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]",
fallback = "เทศบาล",
},
["museum"] = {
link = true,
fallback = "building",
},
["mythological location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["named bridges!"] = {
category_link = "notable [[bridge]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"bridges"},
},
["named buildings!"] = {
category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"buildings"},
},
["named roads!"] = {
category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"roads"},
},
["national capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["national park"] = {
link = true,
fallback = "park",
},
["natural features!"] = {
category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s",
bare_category_parent = "สถานที่",
},
["neighborhood"] = {
-- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which
-- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the
-- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the
-- category name. It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods".
-- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also
-- categorize as neighbo(u)rhoods.)
link = true,
-- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]].
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
-- The following text is suitable for the top-level description of a neighborhood as well as categories of the
-- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form
-- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]".
category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions",
-- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`,
-- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings
-- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in
-- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY`
-- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but
-- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.)
preposition = "ของ",
class = "non-admin settlement",
cat_handler = district_neighborhood_cat_handler,
},
["neighbourhood"] = {
link = true,
category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions",
fallback = "neighborhood",
},
["new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
preposition = "ใน",
class = "subpolity", --?
},
["new town"] = {
link = true,
fallback = "เมือง",
},
["เมืองหลวงที่ไม่ใช่นคร"] = {
link = "[[เมืองหลวง]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
cat_handler = function(data)
return capital_city_cat_handler(data, "non-city")
end,
-- FIXME, do we need the following?
default = {true},
},
["non-metropolitan county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["non-metropolitan district"] = {
link = "w",
fallback = "local government district",
},
["non-sovereign kingdom"] = {
-- especially in Africa and Asia
link = "+w:non-sovereign monarchy",
generic_before_non_cities = "ใน",
class = "subpolity",
["country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["non-sovereign monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["oblasts and autonomous republics!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Ukraine.
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s",
class = "subpolity",
},
["มหาสมุทร"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"ทะเล", "bodies of water"},
default = {true},
},
["okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["overseas collectivity"] = {
link = "w",
fallback = "collectivity",
},
["overseas department"] = {
link = "w",
fallback = "department",
},
["overseas territory"] = {
link = "w",
fallback = "dependent territory",
},
["parish"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["parish municipality"] = {
-- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them.
link = "+w:parish municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true,
},
["parish seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
class = "capital",
has_neighborhoods = true,
},
["park"] = {
link = true,
class = "man-made structure",
default = {true},
},
["pass"] = {
link = "+mountain pass",
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "passes",
fallback = "mountain pass",
},
["path"] = {
link = true,
fallback = "road",
},
["peak"] = {
link = true,
fallback = "ภูเขา",
},
["peninsula"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["periphery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["สถานที่!"] = {
generic_before_non_cities = "ใน",
generic_before_cities = "ใน",
class = "generic place",
category_link = "[[place]]s of all sorts",
-- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and
-- language-specific variants such as [[Category:en:Places]]. The actual text for a language-specific variant is
-- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of"
-- portion is automatically generated by the appropriate handler in
-- [[Module:category tree/topic cat/data/Places]].
category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s",
bare_category_parent = "ชื่อ (หัวข้อ)",
},
["planned community"] = {
-- Include this so we don't categorize 'planned community' into villages, as 'community' does.
link = true,
class = "settlement",
has_neighborhoods = true,
},
["plateau"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
-- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category
},
["Polish colony"] = {
link = "[[w:colony (Poland)|colony]]",
affix_type = "suf",
affix = "colony",
fallback = "village",
has_neighborhoods = true,
},
["political divisions!"] = {
category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s",
bare_category_parent = "สถานที่",
},
["องค์การทางการเมือง"] = {
link = true,
category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]",
class = "polity", -- do not translate `class` values
bare_category_parent = "สถานที่",
default = {true},
},
["populated place"] = {
link = "+w:populated place",
-- not necessarily true, but usually is the case
fallback = "village",
},
["port"] = {
link = true,
class = "man-made structure",
default = {true},
},
["port city"] = {
-- FIXME: should categorize into "Ports" as well as "นคร"
link = true,
fallback = "นคร",
},
["port town"] = {
-- FIXME: should categorize into "Ports" as well as "เมือง"
link = "w",
fallback = "เมือง",
},
["prefecture"] = {
-- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France.
-- May need `has_neighborhoods` to be a function.
link = true,
preposition = "ของ",
display_handler = prefecture_display_handler,
class = "subpolity",
},
["prefecture-level city"] = {
-- China; they are huge entities with a central city; not cities themselves.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["preserved county"] = {
-- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more
-- "principal areas" (styled as "counties" or "county boroughs"), of which there are 22.
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["primary area"] = {
-- a grouping of "districts" (neighborhoods) in Gothenburg, Sweden
link = "+w:sv:primärområde",
fallback = "neighborhood",
},
["principality"] = {
link = true,
fallback = "monarchy",
},
["promontory"] = {
link = true,
fallback = "headland",
},
["protectorate"] = {
link = true,
fallback = "dependent territory",
},
["จังหวัด"] = {
link = true,
preposition = "ของ",
display_handler = province_display_handler,
class = "subpolity",
},
["provinces and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case China.
category_link = "[[province]]s and [[autonomous region]]s",
class = "subpolity",
},
["provinces and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Canada and Pakistan.
category_link = "[[province]]s and [[territory|territories]]",
class = "subpolity",
},
["provincial capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["raion"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ranch"] = {
link = true,
fallback = "farm",
},
["range"] = {
-- FIXME: Where is this used? Is it a mountain range?
link = true,
holonym_use_the = true,
class = "natural feature",
},
["regency"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["ภูมิภาค"] = {
link = true,
preposition = "ของ",
-- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area'
fallback = "geographic and cultural area",
-- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region
class = "geographic region",
},
["regional capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["regional county municipality"] = {
-- Quebec
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"เทศบาล", "เทศมณฑล"},
fallback = "เทศบาล",
},
["regional district"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "อำเภอ",
},
["regional municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
},
["regional unit"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["registration county"] = {
-- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical
-- purposes (registration of births, deaths and marriages, and for the output of census information).
link = "w",
fallback = "เทศมณฑล",
},
["republic"] = {
-- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case.
link = true,
fallback = "constituent republic",
},
["research base"] = {
link = "+w:research station",
fallback = "research station",
},
["research station"] = {
link = "w",
class = "non-admin settlement", -- or "man-made structure"?
default = {true},
},
["reservoir"] = {
link = true,
fallback = "ทะเลสาบ",
},
["residential area"] = {
link = "separately",
fallback = "neighborhood",
},
["resort city"] = {
link = "w",
fallback = "นคร",
},
["resort town"] = {
link = "w",
fallback = "เมือง",
},
["แม่น้ำ"] = {
link = true,
generic_before_non_cities = "ใน",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
cat_handler = city_type_cat_handler,
["continent/*"] = {true},
default = {true},
},
["river island"] = {
link = "w",
fallback = "เกาะ",
},
["road"] = {
link = true,
class = "man-made structure",
default = {"Named roads"},
},
["Roman province"] = {
-- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire'
link = "w",
default = {"Provinces of the Roman Empire"},
class = "subpolity",
},
["royal borough"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"royal", "borough"},
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["royal burgh"] = {
link = true,
fallback = "borough",
},
["royal capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["rural committee"] = {
-- Hong Kong; a group of villages
link = "w",
affix_type = "Suf",
has_neighborhoods = true,
class = "settlement",
},
["rural community"] = {
-- New Brunswick
link = "+w:list of municipalities in New_Brunswick#Rural communities",
fallback = "เทศบาล",
},
["rural hromada"] = {
link = "[[rural]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["rural municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["rural township"] = {
-- Taiwan
link = "+w:rural township (Taiwan)",
fallback = "township",
},
["sanctuary"] = {
link = true,
fallback = "temple",
},
["satrapy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["ทะเล"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["seaport"] = {
link = true,
fallback = "port",
},
["seat"] = {
link = true,
fallback = "administrative centre",
},
["self-administered area"] = {
-- Myanmar (groups self-administered divisions and zones)
link = "+w:self-administered zone",
preposition = "ของ",
class = "subpolity",
},
["self-administered division"] = {
-- Myanmar (only one of them: Wa Self-Administered Division)
link = "w",
fallback = "self-administered area",
},
["self-administered zone"] = {
-- Myanmar (five of them)
link = "w",
fallback = "self-administered area",
},
["separatist state"] = {
link = "separately",
fallback = "unrecognized country",
},
["การตั้งถิ่นฐาน"] = {
link = true,
category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s",
bare_category_parent = "สถานที่",
-- not necessarily true, but usually is the case
fallback = "village",
},
["settlement hromada"] = {
link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["sheading"] = {
-- Isle of Man
link = true,
fallback = "อำเภอ",
},
["sheep station"] = {
-- Australia
link = true,
fallback = "farm",
},
["shire"] = {
link = true,
fallback = "เทศมณฑล",
},
["shire county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["shire town"] = {
link = true,
fallback = "county seat",
},
["ski resort city"] = {
link = "[[ski resort]] [[city]]",
fallback = "นคร",
},
["ski resort town"] = {
link = "[[ski resort]] [[town]]",
fallback = "เมือง",
},
["spa city"] = {
link = "+w:spa town",
fallback = "นคร",
},
["spa town"] = {
link = "w",
fallback = "เมือง",
},
["space station"] = {
link = true,
fallback = "research station",
},
["special administrative region"] = {
-- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a
-- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia
-- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special
-- administrative areas"
link = "+w:special administrative regions of China",
preposition = "ของ",
class = "subpolity",
has_neighborhoods = true, --?
-- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves
-- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing
suffix = "",
},
["special collectivity"] = {
link = "w",
fallback = "collectivity",
},
["special municipality"] = {
-- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands
link = "w",
fallback = "เทศบาล",
},
["special ward"] = {
-- Tokyo
link = true,
fallback = "เทศบาล",
},
["spit"] = {
link = true,
fallback = "peninsula",
},
["spring"] = {
link = true,
class = "natural feature",
default = {true},
},
["star"] = {
link = true,
class = "natural feature",
default = {true},
},
["รัฐ"] = {
link = true,
preposition = "ของ",
class = "subpolity",
-- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign
-- entity. The latter appears more common (e.g. in various "ancient states" of East Asia).
former_type = "องค์การทางการเมือง",
},
["states and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Australia.
category_link = "[[state]]s and [[territory|territories]]",
class = "subpolity",
},
["states and union territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case India.
category_link = "[[state]]s and [[union territory|union territories]]",
class = "subpolity",
},
["state capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["state park"] = {
link = true,
fallback = "park",
},
["state-level new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
fallback = "new area",
},
["statistical region"] = {
-- Slovenia
link = true,
fallback = "administrative region",
},
["statutory city"] = {
link = "w",
fallback = "นคร",
},
["statutory town"] = {
link = "w",
fallback = "เมือง",
},
["strait"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["stream"] = {
link = true,
fallback = "แม่น้ำ",
},
["street"] = {
link = true,
fallback = "road",
},
["strip"] = {
link = true,
fallback = "geographic region",
},
["strip of land"] = {
link = "[[strip]] of [[land]]",
plural = "strips of land",
plural_link = "[[strip]]s of [[land]]",
fallback = "geographic region",
},
["sub-metropolitan city"] = {
link = "+w:List of cities in Nepal#Sub-metropolitan cities",
fallback = "นคร",
},
["sub-prefectural city"] = {
link = "w",
fallback = "subprovincial city",
},
["ตำบล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true, --?
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler
class = "subpolity",
default = {true},
},
["subdivision"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- FIXME: subdivisions can be neighborhood-like or larger; need a handler
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
},
["submerged ghost town"] = {
-- FIXME: Consider just having "submerged" as a qualifier.
link = "[[submerged]] [[ghost town]]",
fallback = "ghost town",
},
["subnational kingdom"] = {
link = "+w:subnational monarchy",
fallback = "non-sovereign kingdom",
},
["subnational monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["subprefecture"] = {
link = true,
affix_type = "suf",
preposition = "ของ",
class = "subpolity",
},
["subprovince"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["subprovincial city"] = {
link = "w",
-- China; special status given to certain prefecture-level cities
fallback = "prefecture-level city",
},
["subprovincial district"] = {
link = "w",
-- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts
preposition = "ของ",
class = "subpolity",
},
["subregion"] = {
link = true,
fallback = "geographic region",
},
["suburb"] = {
link = true,
-- The following text is suitable for the top-level description of a suburb as well as categories of the form
-- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago',
-- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]".
category_link = "[[suburb]]s of [[city|cities]]",
category_link_before_city = "[[suburb]]s",
-- See comments under "neighborhood" for the following three settings. They are used by
-- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories
-- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a
-- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.)
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
preposition = "ของ",
has_neighborhoods = true, --?
class = "non-admin settlement", --?
cat_handler = district_neighborhood_cat_handler,
},
["suburban area"] = {
link = "w",
fallback = "suburb",
},
["subway station"] = {
link = "w",
fallback = "metro station",
},
["sum"] = {
-- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia),
-- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion).
link = "+w:sum (administrative division)",
-- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler
-- which we don't want to be active (FIXME: If the display handler would be active, that's a bug).
fallback = "division",
},
["supercontinent"] = {
link = true,
fallback = "continent",
},
["tehsil"] = {
link = true,
affix_type = "suf",
no_affix_strings = {"tehsil", "tahsil"},
class = "subpolity",
},
["temple"] = {
link = true,
fallback = "building",
},
["territorial authority"] = {
link = "w",
fallback = "อำเภอ",
},
["ดินแดน"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["theme"] = {
link = "+w:theme (Byzantine district)",
preposition = "ของ",
class = "subpolity",
},
["เมือง"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["town with bystatus"] = {
-- can't use templates in links currently
link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]",
plural = "towns with bystatus",
plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]",
fallback = "เมือง",
},
["township"] = {
link = true,
has_neighborhoods = true,
class = "settlement", --?
default = {true},
},
["township municipality"] = {
-- Quebec
link = "+w:township municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["traditional county"] = {
link = true,
fallback = "เทศมณฑล",
},
["traditional region"] = {
-- FIXME: Verify this works. Same for 'historic(al) region'.
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["trail"] = {
link = true,
fallback = "road",
},
["treaty port"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"FORMER"},
},
["tributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["underground station"] = {
link = "w",
fallback = "metro station",
},
["unincorporated area"] = {
link = "w",
-- I don't know if this fallback makes sense everywhere.
fallback = "unincorporated community",
},
["unincorporated community"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
},
["unincorporated territory"] = {
link = "w",
fallback = "ดินแดน",
},
["union territory"] = {
-- India
link = true,
preposition = "ของ",
entry_placetype_indefinite_article = "a",
class = "subpolity",
},
["unitary authority"] = {
-- UK, New Zealand
link = true,
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["unitary district"] = {
link = "w",
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["united township municipality"] = {
-- Quebec
link = "+w:united township municipality (Quebec)",
entry_placetype_indefinite_article = "a",
fallback = "township municipality",
has_neighborhoods = true, --?
},
["university"] = {
link = true,
entry_placetype_indefinite_article = "a",
class = "man-made structure",
default = {true},
},
["unrecognised country"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized and nearly unrecognized countries!"] = {
category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}",
bare_category_parent = "country-like entities",
},
["unrecognized country"] = {
link = "w",
class = "polity", -- do not translate the class value
default = {"Unrecognized and nearly unrecognized countries"},
},
["unrecognised state"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized state"] = {
link = "w",
fallback = "unrecognized country",
},
["urban area"] = {
link = "separately",
fallback = "neighborhood",
},
["urban hromada"] = {
link = "[[urban]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["urban service area"] = {
-- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger
-- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]).
link = "w",
fallback = "นคร",
},
["urban township"] = {
link = "w",
fallback = "township",
},
["urban-type settlement"] = {
-- Appears to be a particular type of small urban settlement in post-Soviet states
-- that had an administrative function.
link = "w",
fallback = "เมือง",
},
["valley"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "water"},
default = {true},
},
["viceroyalty"] = {
-- in essence, a type of colony
link = true,
fallback = "dependent territory",
},
["village"] = {
link = true,
generic_before_non_cities = "ใน",
category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s",
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["village development committee"] = {
-- former administrative structure in Nepal; also exists in India but not as a formal unit
link = "+w:village development committee (Nepal)",
inherently_former = {"FORMER"},
fallback = "village",
},
["village municipality"] = {
-- Quebec
link = "+w:village municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["voivodeship"] = {
-- Poland
link = true,
display_handler = voivodeship_display_handler,
preposition = "ของ",
class = "subpolity",
},
["volcano"] = {
link = true,
plural = "volcanoes",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true, "ภูเขา"},
},
["ward"] = {
link = true,
class = "settlement",
-- Wards are formal administrative divisions of a city but have some properties of neighborhoods.
fallback = "neighborhood",
},
["watercourse"] = {
link = true,
fallback = "channel",
},
["Welsh community"] = {
-- Wales
link = "[[w:community (Wales)|community]]",
preposition = "ของ",
affix_type = "suf",
affix = "community",
has_neighborhoods = true,
class = "settlement",
},
["zone"] = {
-- administrative division of Ethiopia, Qatar, Nepal, India
link = "+w:zone#Place names",
preposition = "ของ",
class = "subpolity",
},
----------------------------------------------------------------------------------------------
-- Categories for former places --
----------------------------------------------------------------------------------------------
["ANCIENT capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
-- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still
-- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category.
default = {"Ancient settlements", "Former capitals"},
},
["ANCIENT non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "ANCIENT settlement",
},
["ANCIENT settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Ancient settlements"},
},
["ancient settlements!"] = {
category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]",
bare_category_parent = "former settlements",
},
["FORMER capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
default = {"Former capitals"},
},
["former capitals!"] = {
category_link = "former [[capital]] [[city|cities]] and [[town]]s",
bare_category_parent = "การตั้งถิ่นฐาน",
},
["former counties and county-level cities!"] = {
-- For categorizing former counties and county-level cities of China
category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level cities]]",
bare_category_breadcrumb = "counties and county-level cities",
bare_category_parent = "former political divisions",
},
["FORMER county"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER county-level city"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["former countries and country-like entities!"] = {
category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist",
bare_category_breadcrumb = "countries and country-like entities",
bare_category_parent = "former polities",
},
["FORMER country"] = {
link = false,
class = "polity", -- do not translate the class value
default = {"Former countries and country-like entities"},
},
["former dependent territories!"] = {
category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist",
bare_category_breadcrumb = "dependent territories",
bare_category_parent = "former political divisions",
},
["FORMER dependent territory"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former dependent territories"},
},
["former districts!"] = {
-- For categorizing former districts of China
category_link = "no-longer-existing [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "former political divisions",
},
["FORMER district"] = {
-- For categorizing former districts of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER geographic region"] = {
link = false,
fallback = "geographic and cultural area",
},
["FORMER man-made structure"] = {
link = false,
class = "man-made structure",
default = {"Former man-made structures"},
},
["former man-made structures!"] = {
category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist",
bare_category_breadcrumb = "man-made structures",
bare_category_parent = "former places",
},
["former municipalities!"] = {
-- For categorizing former municipalities of the Netherlands
category_link = "no-longer-existing [[municipality|municipalities]]",
bare_category_breadcrumb = "เทศบาล",
bare_category_parent = "former political divisions",
},
["FORMER municipality"] = {
-- For categorizing former municipalities of the Netherlands
link = false,
fallback = "FORMER subpolity",
},
["FORMER natural feature"] = {
link = false,
class = "natural feature",
default = {"Former natural features"},
},
["former natural features!"] = {
category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist",
bare_category_breadcrumb = "natural features",
bare_category_parent = "former places",
},
["FORMER non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "FORMER settlement",
},
["former places!"] = {
category_link = "[[place]]s of all sorts that no longer exist",
bare_category_breadcrumb = "former",
bare_category_parent = "สถานที่",
},
["former political divisions!"] = {
category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former places",
},
["former polities!"] = {
category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former places",
},
["FORMER polity"] = {
link = false,
class = "polity", -- do not translate the class value
default = {"Former polities"},
},
["former prefectures!"] = {
-- For categorizing former prefectures of China
category_link = "no-longer-existing [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "former political divisions",
},
["FORMER prefecture"] = {
-- For categorizing former prefectures of China
link = false,
fallback = "FORMER subpolity",
},
["former provinces!"] = {
-- For categorizing former provinces of China, etc.
category_link = "no-longer-existing [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "former political divisions",
},
["FORMER province"] = {
-- For categorizing ancient/historical/former provinces of the Roman Empire
link = false,
fallback = "FORMER subpolity",
},
["former region"] = {
-- A former region is considered a former political division, but not a 'historical/traditional/etc.' region.
link = "separately",
preposition = "ของ",
inherently_former = {"FORMER"},
class = "subpolity",
},
["FORMER settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Former settlements"},
},
["former settlements!"] = {
category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former political divisions",
},
["FORMER subpolity"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former political divisions"},
},
----------------------------------------------------------------------------------------------
-- form-of categories --
----------------------------------------------------------------------------------------------
---------- Abbreviations ----------
["abbreviations of counties!"] = {
-- For categorizing abbreviations of counties of e.g. England
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]",
bare_category_breadcrumb = "เทศมณฑล",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of places",
},
["abbreviations of departments!"] = {
-- For categorizing abbreviations of departments of e.g. France
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s",
bare_category_breadcrumb = "departments",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of districts!"] = {
-- For categorizing abbreviations of districts of e.g. ???
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of divisions!"] = {
-- For categorizing abbreviations of divisions of e.g. Bangladesh
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s",
bare_category_breadcrumb = "divisions",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of former countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of former places",
},
["abbreviations of former places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}},
},
["abbreviations of places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "สถานที่",
},
["abbreviations of political divisions!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "abbreviations of places",
},
["abbreviations of prefectures!"] = {
-- For categorizing abbreviations of prefectures of e.g. Japan
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces!"] = {
-- For categorizing abbreviations of provinces of e.g. Canada
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]",
bare_category_breadcrumb = "provinces and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of regions!"] = {
-- For categorizing abbreviations of regions of e.g. Italy
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s",
bare_category_breadcrumb = "ภูมิภาค",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states!"] = {
-- For categorizing abbreviations of states of e.g. the United States
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]",
bare_category_breadcrumb = "states and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and union territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]",
bare_category_breadcrumb = "states and union territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]",
bare_category_breadcrumb = "ดินแดน",
bare_category_parent = "abbreviations of political divisions",
},
["ABBREVIATION_OF country"] = {
link = false,
default = {"Abbreviations of countries"},
},
["ABBREVIATION_OF county"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF department"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF district"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF division"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF FORMER country"] = {
link = false,
default = {"Abbreviations of former countries"},
},
["ABBREVIATION_OF FORMER place"] = {
link = false,
default = {"Abbreviations of former places"},
},
["ABBREVIATION_OF place"] = {
link = false,
default = {"Abbreviations of places"},
},
["ABBREVIATION_OF prefecture"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF province"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF region"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF state"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF subpolity"] = {
link = false,
default = {"Abbreviations of political divisions"},
},
["ABBREVIATION_OF territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF union territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
---------- Archaic forms ----------
["archaic forms of places!"] = {
full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "archaic forms",
bare_category_parent = "สถานที่",
},
["ARCHAIC_FORM_OF place"] = {
link = false,
default = {"Archaic forms of places"},
},
---------- Clippings ----------
["clippings of places!"] = {
full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "clippings",
bare_category_parent = "สถานที่",
},
["CLIPPING_OF place"] = {
link = false,
default = {"Clippings of places"},
},
---------- Dated forms ----------
["dated forms of places!"] = {
full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "dated forms",
bare_category_parent = "สถานที่",
},
["DATED_FORM_OF place"] = {
link = false,
default = {"Dated forms of places"},
},
---------- Derogatory names ----------
["derogatory names for cities!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["derogatory names for continents!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for continents"},
},
["derogatory names for countries!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for countries"},
},
["derogatory names for places!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s",
bare_category_breadcrumb = "derogatory names",
bare_category_parent = "nicknames for places",
},
["derogatory names for states!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for states"},
},
["DEROGATORY_NAME_FOR capital"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR city"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR continent"] = {
link = false,
default = {"Derogatory names for continents"},
},
["DEROGATORY_NAME_FOR country"] = {
link = false,
default = {"Derogatory names for countries"},
},
["DEROGATORY_NAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR place"] = {
link = false,
default = {"Derogatory names for places"},
},
["DEROGATORY_NAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR state"] = {
link = false,
default = {"Derogatory names for states"},
},
["DEROGATORY_NAME_FOR town"] = {
link = false,
default = {"Derogatory names for cities"},
},
---------- Ellipses ----------
["ellipses of places!"] = {
full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s",
bare_category_breadcrumb = "ellipses",
bare_category_parent = "สถานที่",
},
["ELLIPSIS_OF place"] = {
link = false,
default = {"Ellipses of places"},
},
---------- Former long-form names ----------
["former long-form names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former long-form names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}},
},
["former long-form names of places!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form",
bare_category_parent = "former names of places",
},
["FORMER_LONG_FORM_OF country"] = {
link = false,
default = {"Former long-form names of countries"},
},
["FORMER_LONG_FORM_OF place"] = {
link = false,
default = {"Former long-form names of places"},
},
---------- Former names ----------
["former names of capitals!"] = {
full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name",
bare_category_breadcrumb = "capitals",
bare_category_parent = "former names of settlements",
},
["former names of countries!"] = {
full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former names of places",
},
["former names of places!"] = {
full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name",
bare_category_breadcrumb = "former names",
bare_category_parent = "สถานที่",
},
["former names of political divisions!"] = {
full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) that generally still exist but under a different name",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former names of places",
},
["former names of polities!"] = {
full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former names of places",
},
["former names of settlements!"] = {
full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former names of political divisions",
},
["FORMER_NAME_OF capital"] = {
link = false,
default = {"Former names of capitals"},
},
["FORMER_NAME_OF country"] = {
link = false,
default = {"Former names of countries"},
},
["FORMER_NAME_OF place"] = {
link = false,
default = {"Former names of places"},
},
["FORMER_NAME_OF polity"] = {
link = false,
default = {"Former names of polities"},
},
["FORMER_NAME_OF region"] = {
link = false,
fallback = "FORMER_NAME_OF subpolity",
},
["FORMER_NAME_OF settlement"] = {
link = false,
default = {"Former names of settlements"},
},
["FORMER_NAME_OF subpolity"] = {
link = false,
default = {"Former names of political divisions"},
},
---------- Former nicknames ----------
["former nicknames for cities!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})",
bare_category_breadcrumb = "นคร",
bare_category_parent = "former nicknames for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["former nicknames for places!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "former",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}},
},
["FORMER_NICKNAME_FOR capital"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR city"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR place"] = {
link = false,
default = {"Former nicknames for places"},
},
["FORMER_NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR town"] = {
link = false,
default = {"Former nicknames for cities"},
},
---------- Former official names ----------
["former official names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former official names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "official"}},
},
["former official names of places!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "former names of places",
},
["FORMER_OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Former official names of countries"},
},
["FORMER_OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Former official names of places"},
},
---------- Long-form names ----------
["long-form names of countries!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "long-form names of places",
},
["long-form names of places!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form names",
bare_category_parent = "สถานที่",
},
["LONG_FORM_OF country"] = {
link = false,
default = {"Long-form names of countries"},
},
["LONG_FORM_OF place"] = {
link = false,
default = {"Long-form names of places"},
},
---------- Nicknames ----------
["nicknames for cities!"] = {
full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"นคร"},
},
["nicknames for continents!"] = {
full_category_link = "[[nickname]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ทวีป"},
},
["nicknames for countries!"] = {
full_category_link = "[[nickname]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ประเทศ"},
},
["nicknames for places!"] = {
full_category_link = "[[nickname]]s for [[place]]s",
bare_category_breadcrumb = "สถานที่",
bare_category_parent = "nicknames",
addl_bare_category_parents = {"สถานที่"},
},
["nicknames for states!"] = {
-- For categorizing nicknames for states of e.g. the United States
full_category_link = "[[nickname]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"รัฐ"},
},
["NICKNAME_FOR capital"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR city"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR continent"] = {
link = false,
default = {"Nicknames for continents"},
},
["NICKNAME_FOR country"] = {
link = false,
default = {"Nicknames for countries"},
},
["NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR place"] = {
link = false,
default = {"Nicknames for places"},
},
["NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR state"] = {
link = false,
default = {"Nicknames for states"},
},
["NICKNAME_FOR town"] = {
link = false,
default = {"Nicknames for cities"},
},
---------- Obsolete forms ----------
["obsolete forms of places!"] = {
full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "obsolete forms",
bare_category_parent = "สถานที่",
},
["OBSOLETE_FORM_OF place"] = {
link = false,
default = {"Obsolete forms of places"},
},
---------- Official names ----------
["official names of countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of places",
},
["official names of former countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of former places",
},
["official names of former places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "official names",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "official names of places", sort = "former"}},
},
["official names of places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official names",
bare_category_parent = "สถานที่",
},
["OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Official names of countries"},
},
["OFFICIAL_NAME_OF FORMER country"] = {
link = false,
default = {"Official names of former countries"},
},
["OFFICIAL_NAME_OF FORMER place"] = {
link = false,
default = {"Official names of former places"},
},
["OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Official names of places"},
},
---------- Official nicknames ----------
["official nicknames for places!"] = {
full_category_link = "[[official]] [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for places",
},
["official nicknames for states!"] = {
-- For categorizing official nicknames for states of e.g. the United States
full_category_link = "[[official]] [[nickname]]s for [[state]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for states",
addl_bare_category_parents = {"รัฐ"},
},
["OFFICIAL_NICKNAME_FOR place"] = {
link = false,
default = {"Official nicknames for places"},
},
["OFFICIAL_NICKNAME_FOR state"] = {
link = false,
default = {"Official nicknames for states"},
},
}
export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
if spec.plural then
export.plural_placetype_to_singular[spec.plural] = sg_placetype
end
end
return export
3v98feuk221e1wze9d2owir8ih5i7wo
5720700
5720699
2026-04-21T01:48:04Z
OctraBot
3198
5720700
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true for testing
local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split
--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must
currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising"
operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
containers of the known location being considered. This is done to prevent overcategorizing when either there are two
known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).
Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a
result are candidates for memoization to speed up the operation of {{tl|place}}.
]==]
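--[=[
The memoization idea mentioned in the intro can be sketched roughly as follows. This is only an illustration under
stated assumptions: `memoize`, `slow_lookup` and `fast_lookup` are hypothetical names that do not exist in this module,
the cached function takes a single string argument, and callers treat the cached result as read-only. The real
functions also take a `props` table, which would have to be folded into the cache key.

```lua
-- Hypothetical helper: cache the result of a one-argument function by its argument.
local function memoize(fn)
    local cache = {}
    return function(key)
        local hit = cache[key]
        if hit == nil then
            -- nil results are not cached, so they are recomputed on every call.
            hit = fn(key)
            cache[key] = hit
        end
        return hit
    end
end

local calls = 0

-- Stand-in for an expensive lookup; counts how often it really runs.
local function slow_lookup(placetype)
    calls = calls + 1
    return { placetype = placetype }
end

local fast_lookup = memoize(slow_lookup)
local first = fast_lookup("city")
local second = fast_lookup("city") -- served from the cache
```

Because the same table is handed back on every hit, a caller that mutates the result would corrupt the cache; a real
implementation would need to copy or freeze the value.
]=]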
------------------------------------------------------------------------------------------
-- Basic utilities --
------------------------------------------------------------------------------------------
--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
return export.force_cat or m_locations.force_cat
end
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
function export.remove_links_and_html(text)
text = m_links.remove_links(text)
return (text:gsub("<.->", "")) -- parentheses drop gsub's second return value (the match count)
end
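--[=[
A small sketch of the tag-stripping step, for illustration only. `strip_html` is a hypothetical stand-in that omits
the link removal done by [[Module:links]]. It also shows a Lua detail worth keeping in mind: `string.gsub` returns two
values (the new string and the number of substitutions), and wrapping the call in parentheses keeps only the first.

```lua
-- Hypothetical stand-in: remove anything that looks like an HTML tag.
local function strip_html(text)
    -- "<.->" matches the shortest run from "<" to the next ">".
    -- The outer parentheses discard the substitution count.
    return (text:gsub("<.->", ""))
end

-- Collect all return values to see how many there are.
local wrapped = { strip_html("<b>small</b> town") }
local unwrapped = { ("<b>small</b> town"):gsub("<.->", "") }
```

Here `wrapped` holds a single value while `unwrapped` also holds the count, which matters when the result is passed
directly as the last argument of another call.
]=]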
--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in
[[Module:en-utilities]] is meant to be invoked, which reverses the y -> ies change after vowels and the 'es' addition
after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes';
FIXME: consider changing this for words ending in '-sses'). NOTE: That call is currently commented out below, so only
plurals listed in `placetype_data` are recognized. If the generated singular is the same as the passed-in value, nil is
returned.
]==]
function export.maybe_singularize_placetype(placetype)
if not placetype then
return nil
end
if export.plural_placetype_to_singular[placetype] then
return export.plural_placetype_to_singular[placetype]
end
local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
if retval == placetype then
return nil
end
return retval
end
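--[=[
The table-driven part of the lookup above can be sketched in isolation. This is a reduced illustration, not the
module's code: `placetype_data` here is a hypothetical two-entry stand-in, `plural_to_singular` and
`maybe_singularize` are illustrative names, and the regular (suffix-based) singularization is left out entirely.

```lua
-- Hypothetical, much-reduced stand-in for `placetype_data`.
local placetype_data = {
    kibbutz = { plural = "kibbutzim" }, -- irregular plural, listed explicitly
    city = {},                          -- no explicit plural
}

-- Build the reverse (plural -> singular) table once, up front.
local plural_to_singular = {}
for sg, spec in pairs(placetype_data) do
    if spec.plural then
        plural_to_singular[spec.plural] = sg
    end
end

-- Return the singular for a listed plural, or nil for anything else.
local function maybe_singularize(placetype)
    if not placetype then
        return nil
    end
    return plural_to_singular[placetype]
end
```

Building the reverse table once keeps each later lookup to a single table access.
]=]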
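-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct. NOTE: The pluralize() call is currently commented out below, so a placetype without a `plural`
-- entry is returned unchanged.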
function export.pluralize_placetype(placetype, do_ucfirst)
local ptdata = export.placetype_data[placetype]
if ptdata and ptdata.plural then
placetype = ptdata.plural
else
placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
end
if do_ucfirst then
return ucfirst(placetype)
else
return placetype
end
end
--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype
under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not
match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match
that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same
as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed
from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version
of the plural passed-in `placetype`.
]==]
function export.get_placetype_data(placetype, from_category)
local ptdata = export.placetype_data[placetype]
if ptdata then
return placetype, ptdata, "direct"
end
if from_category then
ptdata = export.placetype_data[placetype .. "!"]
if ptdata then
return placetype .. "!", ptdata, "direct-category"
end
end
local sg_placetype = export.maybe_singularize_placetype(placetype)
if sg_placetype then
ptdata = export.placetype_data[sg_placetype]
if ptdata then
return sg_placetype, ptdata, "plural"
end
end
return nil
end
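--[=[
The three kinds of match described above (`"direct"`, `"direct-category"`, `"plural"`) can be illustrated with a
reduced sketch. Everything here is an assumption made for the example: the data tables are tiny hypothetical
stand-ins, and `lookup` is an illustrative name rather than a function of this module.

```lua
-- Hypothetical stand-ins for `placetype_data` and the plural -> singular table.
local placetype_data = {
    ["city"] = {},
    ["official names of places!"] = {}, -- category-only key, marked by the trailing "!"
    ["kibbutz"] = { plural = "kibbutzim" },
}
local plural_to_singular = { kibbutzim = "kibbutz" }

-- Return the lookup key, the data, and the kind of match; nil if nothing matches.
local function lookup(placetype, from_category)
    local data = placetype_data[placetype]
    if data then
        return placetype, data, "direct"
    end
    if from_category then
        data = placetype_data[placetype .. "!"]
        if data then
            return placetype .. "!", data, "direct-category"
        end
    end
    local sg = plural_to_singular[placetype]
    if sg and placetype_data[sg] then
        return sg, placetype_data[sg], "plural"
    end
    return nil
end
```

Note that the category-only key is reachable only when `from_category` is set, so the same string can fail to match
in one calling context and succeed in another.
]=]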
--[==[
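Check for special pseudo-placetypes that should be ignored for categorization purposes: the conjunctions `and` and `or`
(plus their Thai equivalents `และ` and `หรือ`) and anything beginning with an open parenthesis.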
]==]
function export.placetype_is_ignorable(placetype)
return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end
function export.resolve_placetype_aliases(placetype)
return export.placetype_aliases[placetype] or placetype
end
--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
-- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in
-- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice.
placetype = export.resolve_placetype_aliases(placetype)
if export.placetype_data[placetype] then
return export.placetype_data[placetype][key]
else
return nil
end
end
--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from
left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization
of qualifiers does not happen if `no_canon_qualifiers` is specified.
For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
{nil, nil, "small beachside unincorporated community"},
{nil, "small", "beachside unincorporated community"},
{"small", "[[beachfront]]", "unincorporated community"},
{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
On the other hand, if given `"small former haunted community"`, the return value will be
{ {
{nil, nil, "small former haunted community"},
{nil, "small", "former haunted community"},
{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.
Finally, if given `"former adr"`, the return value will be
{ {
{nil, nil, "former adr"},
{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
local prev_qualifier = nil
while true do
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if canon == nil then
break
end
local new_qualifier = qualifier
if type(canon) == "table" then
canon = canon.link
end
if not no_canon_qualifiers and canon ~= false then
if canon == true then
new_qualifier = "[[" .. qualifier .. "]]"
else
new_qualifier = canon
end
end
insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
placetype = reduced_placetype
else
break
end
end
return splits
end
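--[=[
The splitting procedure documented above can be tried out with a self-contained sketch. The `qualifiers` and `aliases`
tables are hypothetical, reduced stand-ins for `placetype_qualifiers` and `placetype_aliases` (the values chosen for
each qualifier are assumptions that reproduce the documented examples), and `split_qualifiers` is an illustrative
name. The table-valued qualifier entries and the `no_canon_qualifiers` flag are left out.

```lua
-- Hypothetical qualifier table: false = recognized but left unlinked,
-- true = link the qualifier to itself, string = canonical replacement.
local qualifiers = {
    small = false,
    beachside = "[[beachfront]]",
    unincorporated = true,
    former = false,
}
local aliases = { adr = "administrative region" }

local function split_qualifiers(placetype)
    -- The first split always holds the whole placetype with no qualifier.
    local splits = { { nil, nil, aliases[placetype] or placetype } }
    local prev
    while true do
        local qualifier, rest = placetype:match("^(.-) (.*)$")
        if not qualifier then
            break
        end
        local canon = qualifiers[qualifier]
        if canon == nil then
            break -- unrecognized qualifier: stop splitting
        end
        local shown = qualifier
        if canon == true then
            shown = "[[" .. qualifier .. "]]"
        elseif canon then
            shown = canon
        end
        splits[#splits + 1] = { prev, shown, aliases[rest] or rest }
        prev = prev and (prev .. " " .. shown) or shown
        placetype = rest
    end
    return splits
end

local result = split_qualifiers("small beachside unincorporated community")
local adr = split_qualifiers("former adr")
```

Each split is read by index (1 to 3) rather than by length, because the leading elements may be nil.
]=]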
--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the
placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list
of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e. we successively chop off
qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words
not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used
to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is
an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first
entry if it exists in `placetype_data`.
'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c)
"type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of
directives, which act somewhat similarly to `former`, and allows interaction between more than one of these
simultaneously (e.g. official names of former places, which have their own categorization).
If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be
getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of
`iterate_matching_holonym_location()`.
For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="แม่น้ำ"}
```
and the return value will be
{ {
{qualifier = "left", placetype="tributary"},
{qualifier = "left", placetype="แม่น้ำ"},
}}
The algorithm first checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so
it isn't entered into the returned list (had it been found, it would be added, along with any fallbacks directly after
it). It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered
into the list because it is found in `placetype_data`. Then, because `tributary` has a fallback `แม่น้ำ` ("river"),
which exists in `placetype_data`, the fallback is entered next.
Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or
municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
{qualifier = "small [[rural]]", placetype="frazione"},
{qualifier = "small [[rural]]", placetype="hamlet"},
{qualifier = "small [[rural]]", placetype="village"},
}}
Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers,
they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around
`rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both
fallbacks end up being returned.
`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the
placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are
returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example,
`"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When
`no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"`
with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under
[[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].)
As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`,
because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes
rarely occur with exact match category specs anyway.
`no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an
equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in
[[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't
have qualifiers and so it doesn't make sense to try and look for them.
`from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked.
`form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked
placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the
appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a
placetype, e.g. `FORMER_NAME_OF place`, if nothing else matches.
`no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`.
`register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g.
known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It
causes the non-former version of the specified placetype to be included in the returned equivalents along with the
former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now;
fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.]
]==]
function export.get_placetype_equivs(placetype, props)
local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former
local form_of_directive
if props then
no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former =
props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category,
props.register_former_as_non_former
form_of_directive = props.form_of_directive
end
local equivs = {}
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is
-- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If
-- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by
-- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version
-- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as
-- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into
-- `equivs` the prefixed placetype. This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}},
-- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up
-- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]].
local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix)
local function insert_equiv(pt)
if form_of_prefix then
-- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have
-- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end
-- up processing because `island country` falls back to `country`), and that entry in turn is defined
-- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of
-- handling this is by calling ourselves recursively.
insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt)
else
insert(equivs, {qualifier=qualifier, placetype=pt})
end
end
-- Insert the placetype, along with any fallbacks.
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if ptdata then
insert_equiv(canon_placetype)
if no_fallback then
return
end
local first_placetype = #equivs + 1
local prev_placetype = nil
while true do
local pt_value = export.placetype_data[canon_placetype]
if not pt_value then
internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
canon_placetype, prev_placetype)
end
if pt_value.fallback then
insert_equiv(pt_value.fallback)
local last_placetype = #equivs
if last_placetype - first_placetype >= 10 then
local fallback_loop = {}
for i = first_placetype, last_placetype do
insert(fallback_loop, equivs[i].placetype)
end
internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
end
prev_placetype = canon_placetype
canon_placetype = pt_value.fallback
else
break
end
end
end
end
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
-- than in each invocation of `process_and_insert_placetype()`.
local function process_and_insert_placetype(qualifier, reduced_placetype)
if form_of_directive then
-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. All of this is handled by
-- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of
-- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for
-- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.)
insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive)
if not no_fallback then
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype)
local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or
export.get_placetype_prop(pt, "class") end
)
if not directive_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " ..
'located but directive-specific type property %s missing, and so is "class"; ' ..
"placetypes searched are %s", reduced_placetype, form_of_directive,
form_of_directive .. "_type", reduced_placetype_equivs)
else
-- This should be allowed, as we allow unrecognized placetypes in general.
end
elseif directive_type ~= "!" then
insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive)
end
end
else
insert_placetype_and_fallbacks(qualifier, reduced_placetype)
end
end
-- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left
-- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers).
local splits
if no_split_qualifiers then
splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
else
splits = export.split_qualifiers_from_placetype(placetype)
end
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3)
-- If a special "former" qualifier like `former` or `historical` isn't present, and
-- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for
-- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing
-- placetypes, and handle accordingly.
local unlinked_this_qualifier
if this_qualifier and this_qualifier:find("%[") then
unlinked_this_qualifier = export.remove_links_and_html(this_qualifier)
else
unlinked_this_qualifier = this_qualifier
end
local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil
if not former_qualifiers and not no_check_for_inherently_former then
former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype,
function(pt) return export.get_placetype_prop(pt, "inherently_former") end,
{no_check_for_inherently_former = true})
end
-- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal
-- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
-- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped
-- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval`
-- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes
-- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. If the placetype following the
-- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a
-- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like
-- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and
-- don't enter anything into `equivs`.
if former_qualifiers then
-- FIXME: Should we respect `no_fallback` here? My instinct says no.
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, {
no_check_for_inherently_former = true
})
local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, "former_type") or
export.get_placetype_prop(pt, "class") end
)
if not former_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s, placetype data located but `former_type` missing; " ..
"placetypes searched are %s", reduced_placetype, reduced_placetype_equivs)
else
-- Enable error when we've verified there aren't any examples.
track("bad-former-placetype")
track("bad-former-placetype/" .. reduced_placetype)
--process_error("For placetype '%s', unrecognized placetype following 'former'-type " ..
-- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs))
end
elseif former_type ~= "!" then
-- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible
-- for (e.g.) former provinces of the Roman empire to be categorized specially.
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype)
end
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. former_type)
end
-- HACK! See explanation above for `register_former_as_non_former`.
if register_former_as_non_former then
process_and_insert_placetype(prev_qualifier, reduced_placetype)
end
-- If we're processing a form-of directive, after doing everything else we do
-- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup.
if form_of_directive and not no_fallback then
for _, former_qualifier in ipairs(former_qualifiers) do
insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier ..
" place")
end
end
-- Don't continue processing equivs. The reason is probably the same as the `break` below for
-- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and
-- non-former equivs will otherwise take precedence.
break
end
end
-- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs
-- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping.
if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then
insert(equivs, {
qualifier=prev_qualifier,
placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier]
})
-- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the
-- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the
-- latter ends up generating the category because the category for 'mythological location' is set as
-- the default value, which is used only when no non-default category can be found.
break
end
-- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined
-- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype.
-- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts
-- the full placetype into `equivs`.
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
process_and_insert_placetype(qualifier, reduced_placetype)
-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
if no_fallback then
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
if canon_placetype then
break
end
end
end
-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
-- [[Category:en:Former names of places]] in an invocation like
-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
end
return equivs
end
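--[=[
The fallback handling inside the function above boils down to walking a chain of `fallback` links with a guard against
cycles. A reduced sketch, under stated assumptions: `placetype_data` here is a hypothetical three-entry table mirroring
the `frazione` example, `fallback_chain` and `MAX_CHAIN` are illustrative names, and the cap is an arbitrary value in
the spirit of the loop check above rather than a copy of it. Qualifiers, form-of directives and `former`-type handling
are all omitted.

```lua
-- Hypothetical data mirroring the frazione -> hamlet -> village example.
local placetype_data = {
    frazione = { fallback = "hamlet" },
    hamlet = { fallback = "village" },
    village = {},
}

local MAX_CHAIN = 10 -- arbitrary cap used to detect a probable cycle

-- Return the placetype followed by its fallbacks, in lookup order.
local function fallback_chain(placetype)
    local chain = {}
    if not placetype_data[placetype] then
        return chain -- unrecognized placetypes simply yield nothing
    end
    local current = placetype
    while current do
        local spec = placetype_data[current]
        if not spec then
            error(("fallback %q is not in placetype_data"):format(current))
        end
        chain[#chain + 1] = current
        if #chain > MAX_CHAIN then
            error("apparent loop in fallback chain: " .. table.concat(chain, " -> "))
        end
        current = spec.fallback
    end
    return chain
end

local chain = fallback_chain("frazione")
```

A length cap is a cheap way to catch cycles; tracking visited placetypes in a set would detect them exactly, at the
cost of a little more bookkeeping.
]=]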
function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
for _, equiv in ipairs(equivs) do
local retval = fun(equiv.placetype)
if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
return retval, equiv
end
end
return nil, nil
end
--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a
non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the
equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or
non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil},
the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value.
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
if not placetype then
return fun(nil), nil
end
return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
props and props.continue_on_nil_only)
end
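--[=[
Illustrative sketch only, not part of the module's API. It shows how the default "stop on first non-falsy value"
behavior differs from `continue_on_nil_only`. The helper `first_prop` is a hypothetical stand-in that mirrors the loop
in `get_equiv_placetype_prop_from_equivs`; the `equivs` list and `props` table are made-up sample data.

```lua
-- Hypothetical stand-in mirroring the loop in get_equiv_placetype_prop_from_equivs.
local function first_prop(equivs, fun, continue_on_nil_only)
	for _, equiv in ipairs(equivs) do
		local retval = fun(equiv.placetype)
		if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
			return retval, equiv
		end
	end
	return nil, nil
end

-- Made-up sample data: the first equivalent maps to `false`, the second to a string.
local equivs = {{placetype = "large city"}, {placetype = "city"}}
local props = {["large city"] = false, ["city"] = "the"}
local function lookup(pt)
	return props[pt]
end

-- Default mode skips the `false` and keeps going until a truthy value is found.
local v1, e1 = first_prop(equivs, lookup)
-- With continue_on_nil_only, `false` counts as an answer and stops the search.
local v2, e2 = first_prop(equivs, lookup, true)
```

The second mode is what lets a property value of `false` act as an explicit override rather than as "not set".
]=]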
--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
`"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
the beginning) has an entry in `placetype_data` that specifies the indefinite article using
`entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, the standard algorithm in [[Module:en-utilities]] would generate `"an"` for words beginning with a vowel
and `"a"` otherwise; that call is currently commented out in the code below, so an empty string is used instead.
If `ucfirst` is true, the first letter of the article is made upper-case.
]==]
function export.get_placetype_article(placetype, ucfirst)
local art
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if type(canon) == "table" then
art = canon.article
end
end
if art == false then
return art
end
if art == nil then
local placetype_use_the = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end)
if placetype_use_the then
art = "the"
else
art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article")
if not art then
art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] ""
end
end
end
if ucfirst then
art = m_strutils.ucfirst(art)
end
return art
end
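--[=[
Illustrative sketch only: a much-simplified version of the lookup order documented for `get_placetype_article`. All
names here (`qualifiers`, `use_the`, `indefinite_article`, `article_for`) and the sample data are hypothetical.
Equivalent placetypes and the `entry_placetype_indefinite_article` property are left out, and the vowel-based fallback
is a crude stand-in for the English algorithm, which the function above does not currently call.

```lua
-- Hypothetical sample data; the real tables are export.placetype_qualifiers and placetype_data.
local qualifiers = {
	various = {link = false, article = false},
	largest = {link = false, article = "the"},
	large = false,
}
local use_the = {capital = true}

-- Crude stand-in for the vowel-based indefinite article algorithm.
local function indefinite_article(word)
	return word:find("^[aeiou]") and "an" or "a"
end

local function article_for(placetype)
	local qualifier = placetype:match("^(.-) .*$")
	local canon = qualifier and qualifiers[qualifier]
	if type(canon) == "table" and canon.article ~= nil then
		-- A qualifier that specifies an article wins, including `false` for "no article".
		return canon.article
	end
	if use_the[placetype] then
		-- The placetype itself is flagged as taking "the".
		return "the"
	end
	-- Final fallback: pick "a" or "an" from the first letter.
	return indefinite_article(placetype)
end
```

Note that the qualifier check has to distinguish `false` from {nil}, which is why it compares against {nil} explicitly.
]=]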
--[==[
Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories
(e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified.
]==]
function export.get_placetype_entry_preposition(placetype)
local pt_prep = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "preposition") end
)
return pt_prep or "ใน"
end
--[==[
Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's
`holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding
to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's
`holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the
end of the value's list.
]==]
function export.key_holonym_into_place_desc(place_desc, holonym)
if not holonym.placetype then
return
end
-- Key in equivalent placetypes, so that e.g. `cities/San Francisco` gets keyed under `city`; but don't do
-- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms
-- of different types just because they have the same fallback.
local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true})
local unlinked_placename = holonym.unlinked_placename
for _, equiv in ipairs(equiv_placetypes) do
local placetype = equiv.placetype
if not place_desc.holonyms_by_placetype then
place_desc.holonyms_by_placetype = {}
end
if not place_desc.holonyms_by_placetype[placetype] then
place_desc.holonyms_by_placetype[placetype] = {unlinked_placename}
else
insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename)
end
end
end
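--[=[
Illustrative sketch only: the shape of `holonyms_by_placetype` after keying two holonyms of the same placetype. The
helper `key_holonym` is a hypothetical simplification that skips the equivalent-placetype expansion done by
`key_holonym_into_place_desc`, and the English placetype string is sample data.

```lua
-- Hypothetical simplification: key one placename under one placetype.
local function key_holonym(place_desc, placetype, placename)
	place_desc.holonyms_by_placetype = place_desc.holonyms_by_placetype or {}
	local list = place_desc.holonyms_by_placetype[placetype]
	if list then
		-- A key already exists, so append to the end of its list.
		table.insert(list, placename)
	else
		place_desc.holonyms_by_placetype[placetype] = {placename}
	end
end

local desc = {}
key_holonym(desc, "country", "Italy")
key_holonym(desc, "country", "France")
-- desc.holonyms_by_placetype is now {country = {"Italy", "France"}}
```
]=]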
--[=[
Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the
placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This
will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to
whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype
data structure for the placetype, and `from_category` indicates that we are generating the description of a category
(otherwise we are generating the display form of an entry placetype).
]=]
local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror)
if not from_category and ptdata.disallow_in_entries then
if noerror then
return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]"
else
process_error("Placetype %s is not meant to be specified directly: " .. ptdata.disallow_in_entries, sg_placetype)
end
end
if link == nil then
internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype)
elseif link == true then
if orig_placetype then
return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
else
return ("[[%s]]"):format(sg_placetype)
end
elseif link == false then
process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype)
elseif link == "w" then
return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
elseif link == "separately" then
if orig_placetype then
local sg_words = split(sg_placetype, " ")
local orig_words = split(orig_placetype, " ")
if #sg_words ~= #orig_words then
internal_error("Can't construct 'separately' link for plural placetype %s as singular placetype %s " ..
"has a different number of words", orig_placetype, sg_placetype)
else
for i = 1, #sg_words do
if sg_words[i] == orig_words[i] then
sg_words[i] = ("[[%s]]"):format(sg_words[i])
else
sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
end
end
return concat(sg_words, " ")
end
else
return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
end
elseif link:find("^%+") then
link = link:sub(2) -- discard initial +
return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype)
elseif not orig_placetype then
return link
else
return --[[require(en_utilities_module).pluralize(link)]] link
end
end
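--[=[
Illustrative sketch only: how the `"separately"` link mode pairs up singular and plural words. The function
`link_separately` is a hypothetical standalone copy of that branch of `make_placetype_link`, using plain Lua string
functions in place of the module's `split` and `concat` helpers; the placetype strings are sample data.

```lua
-- Hypothetical standalone version of the "separately" branch.
local function link_separately(sg_placetype, orig_placetype)
	local sg_words, orig_words = {}, {}
	for word in sg_placetype:gmatch("[^ ]+") do
		sg_words[#sg_words + 1] = word
	end
	for word in orig_placetype:gmatch("[^ ]+") do
		orig_words[#orig_words + 1] = word
	end
	assert(#sg_words == #orig_words, "singular and plural must have the same number of words")
	for i = 1, #sg_words do
		if sg_words[i] == orig_words[i] then
			-- Unchanged word: plain link.
			sg_words[i] = ("[[%s]]"):format(sg_words[i])
		else
			-- Pluralized word: link to the singular, display the plural.
			sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
		end
	end
	return table.concat(sg_words, " ")
end

local linked = link_separately("fishing village", "fishing villages")
-- linked == "[[fishing]] [[village|villages]]"
```
]=]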
--[==[
Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the
plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying
as plural but linked to the singular form of the placetype). Otherwise, return nil. If we're generating the description
of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like
[[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or
`"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description
for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with
special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the
"full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is
prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be
disallowed.
]==]
function export.get_placetype_display_form(placetype, category_type, return_full, noerror)
local from_category = not not category_type
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if canon_placetype then
local raw_link
local function is_linked_string(str)
return type(str) == "string" and str:find("%[%[")
end
if category_type then
local fetched_full
local function fetch_maybe_full(prop)
local retval = ptdata["full_" .. prop]
if retval ~= nil then
if return_full then
return retval, true
else
internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval)
end
end
return ptdata[prop], false
end
local function maybe_prefix(str)
if return_full and not fetched_full then
return "names of " .. str
else
return str
end
end
-- Careful with `false` as possible value.
if category_type == "top-level" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
elseif category_type == "noncity" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
elseif category_type == "city" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
else
internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
category_type)
end
if type(raw_link) == "string" then
return maybe_prefix(raw_link), ptdata
elseif raw_link ~= nil then
return raw_link, ptdata
end
raw_link, fetched_full = fetch_maybe_full("category_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
if ptmatch == "plural" then
raw_link, fetched_full = fetch_maybe_full("plural_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
end
if raw_link == nil then
raw_link, fetched_full = fetch_maybe_full("link")
end
if raw_link == false then
return raw_link, ptdata
end
return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
else
if ptmatch == "plural" then
raw_link = ptdata.plural_link
if raw_link == false then
process_error("Placetype %s cannot appear plural", placetype)
end
if is_linked_string(raw_link) then
return raw_link, ptdata
end
end
if raw_link == nil then
raw_link = ptdata.link
end
return make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata
end
end
return nil
end
local function resolve_unlinked_placename_display_aliases(placetype, placename)
local equiv_placetypes = export.get_placetype_equivs(placetype)
for i, equiv in ipairs(equiv_placetypes) do
equiv_placetypes[i] = equiv.placetype
end
local all_display_aliases_found = {}
local all_others_found = {}
for group, key, spec in m_locations.iterate_matching_location {
placetypes = equiv_placetypes,
placename = placename,
alias_resolution = "display",
} do
if spec.alias_of and spec.display then
insert(all_display_aliases_found, {group, key, spec, spec.display_as_full})
else
insert(all_others_found, {group, key, spec})
end
end
if not all_display_aliases_found[1] then
return placename
elseif all_display_aliases_found[2] then
internal_error("Found multiple matching display aliases for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
elseif all_others_found[1] then
internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
else
local group, key, spec, as_full = unpack(all_display_aliases_found[1])
local full, elliptical = m_locations.key_to_placename(group, key)
return as_full and full or elliptical
end
end
--[==[
If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged.
Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`,
`country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as
`United States`.
'''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they
should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed.
For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to
`Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political
connotations. Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!)
to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two
terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to
`North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly
display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve
alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and
consistency.
]==]
function export.resolve_placename_display_aliases(placetype, placename)
-- If the placename is a link, apply the alias inside the link.
-- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will
-- be empty.
local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
if link then
if linktext ~= "" then
local alias = resolve_unlinked_placename_display_aliases(placetype, linktext)
return "[[" .. link .. "|" .. alias .. "]]"
else
local alias = resolve_unlinked_placename_display_aliases(placetype, link)
return "[[" .. alias .. "]]"
end
else
return resolve_unlinked_placename_display_aliases(placetype, placename)
end
end
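--[=[
Illustrative sketch only: how a display alias is applied inside piped and unpiped links. The `aliases` table and the
helpers `resolve_unlinked` and `resolve` are hypothetical; the real lookup goes through
`m_locations.iterate_matching_location`, and the real code uses `rmatch` rather than `string.match`.

```lua
-- Hypothetical alias table standing in for the location data.
local aliases = {US = "United States", USA = "United States"}

local function resolve_unlinked(name)
	return aliases[name] or name
end

local function resolve(placename)
	-- Same pattern as above: the second capture is empty for unpiped links.
	local link, linktext = placename:match("^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
	if not link then
		return resolve_unlinked(placename)
	end
	if linktext ~= "" then
		-- Piped link: keep the target, normalize only the displayed text.
		return "[[" .. link .. "|" .. resolve_unlinked(linktext) .. "]]"
	end
	-- Unpiped link: the target is also the displayed text.
	return "[[" .. resolve_unlinked(link) .. "]]"
end
```

In the piped case only the display text changes, so the link target chosen by the editor is preserved.
]=]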
--[==[
Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key.
]==]
function export.get_prefixed_key(key, spec)
if spec.the then
return "the " .. key
else
return key
end
end
-- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary.
export.iterate_matching_location = m_locations.iterate_matching_location
--[=[
Iterator that iterates over holonyms in `place_desc`. If `first_holonym_index` is given, start iterating at the
specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If
`first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is
specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified
by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the
holonym index and holonym structure, similar to `ipairs()`.
]=]
function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
local stop_at_also = not not first_holonym_index
return function(place_desc, index)
while true do
index = index + 1
local this_holonym = place_desc.holonyms[index]
-- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also`
-- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym
-- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with
-- `:also`.
if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then
return nil
end
-- If not placetype, we're processing raw text, which we normally want to skip.
if include_raw_text_holonyms or this_holonym.placetype then
return index, this_holonym
end
end
end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end
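--[=[
Illustrative sketch only: what the iterator yields when a starting index is given. `holonyms_to_check` is a
hypothetical standalone copy of `get_holonyms_to_check`, and the `desc` table is made-up sample data with one raw-text
holonym and one holonym carrying `continue_cat_loop` (i.e. marked `:also`).

```lua
-- Hypothetical standalone copy of the stateless iterator.
local function holonyms_to_check(place_desc, first_index, include_raw)
	local stop_at_also = first_index ~= nil
	return function(desc, index)
		while true do
			index = index + 1
			local holonym = desc.holonyms[index]
			if not holonym or stop_at_also and index > first_index and holonym.continue_cat_loop then
				return nil
			end
			if include_raw or holonym.placetype then
				return index, holonym
			end
		end
	end, place_desc, first_index and first_index - 1 or 0
end

-- Made-up sample data.
local desc = {holonyms = {
	{placetype = "city", placename = "Chicago"},
	{placename = "in"}, -- raw text: no placetype
	{placetype = "state", placename = "Illinois"},
	{placetype = "country", placename = "US", continue_cat_loop = true},
}}

local seen = {}
for i, holonym in holonyms_to_check(desc, 1) do
	seen[#seen + 1] = i .. ":" .. holonym.placename
end
-- seen == {"1:Chicago", "3:Illinois"}: raw text is skipped and iteration stops before the `:also` holonym.

local all = {}
for i in holonyms_to_check(desc) do
	all[#all + 1] = i
end
-- all == {1, 3, 4}: without a starting index, `:also` does not stop the loop.
```
]=]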
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all
such known locations, returning for each location the corresponding group, key and spec as well as the trail of
ancestral containers. Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between
the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data`
are:
* `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with
`iterate_matching_location()`.
* `holonym_placename`: The placename of the holonym.
* `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the
holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms
following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none
exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.)
* `place_desc`: Description of the place; used for the holonyms, to check for container mismatches.
Returns four values: the location group, the canonical key by which the location is known, the spec object describing
the location and the trail of ancestral containers for the location. The first three values are the same as for
`iterate_matching_location`.
]==]
function export.iterate_matching_holonym_location(data)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
local matching_location_iterator = m_locations.iterate_matching_location {
placetypes = holonym_placetype,
placename = holonym_placename,
}
return function()
while true do
local group, key, spec = matching_location_iterator()
if not group then
return nil
end
local container_trail = {}
-- For each level of container, check that there are no mismatches (i.e. another location of the same
-- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container
-- at that level. For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city
-- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of
-- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark,
-- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New
-- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough,
-- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If
-- there are no mismatches at any level we assume we're dealing with the right known location.
--
-- If at a given level there are multiple containing locations, we count a match if any holonym matches any
-- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any
-- containing location.
local containers_mismatch = false
for containers in m_locations.iterate_containers(group, key, spec) do
insert(container_trail, containers)
local match_at_level = false
local mismatch_at_level = false
for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc,
holonym_index and holonym_index + 1 or nil) do
local other_source_holonym = other_holonym.augmented_from_holonym
if other_source_holonym and other_source_holonym.placetype == holonym_placetype and
other_source_holonym.unlinked_placename ~= holonym_placename then
-- Ignore holonyms added during the augmentation process for other holonyms of the same
-- placetype as the placetype of the holonym we're considering. See comment in
-- augment_holonyms_with_container() for why we do this.
-- continue; grrr, no 'continue' in Lua
else
local holonym_matches_at_level = false
local holonym_exists_with_same_placetype = false
for _, container in ipairs(containers) do
if not container.spec.no_check_holonym_mismatch then
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
local placetypes = container.spec.placetype
if type(placetypes) ~= "table" then
placetypes = {placetypes}
end
local placetype_equivs = {}
for _, pt in ipairs(placetypes) do
m_table.extend(placetype_equivs, export.get_placetype_equivs(pt))
end
local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype and
(other_holonym.unlinked_placename == full_container_placename or
other_holonym.unlinked_placename == elliptical_container_placename)
end
)
if this_holonym_matches then
holonym_matches_at_level = true
break
end
local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype
end
)
if this_holonym_exists_with_same_placetype then
-- We seem to have a mismatch at this level. But before we decide conclusively that this
-- is the case, check to see whether the putative mismatch is an alias and matches when
-- we resolve the alias.
for oh_group, oh_key, oh_spec, oh_container_trail in
export.iterate_matching_holonym_location {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = place_desc,
} do
local oh_full_placename, oh_elliptical_placename =
m_locations.key_to_placename(oh_group, oh_key)
if oh_full_placename == full_container_placename or
oh_elliptical_placename == elliptical_container_placename then
-- Alias matched when resolved.
this_holonym_matches = true
break
end
end
if this_holonym_matches then
-- Alias matched above when resolved.
holonym_matches_at_level = true
break
else
-- Not an alias, or doesn't match when resolved. We have a true mismatch.
holonym_exists_with_same_placetype = true
end
end
end
end
if holonym_matches_at_level then
match_at_level = true
break
end
if holonym_exists_with_same_placetype then
mismatch_at_level = true
end
end
end
if not match_at_level and mismatch_at_level then
containers_mismatch = true
break
end
end
if not containers_mismatch then
return group, key, spec, container_trail
end
end
end
end
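--[=[
Illustrative sketch only: the match/mismatch rule described in the comments above, reduced to a single flat level per
placetype. `containers_match` and both data tables are hypothetical; aliases, equivalent placetypes, holonym
augmentation and multiple containers per level are all ignored here.

```lua
-- Hypothetical reduction: `containers` maps placetype -> expected container name.
local function containers_match(containers, holonyms)
	for placetype, container_name in pairs(containers) do
		local match, mismatch = false, false
		for _, holonym in ipairs(holonyms) do
			if holonym.placetype == placetype then
				if holonym.placename == container_name then
					match = true
				else
					mismatch = true
				end
			end
		end
		-- A mismatch is only fatal if nothing at this level matched.
		if mismatch and not match then
			return false
		end
	end
	return true
end

-- Kansas City is known as a city in Missouri; mentioning Kansas as well is tolerated.
local kansas_city_ok = containers_match(
	{state = "Missouri"},
	{{placetype = "state", placename = "Missouri"}, {placetype = "state", placename = "Kansas"}}
)

-- Newark, New Jersey must not be chosen for a place described as being in Delaware.
local newark_ok = containers_match(
	{state = "New Jersey", country = "US"},
	{{placetype = "state", placename = "Delaware"}, {placetype = "country", placename = "US"}}
)
```
]=]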
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the
corresponding group, key and spec as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
local all_found = {}
for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
insert(all_found, {group, key, spec, container_trail})
end
if not all_found[1] then
return nil
elseif all_found[2] then
local holonym_placetype = data.holonym_placetype
if type(holonym_placetype) == "table" then
holonym_placetype = concat(holonym_placetype, ",")
end
local found_keys = {}
for _, found in ipairs(all_found) do
local _, key, _, _ = unpack(found)
insert(found_keys, key)
end
error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys)))
else
return unpack(all_found[1])
end
end
------------------------------------------------------------------------------------------
-- Placename and placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their
canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which
applies to categorization and other processes but not to display.
The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ",
"จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g.
"census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype.
Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be
interpreted as "department", "อำเภอ" or "division").
]==]
export.placetype_aliases = {
["acomm"] = "autonomous community",
["adr"] = "administrative region",
["adterr"] = "administrative territory", -- Pakistan
["aobl"] = "autonomous oblast",
["aokr"] = "autonomous okrug",
["ap"] = "autonomous province",
["apref"] = "autonomous prefecture",
["aprov"] = "autonomous province",
["ar"] = "autonomous region",
["arch"] = "archipelago",
["arep"] = "autonomous republic",
["aterr"] = "autonomous territory",
["atu"] = "autonomous territorial unit",
["bor"] = "borough",
["c"] = "ประเทศ",
["can"] = "canton",
["carea"] = "council area",
["cc"] = "constituent country",
["cdblock"] = "community development block",
["cdep"] = "Crown dependency",
["CDP"] = "census-designated place",
["cdp"] = "census-designated place",
["clcity"] = "county-level city",
["co"] = "เทศมณฑล",
["cobor"] = "county borough",
["colcity"] = "county-level city",
["coll"] = "collectivity",
["comm"] = "community",
["cont"] = "ทวีป",
["contr"] = "continental region",
["contregion"] = "continental region",
["cpar"] = "civil parish",
["damun"] = "direct-administered municipality",
["dep"] = "dependency",
["department capital"] = "departmental capital",
["dept"] = "department",
["depterr"] = "dependent territory",
["dist"] = "อำเภอ",
["distmun"] = "district municipality",
["div"] = "division",
["emp"] = "จักรวรรดิ",
["fpref"] = "French prefecture",
["gov"] = "governorate",
["govnat"] = "governorate",
["home-rule city"] = "home rule city",
["home-rule municipality"] = "home rule municipality",
["inner-city area"] = "inner city area",
["ires"] = "Indian reservation",
["isl"] = "เกาะ",
["lbor"] = "London borough",
["lga"] = "local government area",
["lgarea"] = "local government area",
["lgd"] = "local government district",
["lgdist"] = "local government district",
["metbor"] = "metropolitan borough",
["metcity"] = "มหานคร",
["metmun"] = "metropolitan municipality",
["mtn"] = "ภูเขา",
["mun"] = "เทศบาล",
["mundist"] = "municipal district",
["nonmetropolitan county"] = "non-metropolitan county",
["obl"] = "oblast",
["okr"] = "okrug",
["p"] = "จังหวัด",
["par"] = "parish",
["parmun"] = "parish municipality",
["pen"] = "peninsula",
["plcity"] = "prefecture-level city",
["plcolony"] = "Polish colony",
["pref"] = "prefecture",
["prefcity"] = "prefecture-level city",
["preflcity"] = "prefecture-level city",
["prov"] = "จังหวัด",
["r"] = "ภูมิภาค",
["range"] = "เทือกเขา",
["rcm"] = "regional county municipality",
["rcomun"] = "regional county municipality",
["rdist"] = "regional district",
["rep"] = "republic",
["rhrom"] = "rural hromada",
["riv"] = "แม่น้ำ",
["rmun"] = "regional municipality",
["robor"] = "royal borough",
["romp"] = "Roman province",
["runit"] = "regional unit",
["rurmun"] = "rural municipality",
["s"] = "รัฐ",
["sar"] = "special administrative region",
["shrom"] = "settlement hromada",
["spref"] = "subprefecture",
["sprefcity"] = "sub-prefectural city",
["sprovcity"] = "subprovincial city",
["submet city"] = "sub-metropolitan city",
["submetropolitan city"] = "sub-metropolitan city",
["sub-prefecture-level city"] = "sub-prefectural city",
["sub-provincial city"] = "subprovincial city",
["sub-provincial district"] = "subprovincial district",
["terr"] = "ดินแดน",
["terrauth"] = "territorial authority",
["twp"] = "township",
["twpmun"] = "township municipality",
["uauth"] = "unitary authority",
["ucomm"] = "unincorporated community",
["udist"] = "unitary district",
["uhrom"] = "urban hromada",
["uterr"] = "union territory",
["utwpmun"] = "united township municipality",
["val"] = "valley",
["vdc"] = "village development committee",
["vil"] = "village",
["voi"] = "voivodeship",
["wcomm"] = "Welsh community",
}
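--[=[
Illustrative sketch only: how an abbreviated holonym placetype might be canonicalized through an alias table. The
helper `parse_holonym` and the three-entry `sample_aliases` excerpt are hypothetical; the real holonym parser also
handles modifiers and multiple placenames, which are ignored here.

```lua
-- Small excerpt in the same shape as export.placetype_aliases above.
local sample_aliases = {c = "ประเทศ", s = "รัฐ", twp = "township"}

-- Hypothetical helper: split `placetype/placename` and canonicalize the placetype.
local function parse_holonym(spec)
	local placetype, placename = spec:match("^([^/]+)/(.+)$")
	if not placetype then
		-- Raw text holonym: no placetype.
		return nil, spec
	end
	return sample_aliases[placetype] or placetype, placename
end

local pt1, name1 = parse_holonym("twp/Springfield") -- "township", "Springfield"
local pt2, name2 = parse_holonym("c/Italy")         -- "ประเทศ", "Italy"
local pt3, name3 = parse_holonym("city/Chicago")    -- no alias: "city", "Chicago"
```
]=]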
local no_link_def_article = {link = false, article = "the"}
local no_link_no_article = {link = false, article = false}
--[==[ var:
These qualifiers can be prepended onto any placetype and will be handled correctly. For example, the placetype
`large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the
value in the following table is a string, the qualifier will display according to the string. If the value is `true`,
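the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be
linked but will appear as-is. If the value is a table (e.g. `no_link_def_article`), its `link` field holds the link
setting and its `article` field gives the article read by `get_placetype_article`. Note that these qualifiers do not
override placetypes with entries elsewhere that contain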
those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating
`inland sea` as equivalent to `sea`.
]==]
export.placetype_qualifiers = {
-- generic qualifiers
["huge"] = false,
["tiny"] = false,
["large"] = false,
["big"] = false,
["mid-size"] = false,
["mid-sized"] = false,
["small"] = false,
["sizable"] = false,
["important"] = false,
["long"] = false,
["short"] = false,
["major"] = false,
["minor"] = false,
["high"] = false,
["tall"] = false,
["low"] = false,
["left"] = false, -- left tributary
["right"] = false, -- right tributary
["modern"] = false, -- for use in opposition to "ancient" in another definition
-- "former" qualifiers
["abandoned"] = true,
["ancient"] = true,
["deserted"] = true,
["extinct"] = true,
["former"] = false,
["historic"] = "historical",
["historical"] = true,
["medieval"] = true,
["mediaeval"] = true,
["ruined"] = true,
["traditional"] = true,
-- sea qualifiers
["coastal"] = true,
["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]]
["maritime"] = true,
["overseas"] = true,
["seaside"] = true,
["beachfront"] = true,
["beachside"] = true,
["riverside"] = true,
-- lake qualifiers
["freshwater"] = true,
["saltwater"] = true,
["endorheic"] = true,
["oxbow"] = true,
["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link
["tidal"] = true,
-- land qualifiers
["hilltop"] = true,
["hilly"] = true,
["insular"] = true,
["peninsular"] = true,
["chalk"] = true,
["karst"] = true,
["limestone"] = true,
["mountainous"] = true,
["mountaintop"] = true,
["alpine"] = true,
["volcanic"] = true, -- for an island
-- political status qualifiers
["autonomous"] = true,
["incorporated"] = true,
["special"] = true,
["unincorporated"] = true,
["coterminous"] = true,
-- monetary status/etc. qualifiers
["fashionable"] = true,
["wealthy"] = true,
["affluent"] = true,
["declining"] = true,
-- city vs. rural qualifiers
["urban"] = true,
["suburban"] = true,
["exurban"] = true,
["outlying"] = true,
["remote"] = true,
["rural"] = true,
["outback"] = true,
["inner"] = false,
["inner-city"] = true,
["central"] = false,
["outer"] = false,
-- land use qualifiers
["residential"] = true,
["agricultural"] = true,
["business"] = true,
["commercial"] = true,
["industrial"] = true,
-- business use qualifiers
["railroad"] = true,
["railway"] = true,
["farming"] = true,
["fishing"] = true,
["mining"] = true,
["logging"] = true,
["cattle"] = true,
-- tourism use qualifiers
["resort"] = true, -- note, we also have 'resort city' and 'resort town', that take precedence
["spa"] = true, -- note, we also have 'spa city' and 'spa town', that take precedence
["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', that take precedence
-- religious qualifiers
["holy"] = true,
["sacred"] = true,
["religious"] = true,
["secular"] = true,
-- qualifiers for nonexistent places
["claimed"] = false,
["fictional"] = true,
["legendary"] = true,
["mythical"] = true,
["mythological"] = true,
-- directional qualifiers
["northern"] = false,
["southern"] = false,
["eastern"] = false,
["western"] = false,
["north"] = false,
["south"] = false,
["east"] = false,
["west"] = false,
["northeastern"] = false,
["southeastern"] = false,
["northwestern"] = false,
["southwestern"] = false,
["northeast"] = false,
["southeast"] = false,
["northwest"] = false,
["southwest"] = false,
-- seasonal qualifiers
["summer"] = true, -- e.g. for 'summer capital'
["winter"] = true,
-- legal status qualifiers
-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
["official"] = true,
["unofficial"] = true,
["de facto"] = true, -- 'de facto capital'
["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
["de jure"] = true, -- 'de jure capital'
["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
-- misc. qualifiers
["planned"] = true,
["chartered"] = true,
["landlocked"] = true,
["uninhabited"] = true,
-- superlative qualifiers
["first"] = no_link_def_article,
["second"] = no_link_def_article, -- for "second largest" etc.
["third"] = no_link_def_article,
["fourth"] = no_link_def_article,
["last"] = no_link_def_article,
["only"] = no_link_def_article,
["sole"] = no_link_def_article,
["main"] = no_link_def_article,
["largest"] = no_link_def_article,
["biggest"] = no_link_def_article,
["smallest"] = no_link_def_article,
["shortest"] = no_link_def_article,
["longest"] = no_link_def_article,
["tallest"] = no_link_def_article,
["highest"] = no_link_def_article,
["lowest"] = no_link_def_article,
["leftmost"] = no_link_def_article,
["rightmost"] = no_link_def_article,
["innermost"] = no_link_def_article,
["outermost"] = no_link_def_article,
["northernmost"] = no_link_def_article,
["southernmost"] = no_link_def_article,
["westernmost"] = no_link_def_article,
["easternmost"] = no_link_def_article,
["northwesternmost"] = no_link_def_article,
["southwesternmost"] = no_link_def_article,
["northeasternmost"] = no_link_def_article,
["southeasternmost"] = no_link_def_article,
-- several/various
["several"] = no_link_no_article,
["various"] = no_link_no_article,
["numerous"] = no_link_no_article,
["multiple"] = no_link_no_article,
["many"] = no_link_no_article,
["other"] = no_link_no_article,
}
--[==[ var:
In this table, the key qualifiers should be treated the same as the value qualifiers for categorization purposes. This
is overridden by `placetype_data` and `qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
["abandoned"] = {"FORMER"},
["ancient"] = {"ANCIENT", "FORMER"},
["former"] = {"FORMER"},
["extinct"] = {"FORMER"},
["historic"] = {"FORMER"},
["historical"] = {"FORMER"},
["medieval"] = {"ANCIENT", "FORMER"},
["mediaeval"] = {"ANCIENT", "FORMER"},
["ruined"] = {"ANCIENT", "FORMER"},
["traditional"] = {"FORMER"},
}
--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
["fictional"] = "fictional location",
["legendary"] = "mythological location",
["mythical"] = "mythological location",
["mythological"] = "mythological location",
-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
["claimed"] = "claimed political division",
}
--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
["autonomous community"] = "autonomous community capitals",
["canton"] = "cantonal capitals",
["comarca"] = "comarca capitals",
["ประเทศ"] = "เมืองหลวงของประเทศ",
-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "เมืองหลวงของเทศมณฑล",
["department"] = "departmental capitals",
["อำเภอ"] = "เมืองหลวงของอำเภอ",
["division"] = "division capitals",
["emirate"] = "emirate capitals",
["governorate"] = "governorate capitals",
["hromada"] = "hromada capitals",
["krai"] = "krai capitals",
["มหานคร"] = "เมืองหลวงของมหานคร",
["เทศบาล"] = "เมืองหลวงของเทศบาล",
["oblast"] = "oblast capitals",
["okrug"] = "okrug capitals",
["prefecture"] = "prefectural capitals",
["จังหวัด"] = "เมืองหลวงของจังหวัด",
["raion"] = "raion capitals",
["regency"] = "regency capitals",
["ภูมิภาค"] = "เมืองหลวงของภูมิภาค",
["regional unit"] = "regional unit capitals",
["republic"] = "republic capitals",
["รัฐ"] = "เมืองหลวงของรัฐ",
["ดินแดน"] = "เมืองหลวงของดินแดน",
["voivodeship"] = "voivodeship capitals",
}
--[==[ var:
This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple
ways that placenames can come to be preceded by "the":
# Listed here.
# Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code
just below the map.
# The placetype of the placename has `holonym_use_the = true` in its placetype_data.
# A regex in placename_the_re matches the placename.
Note that "the" is added only before the first holonym in a place description.
]==]
export.placename_article = {
-- This should only contain info that can't be inferred from [[Module:place/locations]].
["archipelago"] = {
["Cyclades"] = "the",
["Dodecanese"] = "the",
},
["ประเทศ"] = {
["Holy Roman Empire"] = "the",
},
["จักรวรรดิ"] = {
["Holy Roman Empire"] = "the",
},
["เกาะ"] = {
["North Island"] = "the",
["South Island"] = "the",
},
["ภูมิภาค"] = {
["Balkans"] = "the",
["Russian Far East"] = "the",
["Caribbean"] = "the",
["Caucasus"] = "the",
["Middle East"] = "the",
["New Territories"] = "the",
["North Caucasus"] = "the",
["South Caucasus"] = "the",
["West Bank"] = "the",
["Gaza Strip"] = "the",
},
["valley"] = {
["San Fernando Valley"] = "the",
},
}
--[==[ var:
Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all
holonyms, otherwise only the regexes for the holonym's placetype apply.
]==]
export.placename_the_re = {
-- We don't need entries for peninsulas, seas, oceans, gulfs or rivers
-- because they have holonym_use_the = true.
["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "},
["bay"] = {"^Bay of "},
["ทะเลสาบ"] = {"^Lake of "},
["ประเทศ"] = {"^Republic of ", " Republic$"},
["republic"] = {"^Republic of ", " Republic$"},
["ภูมิภาค"] = {" [Rr]egion$"},
["แม่น้ำ"] = {" River$"},
["local government area"] = {"^Shire of "},
["เทศมณฑล"] = {"^Shire of "},
["Indian reservation"] = {" Reservation", " Nation"},
["tribal jurisdictional area"] = {" Reservation", " Nation"},
}
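--[=[
Illustrative sketch only, not part of the module: the lookup described above (the "*" patterns apply to every holonym,
the others only to holonyms of the matching placetype) could be checked roughly as follows. The table here is a
trimmed-down, hypothetical copy of `export.placename_the_re`, and `matches_any` and `needs_the` are made-up helper
names. Plain `string.find` is assumed to be sufficient for these ASCII patterns; the real module may use a different
matching function.

```lua
-- Hypothetical, trimmed-down copy of the pattern table (assumption for illustration).
local placename_the_re = {
	["*"] = {"^Isle of ", " Islands$", " Empire$"},
	["bay"] = {"^Bay of "},
}

-- Return true if any Lua pattern listed under `key` matches `placename`.
local function matches_any(key, placename)
	for _, pattern in ipairs(placename_the_re[key] or {}) do
		if placename:find(pattern) then
			return true
		end
	end
	return false
end

-- Hypothetical helper: try the catch-all "*" patterns, then the placetype-specific ones.
local function needs_the(placetype, placename)
	return matches_any("*", placename) or matches_any(placetype, placename)
end

print(needs_the("bay", "Bay of Biscay"))  -- true
print(needs_the("island", "Isle of Man")) -- true (matched by a "*" pattern)
print(needs_the("city", "Paris"))         -- false
```

Keeping the catch-all patterns under a single "*" key avoids repeating them for every placetype.
]=]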
--[==[ var:
If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of
holonyms for categorization (but not display) purposes.
]==]
export.cat_implications = {
["ภูมิภาค"] = {
["Eastern Europe"] = {"continent/Europe"},
["Central Europe"] = {"continent/Europe"},
["Western Europe"] = {"continent/Europe"},
["South Europe"] = {"continent/Europe"},
["Southern Europe"] = {"continent/Europe"},
["Northern Europe"] = {"continent/Europe"},
["Northeast Europe"] = {"continent/Europe"},
["Northeastern Europe"] = {"continent/Europe"},
["Southeast Europe"] = {"continent/Europe"},
["Southeastern Europe"] = {"continent/Europe"},
["North Caucasus"] = {"continent/Europe"},
["South Caucasus"] = {"continent/Asia"},
["South Asia"] = {"continent/Asia"},
["Southern Asia"] = {"continent/Asia"},
["East Asia"] = {"continent/Asia"},
["Eastern Asia"] = {"continent/Asia"},
["Central Asia"] = {"continent/Asia"},
["West Asia"] = {"continent/Asia"},
["Western Asia"] = {"continent/Asia"},
["Southeast Asia"] = {"continent/Asia"},
["North Asia"] = {"continent/Asia"},
["Northern Asia"] = {"continent/Asia"},
["Anatolia"] = {"continent/Asia"},
["Asia Minor"] = {"continent/Asia"},
["Mesopotamia"] = {"continent/Asia"},
["North Africa"] = {"continent/Africa"},
["Central Africa"] = {"continent/Africa"},
["West Africa"] = {"continent/Africa"},
["East Africa"] = {"continent/Africa"},
["Southern Africa"] = {"continent/Africa"},
["Central America"] = {"continent/Central America"},
["Caribbean"] = {"continent/North America"},
["Polynesia"] = {"continent/Oceania"},
["Micronesia"] = {"continent/Oceania"},
["Melanesia"] = {"continent/Oceania"},
["Siberia"] = {"country/Russia", "continent/Asia"},
["Russian Far East"] = {"country/Russia", "continent/Asia"},
["South Wales"] = {"constituent country/Wales", "continent/Europe"},
["Balkans"] = {"continent/Europe"},
["West Bank"] = {"country/Palestine", "continent/Asia"},
["Gaza"] = {"country/Palestine", "continent/Asia"},
["Gaza Strip"] = {"country/Palestine", "continent/Asia"},
}
}
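--[=[
Illustrative sketch only, not part of the module: one way the implications above could be applied is to append the
implied holonyms after the ones the user wrote. The holonym record shape (`placetype`, `placename`), the `implied`
marker and the function name `add_implied_holonyms` are assumptions made for this example; the table is a one-entry
hypothetical copy of `export.cat_implications`.

```lua
-- Hypothetical one-entry copy of the implications table (assumption for illustration).
local cat_implications = {
	["ภูมิภาค"] = {
		["Siberia"] = {"country/Russia", "continent/Asia"},
	},
}

-- Return a new list: the original holonyms followed by any implied ones.
local function add_implied_holonyms(holonyms)
	local result = {}
	for _, holonym in ipairs(holonyms) do
		table.insert(result, holonym)
	end
	for _, holonym in ipairs(holonyms) do
		local by_name = cat_implications[holonym.placetype]
		local implied = by_name and by_name[holonym.placename]
		if implied then
			for _, spec in ipairs(implied) do
				-- Split "placetype/placename" at the first slash.
				local placetype, placename = spec:match("^(.-)/(.*)$")
				table.insert(result, {placetype = placetype, placename = placename, implied = true})
			end
		end
	end
	return result
end

local augmented = add_implied_holonyms {{placetype = "ภูมิภาค", placename = "Siberia"}}
for _, holonym in ipairs(augmented) do
	print(holonym.placetype, holonym.placename)
end
```

Appending at the end matches the description above; the `implied` marker is one possible way to keep the added
holonyms out of the displayed text.
]=]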
------------------------------------------------------------------------------------------
-- Category and display handlers --
------------------------------------------------------------------------------------------
local function city_type_cat_handler(data)
local entry_placetype = data.entry_placetype
local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities")
if not generic_before_non_cities then
internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" ..
" setting", entry_placetype)
end
local plural_entry_placetype = export.pluralize_placetype(entry_placetype)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and not spec.is_city then
-- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both
-- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.)
local cap_plural_entry_placetype = ucfirst(plural_entry_placetype)
local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th
if container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th
end
end
return retcats
end
end
local function capital_city_cat_handler(data, non_city)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
-- The first time we're called we want to return something; otherwise we will be called for later-mentioned
-- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in
-- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital
-- category/categories we add below.
local retcats
if not non_city and place_desc.holonyms then
for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do
local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename
retcats = city_type_cat_handler {
entry_placetype = "นคร",
holonym_placetype = h_placetype,
holonym_placename = h_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if retcats then
break
end
end
end
if not retcats then
retcats = {}
end
-- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we
-- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State
-- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory'
-- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's
-- an entry for 'autonomous community').
local capital_cat = export.placetype_to_capital_cat[holonym_placetype]
if not capital_cat then
capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")]
end
if capital_cat then
capital_cat = ucfirst(capital_cat)
local inserted_specific_variant_cat = false
if holonym_index then
-- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern
-- where we use :also to specify that a given city is the capital at multiple surrounding levels.
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index
for h_index = holonym_index, #place_desc.holonyms do
if place_desc.holonyms[h_index].placetype then
matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location {
holonym_placetype = place_desc.holonyms[h_index].placetype,
holonym_placename = place_desc.holonyms[h_index].unlinked_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if matching_group then
matching_holonym_index = h_index
break
end
end
end
if matching_holonym_index == holonym_index then
if matching_container_trail[1] and not matching_spec.no_container_cat then
for _, container in ipairs(matching_container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
elseif matching_holonym_index then
-- Check to make sure that the holonym placetype we were called on is listed among the
-- divtypes of the location we found.
local function insert_specific_variant_if_possible(key, spec)
return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
local plural_holonym_placetype = export.pluralize_placetype(pt)
local saw_matching_div
if spec.divs then
local divs = spec.divs
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) ~= "table" then
div = {type = div}
end
if plural_holonym_placetype == div.type then
saw_matching_div = true
break
end
end
end
if saw_matching_div then
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
return true
end
return false
end)
end
if insert_specific_variant_if_possible(matching_key, matching_spec) then
inserted_specific_variant_cat = true
elseif not matching_spec.no_container_cat then
for _, containers in ipairs(matching_container_trail) do
local saw_no_container_cat = false
for _, container in ipairs(containers) do
if insert_specific_variant_if_possible(container.key, container.spec) then
inserted_specific_variant_cat = true
break
end
saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
end
if inserted_specific_variant_cat or saw_no_container_cat then
break
end
end
end
end
else
-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
end
if not inserted_specific_variant_cat then
insert(retcats, capital_cat)
end
else
-- We didn't recognize the holonym placetype; just put in 'Capital cities'.
insert(retcats, "เมืองหลวง")
end
return retcats
end
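--[=[
Illustrative sketch only, not part of the module: the placetype truncation used above when looking up the capital
category (e.g. 'autonomous region' to 'region') can be tried in isolation as follows. The table is a small
hypothetical copy of `export.placetype_to_capital_cat`, and `lookup_capital_cat` is a made-up name.

```lua
-- Hypothetical three-entry copy of the capital-category table (assumption for illustration).
local placetype_to_capital_cat = {
	["autonomous community"] = "autonomous community capitals",
	["republic"] = "republic capitals",
	["oblast"] = "oblast capitals",
}

local function lookup_capital_cat(holonym_placetype)
	-- Try the full placetype first, so 'autonomous community' keeps its own entry.
	local cat = placetype_to_capital_cat[holonym_placetype]
	if not cat then
		-- Drop everything up to the last space. gsub() returns (string, count); the extra
		-- parentheses keep only the string.
		local last_word = (holonym_placetype:gsub("^.* ", ""))
		cat = placetype_to_capital_cat[last_word]
	end
	return cat
end

print(lookup_capital_cat("autonomous community")) -- autonomous community capitals
print(lookup_capital_cat("autonomous republic"))  -- republic capitals
print(lookup_capital_cat("village"))              -- nil
```

Because `.*` is greedy, a placetype of three or more words is reduced to its final word in one step.
]=]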
--[=[
This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used
in two ways:
# To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and
[[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym.
# To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym
description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this
case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments,
administrative regions, and for the entire country, and for example we only want to categorize a demonym into
[[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym
invocation only adds the most specific holonym category and not the category of any containing polity (hence if we
add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]).
This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston`
as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and
[[:Category:en:สถานที่ในthe United States]]. The city handler tries to deal with the possibility of multiple cities
having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]],
[[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing
polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`).
Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the
mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions
Ohio and a holonym for a Columbus in a different country is encountered, because of the function
`augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered.
The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding
language code).
]=]
local function generic_place_cat_handler(data)
local from_demonym = data.from_demonym
local retcats = {}
local function insert_retkey(key, spec)
if from_demonym then
insert(retcats, key)
else
insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec)))
end
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
if not spec.no_generic_place_cat then
-- This applies to continents and continental regions.
insert_retkey(key, spec)
end
-- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in
-- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when
-- no_container_cat is set (e.g. for 'United Kingdom').
if not spec.no_container_cat then
for _, container_set in ipairs(container_trail) do
local stop_adding_containers = false
for _, container in ipairs(container_set) do
if not container.spec.no_generic_place_cat then
insert_retkey(container.key, container.spec)
end
if container.spec.no_container_cat then
stop_adding_containers = true
end
end
if stop_adding_containers then
break
end
end
end
return retcats
end
end
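--[=[
Illustrative sketch only, not part of the module: the container walk in the handler above, reduced to plain tables.
The keys, the flags on each spec and the function name `collect_place_cats` are invented for this example, and the
real handler formats keys through `export.get_prefixed_key` rather than concatenating them directly.

```lua
-- Collect "สถานที่ใน..." categories for a key and its containers, stopping after the first
-- container level in which some container has `no_container_cat` set.
local function collect_place_cats(key, spec, container_trail)
	local cats = {}
	if not spec.no_generic_place_cat then
		table.insert(cats, "สถานที่ใน" .. key)
	end
	if not spec.no_container_cat then
		for _, container_set in ipairs(container_trail) do
			local stop_adding_containers = false
			for _, container in ipairs(container_set) do
				if not container.spec.no_generic_place_cat then
					table.insert(cats, "สถานที่ใน" .. container.key)
				end
				stop_adding_containers = stop_adding_containers or container.spec.no_container_cat
			end
			if stop_adding_containers then
				break
			end
		end
	end
	return cats
end

-- Hypothetical data: the flag on the middle level is set only to demonstrate the early stop.
local trail = {
	{{key = "Level One", spec = {no_container_cat = true}}},
	{{key = "Level Two", spec = {}}},
}
for _, cat in ipairs(collect_place_cats("Level Zero", {}, trail)) do
	print(cat)
end
```

Note that a container with `no_container_cat` set is itself still categorized; only the levels above it are skipped.
]=]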
--[==[
Special category handler run for all placetypes that checks for specified division placetypes of known locations and
categorizes appropriately.
]==]
function export.political_division_cat_handler(data)
if data.from_demonym then
return
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
local sgdiv = export.maybe_singularize_placetype(div.type) or div.type
local prep = div.prep or "ของ"
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
if not export.placetype_data[sgdiv] then
internal_error("Placetype %s associated with known location key %s and data %s not found in " ..
"`placetype_data`", sgdiv, key, spec)
end
if sgdiv == data.entry_placetype then
local retcats = {}
for _, pt_cat in ipairs(cat_as) do
if type(pt_cat) == "string" then
pt_cat = {type = pt_cat}
end
local pt_prep = pt_cat.prep or prep
insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th
end
return retcats
end
end
end
end
end
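--[=[
Illustrative sketch only, not part of the module: the handler above accepts `divs` either as a single string or as a
list whose items are strings or tables. The normalization it performs inline could be factored out roughly like this;
`normalize_divs` is a made-up name and the sample division names are hypothetical.

```lua
-- Turn a `divs` setting into a uniform list of {type=..., prep=..., cat_as=...} tables.
local function normalize_divs(divs)
	if divs == nil then
		return {}
	end
	if type(divs) ~= "table" then
		divs = {divs} -- a bare string means a one-element list
	end
	local result = {}
	for _, div in ipairs(divs) do
		if type(div) == "string" then
			div = {type = div}
		end
		table.insert(result, {
			type = div.type,
			prep = div.prep or "ของ", -- same default preposition as the handler above
			cat_as = div.cat_as or div.type,
		})
	end
	return result
end

for _, div in ipairs(normalize_divs {"provinces", {type = "territories", prep = "ใน"}}) do
	print(div.type, div.prep, div.cat_as)
end
```

As in the handler, a lone table such as `{type = "provinces"}` passed directly (not wrapped in a list) yields no
entries, because `ipairs` only visits the array part.
]=]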
--[==[
This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any
foreign-language terms that are translations of the state of Georgia. We look at the page title (or its overridden value
in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the
modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the
entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the
country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized
into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to
make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is
intended for Newark, New Jersey).
]==]
function export.get_bare_categories(args, overall_place_spec)
local bare_cats = {}
local place_descs = overall_place_spec.descs
local possible_placetypes_by_place_desc = {}
for i, place_desc in ipairs(place_descs) do
possible_placetypes_by_place_desc[i] = {}
for _, placetype in ipairs(place_desc.placetypes) do
if not export.placetype_is_ignorable(placetype) then
local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true})
for _, equiv in ipairs(equivs) do
insert(possible_placetypes_by_place_desc[i], equiv.placetype)
end
end
end
end
local function check_term(term)
-- Treat Wikipedia links like local ones.
term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
term = export.remove_links_and_html(term)
term = term:gsub("^the ", "")
for i, place_desc in ipairs(place_descs) do
-- Iterate over all matching locations in case there are multiple, as with Delhi defined as
-- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}.
for group, key, spec, container_trail in export.iterate_matching_holonym_location {
holonym_placetype = possible_placetypes_by_place_desc[i],
holonym_placename = term,
place_desc = place_desc,
} do
insert(bare_cats, key)
end
end
end
-- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)?
-- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There
-- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the
-- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The
-- advantage of checking when the language isn't English is we catch those places that fail to give an English
-- translation but where the translation happens to be the same as the other-language spelling. However, I don't
-- know how often this situation occurs.
check_term(args.pagename or mw.title.getCurrentTitle().subpageText)
for _, t in ipairs(args.t) do
check_term(t)
end
local function check_termobj_list(terms)
for _, term in ipairs(terms) do
if term.eq then
check_term(term.eq)
end
if term.alt or term.term then
check_term(term.alt or term.term)
end
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
local arg = extra_info_terms.arg
if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then
check_termobj_list(extra_info_terms.terms)
end
end
for _, directive in ipairs(overall_place_spec.directives) do
check_termobj_list(directive.terms)
end
return bare_cats
end
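--[=[
Illustrative sketch only, not part of the module: the term clean-up done at the top of `check_term` above, with a
deliberately simplified stand-in for `export.remove_links_and_html`. The stand-in only handles `[[target]]` and
`[[target|display]]` links and no HTML; `normalize_term` is a made-up name.

```lua
local function normalize_term(term)
	-- Treat Wikipedia links like local ones.
	term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
	-- Simplified link removal (assumption): keep the display text of piped links ...
	term = term:gsub("%[%[[^%[%]|]-|([^%[%]]-)%]%]", "%1")
	-- ... and the target of unpiped links.
	term = term:gsub("%[%[([^%[%]]-)%]%]", "%1")
	-- Drop a leading article; the parentheses discard gsub()'s second return value.
	return (term:gsub("^the ", ""))
end

print(normalize_term("[[w:Georgia (U.S. state)|Georgia]]")) -- Georgia
print(normalize_term("the [[Netherlands]]"))                -- Netherlands
```

Stripping the article after link removal matters: in "the [[Netherlands]]" the article sits outside the link.
]=]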
--[==[
This is used to augment the holonyms associated with a place description with the containing polities. For example,
given the following:
`# {{tl|place|en|subprefecture|pref/Hokkaido}}.`
We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]].
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
for _, place_desc in ipairs(place_descs) do
if place_desc.holonyms then
-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
-- modifier, which causes category processing to restart at that point after generating categories for a
-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do
local holonym = place_desc.holonyms[i]
if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then
local group, key, spec, container_trail = export.find_matching_holonym_location {
holonym_placetype = holonym.placetype,
holonym_placename = holonym.unlinked_placename,
holonym_index = i,
place_desc = place_desc,
}
if group and container_trail[1] and not spec.no_auto_augment_container then
local augmented_holonyms = {}
for j = 1, i do
insert(augmented_holonyms, place_desc.holonyms[j])
end
for _, containers in ipairs(container_trail) do
local any_no_auto_augment_container = false
for _, container in ipairs(containers) do
any_no_auto_augment_container = any_no_auto_augment_container or
container.spec.no_auto_augment_container
local containing_type = container.spec.placetype
if type(containing_type) == "table" then
-- If the containing type is a list, use the first element as the canonical variant.
containing_type = containing_type[1]
end
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
-- Don't side-effect holonyms while processing them.
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to
-- set display_placename.
placetype = containing_type,
-- placename_to_key() for the group should correctly handle both full and elliptical
-- placenames, but the full placename seems less likely to be ambiguous. FIXME: We
-- should just store the key directly and use it when available to avoid having to
-- convert key to placename and back to key.
unlinked_placename = full_container_placename,
-- Indicate that this is an augmented holonym, and was derived from the specified
-- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms
-- derived from holonyms that are different from the holonym we're searching for but
-- of the same placetype. This is to correctly handle a situation like
-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
-- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match
-- in find_matching_holonym_location() because of the mismatch between augmented
-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
-- `Ardèche`) against any known location. To deal with this, we mark augmented
-- holonyms as being augmented due to a source holonym, and when processing a given
-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
-- The restriction to the same placetype is so that `Birmingham` still gets
-- correctly disambiguated to Birmingham, England in the example given above near
-- the top of this function, using the augmented holonym `c/United Kingdom` added by
-- the specified `cc/England` (whose placetype `constituent country` differs from
-- the placetype `city` of Birmingham).
augmented_from_holonym = holonym,
}
insert(augmented_holonyms, new_holonym)
-- But it is safe to modify other parts of the place_desc.
export.key_holonym_into_place_desc(place_desc, new_holonym)
end
if any_no_auto_augment_container then
break
end
end
for j = i + 1, #place_desc.holonyms do
insert(augmented_holonyms, place_desc.holonyms[j])
end
place_desc.holonyms = augmented_holonyms
end
end
end
end
end
end
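--[=[
Illustrative sketch only, not part of the module: the right-to-left augmentation loop above, reduced to a flat
lookup table. The `containers` table, the holonym record shape and the function name `augment` are assumptions for
this example; the real code finds containers through `export.find_matching_holonym_location`, follows the whole
container trail and honours `no_auto_augment_container`, none of which is modelled here.

```lua
-- Hypothetical container lookup: placename -> containing holonym.
local containers = {
	["Hokkaido"] = {placetype = "country", placename = "Japan"},
	["England"] = {placetype = "country", placename = "United Kingdom"},
}

local function augment(holonyms)
	-- Walk right to left; each insertion lands after position i, so indices still to be
	-- visited (those below i) are unaffected.
	for i = #holonyms, 1, -1 do
		local container = containers[holonyms[i].placename]
		if container then
			local augmented = {}
			for j = 1, i do
				table.insert(augmented, holonyms[j])
			end
			table.insert(augmented, {
				placetype = container.placetype,
				placename = container.placename,
				augmented_from_holonym = holonyms[i], -- remember the source holonym
			})
			for j = i + 1, #holonyms do
				table.insert(augmented, holonyms[j])
			end
			-- Replace the list before the next iteration, as the function above does.
			holonyms = augmented
		end
	end
	return holonyms
end

local result = augment {
	{placetype = "city", placename = "Birmingham"},
	{placetype = "county", placename = "West Midlands"},
	{placetype = "constituent country", placename = "England"},
}
for _, holonym in ipairs(result) do
	print(holonym.placetype, holonym.placename)
end
```

Building a fresh list on each augmentation, rather than inserting into the list being traversed, mirrors the
"don't side-effect holonyms while processing them" note in the function above.
]=]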
-- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions. We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
-- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
-- categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
-- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
-- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
-- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
-- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
-- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and
-- note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
-- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
-- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
-- categorization.
local function district_neighborhood_cat_handler(data)
local function get_plural_entry_placetype(location_spec, container_trail)
if data.entry_placetype == "suburb" then
return "Suburbs"
else
-- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling
if uses_british_spelling == nil and container_trail then
for _, container_set in ipairs(container_trail) do
local must_outer_break = false
for _, container in ipairs(container_set) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
end
end
-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
-- etc.)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and spec.is_city then
return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
end
-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
local has_neighborhoods
local entry_placetype = data.entry_placetype
if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
has_neighborhoods = true
else
-- Otherwise, make sure the current holonym is city-like.
has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
return export.get_placetype_prop(pt, "has_neighborhoods")
end, {continue_on_nil_only = true})
end
if has_neighborhoods then
-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". We need
-- to start with the current holonym, which is especially important for neighborhoods and suburbs that
-- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously
-- we skipped the first/current holonym.)
for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc,
data.holonym_index) do
local other_holonym_data = {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = data.place_desc,
}
local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data)
if group and not spec.is_former_place then
return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") ..
export.get_prefixed_key(key, spec)}
end
end
end
end
-- Return true if `holonym_placename`, lowercased with `ulower` and with links removed, contains any of the strings in
-- `already_seen_strings` (a single string or a list of strings). Note that each string is passed to `find()` as a Lua
-- pattern rather than as a plain string, so any pattern magic characters in it are interpreted as such.
function export.check_already_seen_string(holonym_placename, already_seen_strings)
local canon_placename = ulower(m_links.remove_links(holonym_placename))
if type(already_seen_strings) ~= "table" then
already_seen_strings = {already_seen_strings}
end
for _, already_seen_string in ipairs(already_seen_strings) do
if canon_placename:find(already_seen_string) then
return true
end
end
return false
end
-- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display
-- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already.
-- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or
-- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym
-- placename, ignoring case and links. If the prefix isn't already present, we create a link that
-- uses the raw form as the link destination but the prefixed form as the display form, unless the
-- holonym already has a link in it, in which case we just add the prefix.
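-- Illustrative examples (a sketch traced from the code below, not tested output; assumes `ulower` lowercases its
-- argument and `m_links.remove_links` strips link markup; "County" and "Cork" are just sample values):
--   prefix_display_handler("County", "Cork")        --> "County [[Cork]]"
--   prefix_display_handler("County", "[[Cork]]")    --> "County [[Cork]]" (existing link kept, prefix added outside)
--   prefix_display_handler("County", "County Cork") --> "County Cork" (prefix already present, returned unchanged)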
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return prefix .. " " .. holonym_placename
end
return prefix .. " [[" .. holonym_placename .. "]]"
end
-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
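-- Illustrative examples (a sketch traced from the code below, not tested output; assumes `ulower` lowercases its
-- argument and `m_links.remove_links` strips link markup; the placenames are just sample values):
--   suffix_display_handler("parish", "Orleans")        --> "[[Orleans]] parish"
--   suffix_display_handler("parish", "[[Orleans]]")    --> "[[Orleans]] parish" (existing link kept)
--   suffix_display_handler("parish", "Orleans Parish") --> "Orleans Parish" (suffix already present)
--   suffix_display_handler("Voivodeship", "Subcarpathian", nil, "include_suffix_in_link")
--                                                      --> "[[Subcarpathian Voivodeship]]"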
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return holonym_placename .. " " .. suffix
end
if include_suffix_in_link then
return "[[" .. holonym_placename .. " " .. suffix .. "]]"
else
return "[[" .. holonym_placename .. "]] " .. suffix
end
end
-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.new_york_boroughs[unlinked_placename] then
-- Hack: don't display "borough" after the names of NYC boroughs
return holonym_placename
end
return suffix_display_handler("borough", holonym_placename)
end
local function county_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
m_locations.northern_ireland_counties["County " .. unlinked_placename .. ", Northern Ireland"] then
return prefix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County".
if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County".
if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- FIXME, we need the same for US counties but need to key off the country, not the specific county.
-- Others are displayed as-is.
return holonym_placename
end
-- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture".
-- Others are displayed as e.g. "[[Fthiotida]] prefecture".
local function prefecture_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture"
return suffix_display_handler(suffix, holonym_placename)
end
-- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized
-- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is.
local function province_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if
m_locations.iran_provinces[unlinked_placename .. ", Iran"] or
m_locations.laos_provinces[unlinked_placename .. ", Laos"] or
m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or
m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or
m_locations.thailand_provinces[unlinked_placename .. ", ไทย"] or
m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or
m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then
return suffix_display_handler("จังหวัด", holonym_placename)
end
return holonym_placename
end
-- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is.
local function state_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then
return suffix_display_handler("รัฐ", holonym_placename)
end
return holonym_placename
end
-- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]].
local function voivodesip_display_handler(holonym_placetype, holonym_placename)
return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link")
end
------------------------------------------------------------------------------------------
-- Placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are
placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value
is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form
`สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are
used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the
specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories
like [[:Category:States and territories of Australia]]).
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the
placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are
wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed
directly in the placetype data; everything else is handled through category handlers, either per-placetype or special
(such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate
categories is described at the top of [[Module:place]].
There are several recognized property keys, of various types:
1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
placetypes (which may occur in entry placetypes, esp. new-format ones, such as `two <<islands>>`, and may occur in
categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
`<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign. For example, if `cercle` specifies
`"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified.
*# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it
will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as
`<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given.
*# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is
called on the string, which will correctly pluralize most strings, including those with links in them. (If there
are multiple links, the display form of the last link is pluralized.)
*# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as
an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with
the qualifiers `former`, `ancient`, `historical` and such.
* `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of
the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the
value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays
as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if
this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs
especially with multiword placetypes where something other than the last word is pluralized. An example is
`town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. This example uses
`link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian
Bokmål word, and template calls aren't currently permitted in link strings), along with
`plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to
the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only
placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of
`category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it,
spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the
value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which
just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a
separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which
clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories
where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides
`category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides
`category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. If given, this overrides `category_link` for
this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol>
<li>`link = true`</li>
<li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li>
<li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li>
</ol> This has the effect of making the entry placetype `neighborhood` display as just
`<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like
`Neighborhoods in Illinois, USA` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`.
* `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
* `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
2. There is currently one fallback-related property key recognized:
* `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories
get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets
`preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi`
(whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex
example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that
checks for cities or city-like entities (e.g. boroughs) occurring as holonyms and categorizes the toponym under
[[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods in POLDIV]] (for
the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if
`c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these
categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be
categorized as e.g. [[:Category:Geographic and cultural areas of England]].
3. There is currently one property to control irregular plurals of placetypes:
* `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in
[[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`,
`-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent;
for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized
as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value
even when the default pluralization algorithm works correctly, if the default singularization algorithm won't
correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`).
4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those
categories:
* `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by
a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated
alike); (b) to determine the parent category of bare placetype categories (e.g. [[:Category:Villages]] for placetype
`village`); (c) to determine whether to add a parent category `political divisions of specific countries` to
qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are:
*# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire.
*# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement.
*# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a
settlement, such as wards and barangays.
*# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an
unincorporated community, farm or neighborhood.
*# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital
any more.
*# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc.
*# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university,
metro station, park or the like.
*# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary
greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`,
etc. qualifier has no effect on the category of these placetypes.
*# `generic place`: a place that isn't further qualified into any specific subtype.
* `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`,
`ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of
`dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc.
qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class`
is used. When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those
in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers
(one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are
looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map
`medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes
`ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by
`get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default
category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where
`kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data`
for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is
used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data`
but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal
error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have
an entry for `greenhouse`), we just track the occurrence and end up not categorizing.
* `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the
placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for
placetype `named buildings!`). If not specified, the first parent category is determined by the value of `class`,
using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]].
* `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent`
just above).
* `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of
`bare_category_parent` if it is a string.
* `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or
`ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that
always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is
a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the
implementation is the same.
* `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the
placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which
category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`,
`neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like
`Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like
`Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized
city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers
iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or
more categories are returned; see below for more specifics. (Note that countries for which e.g. a `district` is a
political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but
by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the
resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The
return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the
holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype
with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields:
** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an
entry in `placetype_data` but may not be the original placetype given by the user);
** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed;
** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME:
we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms);
** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]];
** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or
{{tl|demonym-noun}}, instead of being triggered by {{tl|place}}.
* `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the
`district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `;
see the section just above on `cat_handler`.
5. The following preposition-related property keys are recognized:
* `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`.
* `generic_before_non_cities`: If specified, the appropriate category description handler in
[[Module:category tree/topic cat/data/Places]] will recognize categories of the form
` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This
is used to generate descriptions for categories added by category handlers and by explicit category specs in the
placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify
a value for `class` so that the category tree code can determine whether it's a political or non-political division.
* `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities.
6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype:
* `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values
are:
*# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly
follows an entry placetype);
*# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple);
*# `"suf"` (the holonym will display as `Holonym placetype`);
*# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized).
* `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym.
Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly
using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype
`administrative region` specifies `suffix = "ภูมิภาค"`.
* `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym.
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the
holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take
precedence.
* `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix
requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies
`affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies
`no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified,
without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word.
* `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym).
Its return value is a string specifying the display form of the holonym.
7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms
of the specified placetype.
* `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype.
* `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry
placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article
`"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins
with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent
placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype
specified.
* `holonym_use_the`: Use `"the"` before holonyms of this placetype.
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which
must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a
fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the
fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either
directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this
placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back
to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in
`placetype_data` or an internal error occurs.
]==]
export.placetype_data = {
--[=[
If you need to sort the following, do this (using Vim):
1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line
entries.
2. Make sure the table uses tabs everywhere for indent, and not spaces.
3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
:'a,.s/\n/\\n/g
:s/\\n\(\t\[\)/\r\1/g
The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while
the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to
a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one
command.)
4. Execute the following to sort:
:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station"
before "hill" and "borough seat" before "borough", because the space after e.g. "hill" in "hill station" sorts before
the quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed
by ` @@@ `, sorting, and then removing the prepended key (the classic decorate-sort-undecorate pattern).
5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
:'a,.s/\\n/\r/g
Note that in Vim, to match a newline on the left side of a substitution you must use \n, but to insert a newline on
the right side of a substitution you must use \r.
]=]
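--[=[
Illustration only (not used by this module): the decorate-sort-undecorate step from item 4 above can also be sketched
in Lua. This is a minimal sketch under stated assumptions: each entry has already been joined onto a single line, the
key is the first double-quoted string on that line, and the name `sort_entry_lines` is hypothetical.

```lua
local function sort_entry_lines(lines)
	-- Decorate: pair each line with its lowercased key (case-insensitive, in the spirit of `sort -f`).
	local decorated = {}
	for i, line in ipairs(lines) do
		local key = line:match('^%s*%["(.-)"') or line
		decorated[i] = {key = key:lower(), line = line}
	end
	-- Sort on the extracted key only, so "hill" sorts before "hill station".
	table.sort(decorated, function(a, b) return a.key < b.key end)
	-- Undecorate: drop the keys and keep the original lines.
	local sorted = {}
	for i, item in ipairs(decorated) do
		sorted[i] = item.line
	end
	return sorted
end
```

`table.sort` is not stable, so keys that differ only in case (e.g. "Crown colony" and "crown colony") may come out in
either order.
]=]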
["*"] = {
link = false,
cat_handler = generic_place_cat_handler,
},
["administrative atoll"] = {
-- Maldives
link = "+w:administrative divisions of the Maldives",
preposition = "ของ",
class = "subpolity",
},
["administrative capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["administrative center"] = {
link = "w",
fallback = "เมืองหลวงที่ไม่ใช่นคร",
},
["administrative centre"] = {
link = "w",
fallback = "administrative center",
},
["administrative county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["administrative district"] = {
link = "w",
fallback = "อำเภอ",
},
["administrative headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["administrative region"] = {
link = true,
preposition = "ของ",
suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
fallback = "ภูมิภาค",
class = "subpolity",
},
["administrative seat"] = {
link = "w",
fallback = "administrative centre",
},
["administrative territory"] = {
link = "separately",
preposition = "ของ",
suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
fallback = "ดินแดน",
class = "subpolity",
},
["administrative unit"] = {
-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
-- Kashmir and Gilgit-Baltistan, which are variously described). For this reason, we set no fallback, but we need
-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
link = "w",
class = "subpolity",
},
["administrative village"] = {
link = "w",
preposition = "ของ",
has_neighborhoods = true,
class = "settlement",
},
["aimag"] = {
-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
link = "w",
fallback = "prefecture",
},
["airport"] = {
link = true,
class = "man-made structure",
default = {true},
},
["alliance"] = {
link = true,
fallback = "confederation",
},
["archipelago"] = {
link = true,
fallback = "เกาะ",
},
["area"] = {
link = true,
preposition = "ของ",
fallback = "geographic and cultural area",
-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
class = "subpolity",
former_type = "geographic region",
cat_handler = district_neighborhood_cat_handler,
},
["arm"] = {
link = true,
preposition = "ของ",
class = "natural feature",
default = {"ทะเล"},
},
["arrondissement"] = {
link = true,
preposition = "ของ",
-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity",
has_neighborhoods = true,
},
["associated province"] = {
link = "separately",
fallback = "จังหวัด",
},
["atoll"] = {
-- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to
-- conditionalize `class` on the country. See also `administrative atoll`.
link = true,
class = "natural feature",
bare_category_parent = "เกาะ",
default = {true},
},
["autonomous city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
has_neighborhoods = true,
},
["autonomous community"] = {
-- Spain; refers to regional entities, not village-like entities, as might be expected from "community"
link = true,
preposition = "ของ",
class = "subpolity",
},
["autonomous island"] = {
-- Comoros; seems similar to an administrative atoll of the Maldives.
link = "+w:autonomous islands of Comoros",
preposition = "ของ",
class = "subpolity",
},
["autonomous oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "oblast",
class = "subpolity",
},
["autonomous okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "okrug",
class = "subpolity",
},
["autonomous prefecture"] = {
link = true,
fallback = "prefecture",
},
["autonomous province"] = {
link = "w",
fallback = "จังหวัด",
},
["autonomous region"] = {
link = "w",
preposition = "ของ",
fallback = "administrative region",
-- "administrative region" sets a suffix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region"
-- if the user writes 'ar:Suf/Tibet'.
affix = "autonomous region",
},
["autonomous republic"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territorial unit"] = {
-- Moldova; only two of them, one for Gagauzia and one for Transnistria.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territory"] = {
link = "w",
fallback = "dependent territory",
},
["bailiwick"] = {
-- Jersey, etc.
link = true,
fallback = "องค์การทางการเมือง",
},
["barangay"] = {
-- Philippines
link = true,
class = "settlement",
-- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use
-- some of the properties of a neighborhood.
fallback = "neighborhood",
},
["barrio"] = {
-- Spanish-speaking countries; Philippines
link = true,
-- FIXME: Not completely correct; in some countries barrios are formal administrative divisions of a city.
-- `class` will need to conditionalize on the country to be completely correct.
fallback = "neighborhood",
},
["basin"] = {
link = true,
fallback = "ทะเลสาบ",
},
["bay"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["beach"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"water"},
default = {true},
},
["beach resort"] = {
link = "w",
fallback = "resort town",
},
["bishopric"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["bodies of water!"] = {
-- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to
-- straighten out the type vs. name vs. related-to issue.
category_link = "[[body of water|bodies of water]]",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"},
},
["borough"] = {
link = true,
preposition = "ของ",
display_handler = borough_display_handler,
has_neighborhoods = true,
-- "former borough" could be a former settlement or a former part of a city but seems more likely to
-- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this
-- properly.
class = "subpolity",
-- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger.
},
["borough seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["branch"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["bridge"] = {
link = true,
class = "man-made structure",
default = {"Named bridges"},
},
["building"] = {
link = true,
class = "man-made structure",
default = {"Named buildings"},
},
["built-up area"] = {
link = "w",
fallback = "area",
},
["burgh"] = {
link = true,
fallback = "borough",
},
["business park"] = {
link = true,
fallback = "park",
},
["caliphate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["canton"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["cape"] = {
link = true,
fallback = "headland",
},
["capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["เมืองหลวง"] = {
link = true,
category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
bare_category_parent = "นคร",
cat_handler = capital_city_cat_handler,
default = {true},
-- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}}
-- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't
-- match against the placetype 'city' of Melbourne.
fallback = "นคร",
},
["caplc"] = {
link = "[[capital]] and [[large]]st [[city]]",
plural_link = false,
fallback = "เมืองหลวง",
},
["captaincy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["caravan city"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"ANCIENT", "FORMER"},
},
["castle"] = {
link = true,
fallback = "building",
},
["cathedral city"] = {
link = true,
fallback = "นคร",
},
["cattle station"] = {
-- Australia
link = true,
fallback = "farm",
},
["census area"] = {
link = true,
affix_type = "Suf",
has_neighborhoods = true,
class = "non-admin settlement",
},
["census-designated place"] = {
-- United States
link = true,
class = "non-admin settlement",
},
["census division"] = {
-- Canada
link = "w",
preposition = "ของ",
class = "subpolity",
},
["census town"] = {
link = "w",
fallback = "เมือง",
},
["central business district"] = {
link = true,
fallback = "neighborhood",
},
["cercle"] = {
-- Mali
link = "+w:cercles of Mali",
preposition = "ของ",
class = "subpolity",
},
["ceremonial county"] = {
link = true,
fallback = "เทศมณฑล",
},
["chain of islands"] = {
link = "[[chain]] of [[island]]s",
plural = "chains of islands",
plural_link = "[[chain]]s of [[island]]s",
fallback = "เกาะ",
},
["channel"] = {
link = true,
fallback = "strait",
},
["charter community"] = {
-- Northwest Territories, Canada
link = "w",
fallback = "village",
},
["นคร"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "นครใน+++", "ประเทศใน+++", "เมืองหลวงของ"},
default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"},
},
["civil parish"] = {
-- Mostly England; similar to municipalities
link = true,
preposition = "ของ",
affix_type = "suf",
has_neighborhoods = true,
class = "subpolity",
},
["claimed political division"] = {
link = "[[claim]]ed [[political]] [[division]]",
class = "subpolity",
default = {true},
},
["co-capital"] = {
link = "[[co-]][[capital]]",
fallback = "เมืองหลวง",
},
["coal city"] = {
link = "+w:coal town",
fallback = "นคร",
},
["coal town"] = {
link = "w",
fallback = "เมือง",
},
["collectivity"] = {
link = "w",
preposition = "ของ",
-- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities)
class = "subpolity",
},
["colony"] = {
link = true,
fallback = "dependent territory",
},
["comarca"] = {
-- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of
-- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it
-- sits between municipalities and provinces, something like a county or district.
link = true,
preposition = "ของ",
class = "subpolity",
},
["commandery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["commonwealth"] = {
link = true,
preposition = "ของ",
-- No default; applies specifically to Puerto Rico
class = "subpolity",
},
["commune"] = {
link = true,
fallback = "เทศบาล",
},
["community"] = {
link = true,
category_link = "[[community|communities]] of all sizes",
fallback = "village",
},
["community development block"] = {
-- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be
-- neighborhoods so I'm not setting `has_neighborhoods` for now
link = "w",
affix_type = "suf",
no_affix_strings = "block",
class = "subpolity",
},
["comune"] = {
-- Italy, Switzerland
link = true,
fallback = "เทศบาล",
},
["condominium"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["confederacy"] = {
link = true,
fallback = "confederation",
},
["confederation"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["constituency"] = {
-- currently we have them as political divisions of Namibia but many countries have them
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent country"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent part"] = {
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["constituent republic"] = {
-- Of Russia, Yugoslavia, etc.
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["counties and county-level cities!"] = {
-- This is used when grouping counties and county-level cities under prefecture-level cities in China.
category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
class = "subpolity",
},
["continent"] = {
link = true,
category_link = false, -- can't occur as a bare category
class = "natural feature",
default = {"Continents and continental regions"},
},
["continental region"] = {
link = "separately",
category_link = false, -- can't occur as a bare category
class = "geographic region",
fallback = "continent",
},
["continents and continental regions!"] = {
category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
class = "geographic region",
},
["council area"] = {
link = true,
-- in Scotland; similar to a county
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["ประเทศ"] = {
link = true,
class = "polity", -- do not translate the class value
["continent/*"] = {true, "ประเทศ"},
default = {true},
},
["country-like entities!"] = {
category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
class = "polity", -- do not translate the class value
},
["เทศมณฑล"] = {
link = true,
preposition = "ของ",
display_handler = county_display_handler,
class = "subpolity",
},
["county borough"] = {
link = true,
-- in Wales; similar to a county
preposition = "ของ",
affix_type = "suf",
fallback = "borough",
class = "subpolity",
},
["county seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["county town"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
fallback = "เมือง",
has_neighborhoods = true,
class = "capital",
},
["county-administered city"] = {
-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w",
fallback = "นคร",
has_neighborhoods = true,
class = "settlement",
},
["county-controlled city"] = {
-- Taiwan
link = "w",
fallback = "county-administered city",
},
["county-level city"] = {
-- PR China
link = "w",
fallback = "prefecture-level city",
},
["crater lake"] = {
link = true,
fallback = "ทะเลสาบ",
},
["creek"] = {
link = true,
fallback = "stream",
},
["Crown colony"] = {
link = "+crown colony",
fallback = "crown colony",
},
["crown colony"] = {
link = true,
fallback = "colony",
},
["Crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["cultural area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["cultural region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["delegation"] = {
-- Tunisia
link = "+w:delegations of Tunisia",
preposition = "ของ",
class = "subpolity",
},
["department"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["departmental capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dependency"] = {
link = true,
fallback = "dependent territory",
},
["dependent territory"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
former_type = "dependent territory",
bare_category_parent = "political divisions",
["country/*"] = {true},
default = {true},
},
["desert"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems"},
default = {true},
},
["deserted mediaeval village"] = {
link = "w",
fallback = "deserted medieval village",
},
["deserted medieval village"] = {
link = "w",
fallback = "ANCIENT settlement",
},
["direct-administered municipality"] = {
-- China
link = "+w:direct-administered municipalities of China",
fallback = "เทศบาล",
},
["direct-controlled municipality"] = {
-- several countries
link = "w",
fallback = "เทศบาล",
},
["distributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["อำเภอ"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to
-- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class
-- is "settlement" or "subpolity".
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
-- No default. Countries for which districts are political divisions will get entries.
},
["districts and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Portugal.
category_link = "[[district]]s and [[autonomous region]]s",
class = "subpolity",
},
["districts and autonomous territorial units!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Moldova.
category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s",
class = "subpolity",
},
["district capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["district headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["district municipality"] = {
-- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in
-- South Africa, district municipalities group local municipalities and hence won't have neighborhoods.
link = "w",
preposition = "ของ",
affix_type = "suf",
no_affix_strings = {"อำเภอ", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["division"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["division capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dome"] = {
link = true,
fallback = "ภูเขา",
},
["dormant volcano"] = {
link = true,
fallback = "volcano",
},
["duchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["emirate"] = {
link = true,
preposition = "ของ",
-- FIXME: Can be subpolities (of the United Arab Emirates).
fallback = "องค์การทางการเมือง",
},
["จักรวรรดิ"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["enclave"] = {
link = true,
preposition = "ของ",
-- Enclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["entity"] = {
-- Bosnia and Herzegovina
link = "+w:entities of Bosnia and Herzegovina",
preposition = "ของ",
class = "subpolity",
},
["escarpment"] = {
link = true,
fallback = "ภูเขา",
},
["ethnographic region"] = {
-- used in Lithuania
link = "+w:ethnographic regions of Lithuania",
fallback = "geographic and cultural area",
},
["exclave"] = {
link = true,
preposition = "ของ",
-- Exclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["external territory"] = {
link = "separately",
fallback = "dependent territory",
},
["farm"] = {
link = true,
class = "non-admin settlement",
default = {"Farms and ranches"},
},
["farms and ranches!"] = {
category_link = "[[farm]]s and [[ranch]]es",
class = "non-admin settlement",
},
["federal city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
},
["federal district"] = {
link = true,
preposition = "ของ",
-- Might have neighborhoods as federal districts are often cities (e.g. Mexico City)
has_neighborhoods = true,
class = "settlement",
},
["federal subject"] = {
-- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais,
-- autonomous okrugs and autonomous oblasts).
link = "w",
preposition = "ของ",
class = "subpolity",
},
["federal territory"] = {
link = "w",
fallback = "ดินแดน",
},
["fictional location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["First Nations reserve"] = {
-- Canada
link = "[[First Nations]] [[w:Indian reserve|reserve]]",
-- Wikipedia uses "Indian reserve"; presumably that is the legal term
fallback = "Indian reserve",
class = "subpolity",
},
["fjord"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["footpath"] = {
link = true,
fallback = "road",
},
["forest"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems", "forestry"},
default = {true},
},
["fort"] = {
link = true,
fallback = "building",
},
["fortress"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- fortresses -> fortresse, so put an entry here to ensure we singularize correctly.
plural = "fortresses",
fallback = "building",
},
["frazione"] = {
link = "w",
fallback = "hamlet",
},
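--[=[
Illustration only (not used by this module): the multi-level fallback described in the notes above the table
(`frazione` -> `hamlet` -> `village`) could be walked roughly as follows. A minimal sketch, assuming every fallback is
a plain placetype key; the name `resolve_property` is hypothetical, and qualifier-prefixed fallbacks such as
"FORMER subpolity" are not handled here.

```lua
local function resolve_property(data, placetype, prop)
	local seen = {}
	while placetype ~= nil do
		-- A placetype visited twice means the fallbacks form a loop.
		if seen[placetype] then
			error("fallback loop involving placetype '" .. placetype .. "'")
		end
		seen[placetype] = true
		local spec = data[placetype]
		if not spec then
			error("placetype '" .. placetype .. "' does not exist")
		end
		if spec[prop] ~= nil then
			-- Return the value and the placetype that supplied it.
			return spec[prop], placetype
		end
		placetype = spec.fallback
	end
	return nil
end

-- Toy data for illustration; the real table is `placetype_data`.
local sample = {
	frazione = {fallback = "hamlet"},
	hamlet = {fallback = "village"},
	village = {class = "settlement"},
}
local class, source = resolve_property(sample, "frazione", "class")
-- class is "settlement" and source is "village"
```
]=]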
["freeway"] = {
link = true,
fallback = "road",
},
["French prefecture"] = {
link = "[[w:prefectures in France|prefecture]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["geographic and cultural area"] = {
link = "+w:cultural area",
-- `generic_before_non_cities` is used when generating the category description of categories of the format
-- `Geographic and cultural areas of PLACE`. `preposition` is used when generating {{place}} description and
-- categories for any placetype that falls back to `geographic and cultural area`.
generic_before_non_cities = "ของ",
preposition = "ของ",
class = "geographic region",
bare_category_parent = "สถานที่",
["country/*"] = {true},
["constituent country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["geographic area"] = {
link = "+w:geographic region",
fallback = "geographic and cultural area",
},
["geographic region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geopolitical zone"] = {
-- Nigeria
link = true,
preposition = "ของ",
class = "subpolity",
},
["gewog"] = {
-- Bhutan
link = true,
preposition = "ของ",
class = "subpolity",
},
["ghost town"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
bare_category_parent = "former settlements",
cat_handler = city_type_cat_handler,
default = {true},
},
["glen"] = {
link = true,
fallback = "valley",
},
["governorate"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["greater administrative region"] = {
-- China (former division)
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["gromada"] = {
-- Poland (former division)
link = "w",
preposition = "ของ",
affix_type = "Pref",
class = "subpolity",
inherently_former = {"FORMER"},
},
["group of islands"] = {
link = "[[group]] of [[island]]s",
plural = "groups of islands",
plural_link = "[[group]]s of [[island]]s",
fallback = "island group",
},
["gulf"] = {
link = true,
preposition = "ของ",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["hamlet"] = {
link = true,
fallback = "village",
},
["harbor city"] = {
link = "separately",
fallback = "นคร",
},
["harbor town"] = {
link = "separately",
fallback = "เมือง",
},
["harbour city"] = {
link = "separately",
fallback = "นคร",
},
["harbour town"] = {
link = "separately",
fallback = "เมือง",
},
["headland"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["headquarters"] = {
link = "w",
fallback = "administrative centre",
},
["heath"] = {
link = true,
fallback = "moor",
},
["hemisphere"] = {
link = true,
entry_placetype_use_the = true,
fallback = "continental region",
},
["highway"] = {
link = true,
fallback = "road",
},
["hill"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["hill station"] = {
link = "w",
fallback = "เมือง",
},
["hill town"] = {
link = "w",
fallback = "เมือง",
},
["historic region"] = {
-- provided only for the link
link = "+w:historical region",
fallback = "FORMER geographic region",
},
["historical county"] = {
-- needed for historical counties of England/etc.
link = "+w:historic county",
fallback = "FORMER subpolity",
},
["historical region"] = {
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["home rule city"] = {
link = "w",
fallback = "นคร",
},
["home rule municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["hot spring"] = {
link = true,
fallback = "spring",
},
["house"] = {
link = true,
fallback = "building",
},
["housing estate"] = {
-- not the same as a housing project (i.e. public housing)
link = true,
-- not exactly the case but approximately
fallback = "neighborhood",
},
["hromada"] = {
-- Ukraine
link = "w",
disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'",
disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["inactive volcano"] = {
link = "w",
fallback = "dormant volcano",
},
["independent city"] = {
link = true,
fallback = "นคร",
},
["independent town"] = {
link = "+independent city",
fallback = "เมือง",
},
["Indian reservation"] = {
link = "w",
-- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations
-- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts
-- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is,
-- so this must still be the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["Indian reserve"] = {
link = "w",
-- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that
-- is still the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["inland sea"] = {
-- note, we also have 'inland' as a qualifier
link = true,
fallback = "ทะเล",
},
["inner city area"] = {
link = "[[inner city]] [[area]]",
fallback = "neighborhood",
},
["เกาะ"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["island country"] = {
-- FIXME: The following should map to both 'island' and 'country'.
link = "w",
fallback = "ประเทศ",
},
["island group"] = {
link = "separately",
fallback = "เกาะ",
},
["island municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["islet"] = {
link = "w",
fallback = "เกาะ",
},
["Israeli settlement"] = {
link = "w",
class = "settlement",
default = {true},
},
["judicial capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["khanate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["kibbutz"] = {
link = true,
plural = "kibbutzim",
class = "non-admin settlement",
default = {true},
},
["kingdom"] = {
link = true,
fallback = "monarchy",
},
["krai"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ทะเลสาบ"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["ธรณีสัณฐาน!"] = {
category_link = "[[ธรณีสัณฐาน]]",
bare_category_parent = "สถานที่",
addl_bare_category_parents = {"โลก"},
},
["largest city"] = {
link = "[[large]]st [[city]]",
entry_placetype_use_the = true,
fallback = "นคร",
has_neighborhoods = true,
},
["league"] = {
link = true,
fallback = "confederation",
},
["legislative capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["library"] = {
link = true,
fallback = "building",
},
["lieutenancy area"] = {
-- used in the United Kingdom; per Wikipedia:
-- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does
-- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of
-- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate
-- entities. In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on
-- those used for lieutenancy and local government between 1974 and 1996. The lieutenancy areas of Northern
-- Ireland correspond to the six counties and two former county boroughs.
link = "w",
fallback = "ceremonial county",
},
["local authority district"] = {
link = "w",
fallback = "local government district",
},
["local government area"] = {
-- Australia
link = "w",
preposition = "ของ",
class = "subpolity",
},
["local council"] = {
-- Malta; similar to municipalities
link = "+w:local councils of Malta",
preposition = "ของ",
fallback = "เทศบาล",
},
["local government district"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local government district with borough status"] = {
link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]",
plural = "local government districts with borough status",
plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local urban district"] = {
link = "w",
fallback = "unincorporated community",
},
["locality"] = {
link = "+w:locality (settlement)",
-- not necessarily true, but usually is the case
fallback = "village",
},
["London borough"] = {
link = "w",
preposition = "ของ",
affix_type = "pref",
affix = "borough",
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["macroregion"] = {
link = true,
fallback = "ภูมิภาค",
},
["man-made structures!"] = {
category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s",
bare_category_parent = "สถานที่",
},
["manor"] = {
-- FIXME: or is this more like a farm?
link = true,
fallback = "building",
},
["marginal sea"] = {
link = true,
preposition = "ของ",
fallback = "ทะเล",
},
["market city"] = {
link = "+market town",
fallback = "นคร",
},
["market town"] = {
link = true,
fallback = "เมือง",
},
["massif"] = {
link = true,
fallback = "ภูเขา",
},
["megacity"] = {
link = true,
fallback = "นคร",
},
["metro station"] = {
link = true,
class = "man-made structure",
},
["metropolitan borough"] = {
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"borough", "นคร"},
fallback = "local government district",
has_neighborhoods = true,
},
["มหานคร"] = {
-- These exist e.g. in Italy and are more like municipalities or even provinces than cities.
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"มหานคร", "นคร"},
class = "subpolity",
},
["metropolitan county"] = {
link = true,
fallback = "เทศมณฑล",
},
["metropolitan municipality"] = {
-- In South Africa, metropolitan municipalities group local municipalities and are like districts, between
-- provinces and municipalities.
-- In Turkey, metropolitan municipalities are province-level.
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"metropolitan", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["microdistrict"] = {
-- residential complex in post-Soviet states
link = true,
fallback = "neighborhood",
},
["micronations!"] = {
-- FIXME, merge with microstate
category_link = "[[micronation]]s",
bare_category_parent = "ประเทศ",
},
["microstate"] = {
link = true,
fallback = "ประเทศ",
},
["military base"] = {
link = "w",
class = "settlement", -- or "man-made structure"?
default = {true},
},
["minster town"] = {
-- England
link = "separately",
fallback = "เมือง",
},
["monarchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["moor"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"},
default = {true},
},
["moorland"] = {
link = true,
fallback = "moor",
},
["motorway"] = {
link = true,
fallback = "road",
},
["ภูเขา"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["mountain indigenous district"] = {
-- Taiwan
link = "+w:district (Taiwan)",
fallback = "อำเภอ",
},
["mountain indigenous township"] = {
-- Taiwan
link = "+w:township (Taiwan)",
fallback = "township",
},
["mountain pass"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "mountain passes",
class = "natural feature",
addl_bare_category_parents = {"ภูเขา"},
default = {true},
},
["เทือกเขา"] = {
link = true,
fallback = "ภูเขา",
},
["mountainous region"] = {
link = "separately",
fallback = "ภูมิภาค",
},
["mukim"] = {
-- Malaysia, Brunei, Indonesia, Singapore
link = true,
preposition = "ของ",
class = "subpolity",
},
["municipal district"] = {
link = "w",
-- meaning varies depending on the country; for now, assume no neighborhoods.
-- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms.
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "เทศบาล",
},
["เทศบาล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true,
class = "subpolity",
},
["municipality with city status"] = {
link = "[[municipality]] with [[w:city status|city status]]",
plural = "municipalities with city status",
plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]",
fallback = "เทศบาล",
},
["museum"] = {
link = true,
fallback = "building",
},
["mythological location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["named bridges!"] = {
category_link = "notable [[bridge]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"bridges"},
},
["named buildings!"] = {
category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"buildings"},
},
["named roads!"] = {
category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"roads"},
},
["national capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["national park"] = {
link = true,
fallback = "park",
},
["natural features!"] = {
category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s",
bare_category_parent = "สถานที่",
},
["neighborhood"] = {
-- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which
-- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the
-- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the
-- category name. It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods".
-- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also
-- categorize as neighbo(u)rhoods.)
link = true,
-- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]].
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
-- The following text is suitable for the top-level description of a neighborhood as well as categories of the
-- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form
-- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]".
category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions",
-- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`,
-- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings
-- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in
-- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY`
-- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but
-- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.)
preposition = "ของ",
class = "non-admin settlement",
cat_handler = district_neighborhood_cat_handler,
},
["neighbourhood"] = {
link = true,
category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions",
fallback = "neighborhood",
},
["new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
preposition = "ใน",
class = "subpolity", --?
},
["new town"] = {
link = true,
fallback = "เมือง",
},
["เมืองหลวงที่ไม่ใช่นคร"] = {
link = "[[เมืองหลวง]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
cat_handler = function(data)
return capital_city_cat_handler(data, "non-city")
end,
-- FIXME, do we need the following?
default = {true},
},
["non-metropolitan county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["non-metropolitan district"] = {
link = "w",
fallback = "local government district",
},
["non-sovereign kingdom"] = {
-- especially in Africa and Asia
link = "+w:non-sovereign monarchy",
generic_before_non_cities = "ใน",
class = "subpolity",
["country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["non-sovereign monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["oblasts and autonomous republics!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Ukraine.
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s",
class = "subpolity",
},
["มหาสมุทร"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"ทะเล", "bodies of water"},
default = {true},
},
["okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["overseas collectivity"] = {
link = "w",
fallback = "collectivity",
},
["overseas department"] = {
link = "w",
fallback = "department",
},
["overseas territory"] = {
link = "w",
fallback = "dependent territory",
},
["parish"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["parish municipality"] = {
-- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them.
link = "+w:parish municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true,
},
["parish seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
class = "capital",
has_neighborhoods = true,
},
["park"] = {
link = true,
class = "man-made structure",
default = {true},
},
["pass"] = {
link = "+mountain pass",
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "passes",
fallback = "mountain pass",
},
["path"] = {
link = true,
fallback = "road",
},
["peak"] = {
link = true,
fallback = "ภูเขา",
},
["peninsula"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["periphery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["สถานที่!"] = {
generic_before_non_cities = "ใน",
generic_before_cities = "ใน",
class = "generic place",
category_link = "[[place]]s of all sorts",
-- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and
-- language-specific variants such as [[Category:en:Places]]. The actual text for a language-specific variant is
-- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of"
-- portion is automatically generated by the appropriate handler in
-- [[Module:category tree/topic cat/data/Places]].
category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s",
bare_category_parent = "ชื่อ (หัวข้อ)",
},
["planned community"] = {
-- Include this so we don't categorize 'planned community' into villages, as 'community' does.
link = true,
class = "settlement",
has_neighborhoods = true,
},
["plateau"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
-- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category
},
["Polish colony"] = {
link = "[[w:colony (Poland)|colony]]",
affix_type = "suf",
affix = "colony",
fallback = "village",
has_neighborhoods = true,
},
["political divisions!"] = {
category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s",
bare_category_parent = "สถานที่",
},
["องค์การทางการเมือง"] = {
link = true,
category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]",
class = "polity", -- do not translate `class` values
bare_category_parent = "สถานที่",
default = {true},
},
["populated place"] = {
link = "+w:populated place",
-- not necessarily true, but usually is the case
fallback = "village",
},
["port"] = {
link = true,
class = "man-made structure",
default = {true},
},
["port city"] = {
-- FIXME: should categorize into "Ports" as well as "นคร"
link = true,
fallback = "นคร",
},
["port town"] = {
-- FIXME: should categorize into "Ports" as well as "เมือง"
link = "w",
fallback = "เมือง",
},
["prefecture"] = {
-- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France.
-- May need `has_neighborhoods` to be a function.
link = true,
preposition = "ของ",
display_handler = prefecture_display_handler,
class = "subpolity",
},
["prefecture-level city"] = {
-- China; they are huge entities with a central city; not cities themselves.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["preserved county"] = {
-- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more
-- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22.
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["primary area"] = {
-- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden
link = "+w:sv:primärområde",
fallback = "neighborhood",
},
["principality"] = {
link = true,
fallback = "monarchy",
},
["promontory"] = {
link = true,
fallback = "headland",
},
["protectorate"] = {
link = true,
fallback = "dependent territory",
},
["จังหวัด"] = {
link = true,
preposition = "ของ",
display_handler = province_display_handler,
class = "subpolity",
},
["provinces and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case China.
category_link = "[[province]]s and [[autonomous region]]s",
class = "subpolity",
},
["provinces and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Canada and Pakistan.
category_link = "[[province]]s and [[territory|territories]]",
class = "subpolity",
},
["provincial capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["raion"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ranch"] = {
link = true,
fallback = "farm",
},
["range"] = {
-- FIXME: Where is this used? Is it a mountain range?
link = true,
holonym_use_the = true,
class = "natural feature",
},
["regency"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["ภูมิภาค"] = {
link = true,
preposition = "ของ",
-- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area'
fallback = "geographic and cultural area",
-- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region
class = "geographic region",
},
["regional capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["regional county municipality"] = {
-- Quebec
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"เทศบาล", "เทศมณฑล"},
fallback = "เทศบาล",
},
["regional district"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "อำเภอ",
},
["regional municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
},
["regional unit"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["registration county"] = {
-- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical
-- purposes (registration of births, deaths and marriages, and for the output of census information).
link = "w",
fallback = "เทศมณฑล",
},
["republic"] = {
-- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case.
link = true,
fallback = "constituent republic",
},
["research base"] = {
link = "+w:research station",
fallback = "research station",
},
["research station"] = {
link = "w",
class = "non-admin settlement", -- or "man-made structure"?
default = {true},
},
["reservoir"] = {
link = true,
fallback = "ทะเลสาบ",
},
["residential area"] = {
link = "separately",
fallback = "neighborhood",
},
["resort city"] = {
link = "w",
fallback = "นคร",
},
["resort town"] = {
link = "w",
fallback = "เมือง",
},
["แม่น้ำ"] = {
link = true,
generic_before_non_cities = "ใน",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
cat_handler = city_type_cat_handler,
["continent/*"] = {true},
default = {true},
},
["river island"] = {
link = "w",
fallback = "เกาะ",
},
["road"] = {
link = true,
class = "man-made structure",
default = {"Named roads"},
},
["Roman province"] = {
-- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire'
link = "w",
default = {"Provinces of the Roman Empire"},
class = "subpolity",
},
["royal borough"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"royal", "borough"},
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["royal burgh"] = {
link = true,
fallback = "borough",
},
["royal capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["rural committee"] = {
-- Hong Kong; a group of villages
link = "w",
affix_type = "Suf",
has_neighborhoods = true,
class = "settlement",
},
["rural community"] = {
-- New Brunswick
link = "+w:list of municipalities in New_Brunswick#Rural communities",
fallback = "เทศบาล",
},
["rural hromada"] = {
link = "[[rural]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["rural municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["rural township"] = {
-- Taiwan
link = "+w:rural township (Taiwan)",
fallback = "township",
},
["sanctuary"] = {
link = true,
fallback = "temple",
},
["satrapy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["ทะเล"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["seaport"] = {
link = true,
fallback = "port",
},
["seat"] = {
link = true,
fallback = "administrative centre",
},
["self-administered area"] = {
-- Myanmar (groups self-administered divisions and zones)
link = "+w:self-administered zone",
preposition = "ของ",
class = "subpolity",
},
["self-administered division"] = {
-- Myanmar (only one of them: Wa Self-Administered Division)
link = "w",
fallback = "self-administered area",
},
["self-administered zone"] = {
-- Myanmar (five of them)
link = "w",
fallback = "self-administered area",
},
["separatist state"] = {
link = "separately",
fallback = "unrecognized country",
},
["การตั้งถิ่นฐาน"] = {
link = true,
category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s",
bare_category_parent = "สถานที่",
-- not necessarily true, but usually is the case
fallback = "village",
},
["settlement hromada"] = {
link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["sheading"] = {
-- Isle of Man
link = true,
fallback = "อำเภอ",
},
["sheep station"] = {
-- Australia
link = true,
fallback = "farm",
},
["shire"] = {
link = true,
fallback = "เทศมณฑล",
},
["shire county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["shire town"] = {
link = true,
fallback = "county seat",
},
["ski resort city"] = {
link = "[[ski resort]] [[city]]",
fallback = "นคร",
},
["ski resort town"] = {
link = "[[ski resort]] [[town]]",
fallback = "เมือง",
},
["spa city"] = {
link = "+w:spa town",
fallback = "นคร",
},
["spa town"] = {
link = "w",
fallback = "เมือง",
},
["space station"] = {
link = true,
fallback = "research station",
},
["special administrative region"] = {
-- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a
-- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia
-- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special
-- administrative areas"
link = "+w:special administrative regions of China",
preposition = "ของ",
class = "subpolity",
has_neighborhoods = true, --?
-- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves
-- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing
suffix = "",
},
["special collectivity"] = {
link = "w",
fallback = "collectivity",
},
["special municipality"] = {
-- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands
link = "w",
fallback = "เทศบาล",
},
["special ward"] = {
-- Tokyo
link = true,
fallback = "เทศบาล",
},
["spit"] = {
link = true,
fallback = "peninsula",
},
["spring"] = {
link = true,
class = "natural feature",
default = {true},
},
["star"] = {
link = true,
class = "natural feature",
default = {true},
},
["รัฐ"] = {
link = true,
preposition = "ของ",
class = "subpolity",
-- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign
-- entity. The latter appears more common (e.g. in various "ancient states" of East Asia).
former_type = "องค์การทางการเมือง",
},
["states and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Australia.
category_link = "[[state]]s and [[territory|territories]]",
class = "subpolity",
},
["states and union territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case India.
category_link = "[[state]]s and [[union territory|union territories]]",
class = "subpolity",
},
["state capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["state park"] = {
link = true,
fallback = "park",
},
["state-level new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
fallback = "new area",
},
["statistical region"] = {
-- Slovenia
link = true,
fallback = "administrative region",
},
["statutory city"] = {
link = "w",
fallback = "นคร",
},
["statutory town"] = {
link = "w",
fallback = "เมือง",
},
["strait"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["stream"] = {
link = true,
fallback = "แม่น้ำ",
},
["street"] = {
link = true,
fallback = "road",
},
["strip"] = {
link = true,
fallback = "geographic region",
},
["strip of land"] = {
link = "[[strip]] of [[land]]",
plural = "strips of land",
plural_link = "[[strip]]s of [[land]]",
fallback = "geographic region",
},
["sub-metropolitan city"] = {
link = "+w:List of cities in Nepal#Sub-metropolitan cities",
fallback = "นคร",
},
["sub-prefectural city"] = {
link = "w",
fallback = "subprovincial city",
},
["ตำบล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true, --?
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler
class = "subpolity",
default = {true},
},
["subdivision"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- FIXME: subdivisions can be neighborhood-like or larger; need a handler
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
},
["submerged ghost town"] = {
-- FIXME: Consider just having "submerged" as a qualifier.
link = "[[submerged]] [[ghost town]]",
fallback = "ghost town",
},
["subnational kingdom"] = {
link = "+w:subnational monarchy",
fallback = "non-sovereign kingdom",
},
["subnational monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["subprefecture"] = {
link = true,
affix_type = "suf",
preposition = "ของ",
class = "subpolity",
},
["subprovince"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["subprovincial city"] = {
link = "w",
-- China; special status given to certain prefecture-level cities
fallback = "prefecture-level city",
},
["subprovincial district"] = {
link = "w",
-- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts
preposition = "ของ",
class = "subpolity",
},
["subregion"] = {
link = true,
fallback = "geographic region",
},
["suburb"] = {
link = true,
-- The following text is suitable for the top-level description of a suburb as well as categories of the form
-- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago',
-- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]".
category_link = "[[suburb]]s of [[city|cities]]",
category_link_before_city = "[[suburb]]s",
-- See comments under "neighborhood" for the following three settings. They are used by
-- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories
-- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a
-- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.)
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
preposition = "ของ",
has_neighborhoods = true, --?
class = "non-admin settlement", --?
cat_handler = district_neighborhood_cat_handler,
},
["suburban area"] = {
link = "w",
fallback = "suburb",
},
["subway station"] = {
link = "w",
fallback = "metro station",
},
["sum"] = {
-- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia),
-- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion).
link = "+w:sum (administrative division)",
-- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler
-- which we don't want to be active (FIXME: If the display handler would be active, that's a bug).
fallback = "division",
},
["supercontinent"] = {
link = true,
fallback = "continent",
},
["tehsil"] = {
link = true,
affix_type = "suf",
no_affix_strings = {"tehsil", "tahsil"},
class = "subpolity",
},
["temple"] = {
link = true,
fallback = "building",
},
["territorial authority"] = {
link = "w",
fallback = "อำเภอ",
},
["ดินแดน"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["theme"] = {
link = "+w:theme (Byzantine district)",
preposition = "ของ",
class = "subpolity",
},
["เมือง"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["town with bystatus"] = {
-- can't use templates in links currently
link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]",
plural = "towns with bystatus",
plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]",
fallback = "เมือง",
},
["township"] = {
link = true,
has_neighborhoods = true,
class = "settlement", --?
default = {true},
},
["township municipality"] = {
-- Quebec
link = "+w:township municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["traditional county"] = {
link = true,
fallback = "เทศมณฑล",
},
["traditional region"] = {
-- FIXME: Verify this works. Same for 'historic(al) region'.
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["trail"] = {
link = true,
fallback = "road",
},
["treaty port"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"FORMER"},
},
["tributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["underground station"] = {
link = "w",
fallback = "metro station",
},
["unincorporated area"] = {
link = "w",
-- I don't know if this fallback makes sense everywhere.
fallback = "unincorporated community",
},
["unincorporated community"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
},
["unincorporated territory"] = {
link = "w",
fallback = "ดินแดน",
},
["union territory"] = {
-- India
link = true,
preposition = "ของ",
entry_placetype_indefinite_article = "a",
class = "subpolity",
},
["unitary authority"] = {
-- UK, New Zealand
link = true,
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["unitary district"] = {
link = "w",
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["united township municipality"] = {
-- Quebec
link = "+w:united township municipality (Quebec)",
entry_placetype_indefinite_article = "a",
fallback = "township municipality",
has_neighborhoods = true, --?
},
["university"] = {
link = true,
entry_placetype_indefinite_article = "a",
class = "man-made structure",
default = {true},
},
["unrecognised country"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized and nearly unrecognized countries!"] = {
category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}",
bare_category_parent = "country-like entities",
},
["unrecognized country"] = {
link = "w",
class = "polity", -- do not translate `class` values
default = {"Unrecognized and nearly unrecognized countries"},
},
["unrecognised state"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized state"] = {
link = "w",
fallback = "unrecognized country",
},
["urban area"] = {
link = "separately",
fallback = "neighborhood",
},
["urban hromada"] = {
link = "[[urban]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["urban service area"] = {
-- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger
-- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]).
link = "w",
fallback = "นคร",
},
["urban township"] = {
link = "w",
fallback = "township",
},
["urban-type settlement"] = {
-- appears to be a particular type of small urban settlement in post-Soviet states
-- that had an administrative function.
link = "w",
fallback = "เมือง",
},
["valley"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "water"},
default = {true},
},
["viceroyalty"] = {
-- in essence, a type of colony
link = true,
fallback = "dependent territory",
},
["village"] = {
link = true,
generic_before_non_cities = "ใน",
category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s",
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["village development committee"] = {
-- former administrative structure in Nepal; also exists in India but not as a formal unit
link = "+w:village development committee (Nepal)",
inherently_former = {"FORMER"},
fallback = "village",
},
["village municipality"] = {
-- Quebec
link = "+w:village municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["voivodeship"] = {
-- Poland
link = true,
display_handler = voivodeship_display_handler,
preposition = "ของ",
class = "subpolity",
},
["volcano"] = {
link = true,
plural = "volcanoes",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true, "ภูเขา"},
},
["ward"] = {
link = true,
class = "settlement",
-- Wards are formal administrative divisions of a city but have some properties of neighborhoods.
fallback = "neighborhood",
},
["watercourse"] = {
link = true,
fallback = "channel",
},
["Welsh community"] = {
-- Wales
link = "[[w:community (Wales)|community]]",
preposition = "ของ",
affix_type = "suf",
affix = "community",
has_neighborhoods = true,
class = "settlement",
},
["zone"] = {
-- administrative division of Ethiopia, Qatar, Nepal, India
link = "+w:zone#Place names",
preposition = "ของ",
class = "subpolity",
},
----------------------------------------------------------------------------------------------
-- Categories for former places --
----------------------------------------------------------------------------------------------
["ANCIENT capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
-- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still
-- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category.
default = {"Ancient settlements", "Former capitals"},
},
["ANCIENT non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "ANCIENT settlement",
},
["ANCIENT settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Ancient settlements"},
},
["ancient settlements!"] = {
category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]",
bare_category_parent = "former settlements",
},
["FORMER capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
default = {"Former capitals"},
},
["former capitals!"] = {
category_link = "former [[capital]] [[city|cities]] and [[town]]s",
bare_category_parent = "การตั้งถิ่นฐาน",
},
["former counties and county-level cities!"] = {
-- For categorizing former counties and county-level cities of China
category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level cities]]",
bare_category_breadcrumb = "counties and county-level cities",
bare_category_parent = "former political divisions",
},
["FORMER county"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER county-level city"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["former countries and country-like entities!"] = {
category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist",
bare_category_breadcrumb = "countries and country-like entities",
bare_category_parent = "former polities",
},
["FORMER country"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former countries and country-like entities"},
},
["former dependent territories!"] = {
category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist",
bare_category_breadcrumb = "dependent territories",
bare_category_parent = "former political divisions",
},
["FORMER dependent territory"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former dependent territories"},
},
["former districts!"] = {
-- For categorizing former districts of China
category_link = "no-longer-existing [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "former political divisions",
},
["FORMER district"] = {
-- For categorizing former districts of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER geographic region"] = {
link = false,
fallback = "geographic and cultural area",
},
["FORMER man-made structure"] = {
link = false,
class = "man-made structure",
default = {"Former man-made structures"},
},
["former man-made structures!"] = {
category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist",
bare_category_breadcrumb = "man-made structures",
bare_category_parent = "former places",
},
["former municipalities!"] = {
-- For categorizing former municipalities of the Netherlands
category_link = "no-longer-existing [[municipality|municipalities]]",
bare_category_breadcrumb = "เทศบาล",
bare_category_parent = "former political divisions",
},
["FORMER municipality"] = {
-- For categorizing former municipalities of the Netherlands
link = false,
fallback = "FORMER subpolity",
},
["FORMER natural feature"] = {
link = false,
class = "natural feature",
default = {"Former natural features"},
},
["former natural features!"] = {
category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist",
bare_category_breadcrumb = "natural features",
bare_category_parent = "former places",
},
["FORMER non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "FORMER settlement",
},
["former places!"] = {
category_link = "[[place]]s of all sorts that no longer exist",
bare_category_breadcrumb = "former",
bare_category_parent = "สถานที่",
},
["former political divisions!"] = {
category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former places",
},
["former polities!"] = {
category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former places",
},
["FORMER polity"] = {
link = false,
class = "polity", -- do not translate class
default = {"Former polities"},
},
["former prefectures!"] = {
-- For categorizing former prefectures of China
category_link = "no-longer-existing [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "former political divisions",
},
["FORMER prefecture"] = {
-- For categorizing former prefectures of China
link = false,
fallback = "FORMER subpolity",
},
["former provinces!"] = {
-- For categorizing former provinces of China, etc.
category_link = "no-longer-existing [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "former political divisions",
},
["FORMER province"] = {
-- For categorizing ancient/historical/former provinces of the Roman Empire
link = false,
fallback = "FORMER subpolity",
},
["former region"] = {
-- A former region is considered a former political division, but not a 'historical/traditional/etc.' region.
link = "separately",
preposition = "ของ",
inherently_former = {"FORMER"},
class = "subpolity",
},
["FORMER settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Former settlements"},
},
["former settlements!"] = {
category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former political divisions",
},
["FORMER subpolity"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former political divisions"},
},
----------------------------------------------------------------------------------------------
-- form-of categories --
----------------------------------------------------------------------------------------------
---------- Abbreviations ----------
["abbreviations of counties!"] = {
-- For categorizing abbreviations of counties of e.g. England
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]",
bare_category_breadcrumb = "เทศมณฑล",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of places",
},
["abbreviations of departments!"] = {
-- For categorizing abbreviations of departments of e.g. France
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s",
bare_category_breadcrumb = "departments",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of districts!"] = {
-- For categorizing abbreviations of districts of e.g. ???
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of divisions!"] = {
-- For categorizing abbreviations of divisions of e.g. Bangladesh
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s",
bare_category_breadcrumb = "divisions",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of former countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of former places",
},
["abbreviations of former places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}},
},
["abbreviations of places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "สถานที่",
},
["abbreviations of political divisions!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "abbreviations of places",
},
["abbreviations of prefectures!"] = {
-- For categorizing abbreviations of prefectures of e.g. Japan
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces!"] = {
-- For categorizing abbreviations of provinces of e.g. Canada
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]",
bare_category_breadcrumb = "provinces and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of regions!"] = {
-- For categorizing abbreviations of regions of e.g. Italy
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s",
bare_category_breadcrumb = "ภูมิภาค",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states!"] = {
-- For categorizing abbreviations of states of e.g. the United States
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]",
bare_category_breadcrumb = "states and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and union territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]",
bare_category_breadcrumb = "states and union territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]",
bare_category_breadcrumb = "ดินแดน",
bare_category_parent = "abbreviations of political divisions",
},
["ABBREVIATION_OF country"] = {
link = false,
default = {"Abbreviations of countries"},
},
["ABBREVIATION_OF county"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF department"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF district"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF division"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF FORMER country"] = {
link = false,
default = {"Abbreviations of former countries"},
},
["ABBREVIATION_OF FORMER place"] = {
link = false,
default = {"Abbreviations of former places"},
},
["ABBREVIATION_OF place"] = {
link = false,
default = {"Abbreviations of places"},
},
["ABBREVIATION_OF prefecture"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF province"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF region"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF state"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF subpolity"] = {
link = false,
default = {"Abbreviations of political divisions"},
},
["ABBREVIATION_OF territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF union territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
---------- Archaic forms ----------
["archaic forms of places!"] = {
full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "archaic forms",
bare_category_parent = "สถานที่",
},
["ARCHAIC_FORM_OF place"] = {
link = false,
default = {"Archaic forms of places"},
},
---------- Clippings ----------
["clippings of places!"] = {
full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "clippings",
bare_category_parent = "สถานที่",
},
["CLIPPING_OF place"] = {
link = false,
default = {"Clippings of places"},
},
---------- Dated forms ----------
["dated forms of places!"] = {
full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "dated forms",
bare_category_parent = "สถานที่",
},
["DATED_FORM_OF place"] = {
link = false,
default = {"Dated forms of places"},
},
---------- Derogatory names ----------
["derogatory names for cities!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["derogatory names for continents!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for continents"},
},
["derogatory names for countries!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for countries"},
},
["derogatory names for places!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s",
bare_category_breadcrumb = "derogatory names",
bare_category_parent = "nicknames for places",
},
["derogatory names for states!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for states"},
},
["DEROGATORY_NAME_FOR capital"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR city"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR continent"] = {
link = false,
default = {"Derogatory names for continents"},
},
["DEROGATORY_NAME_FOR country"] = {
link = false,
default = {"Derogatory names for countries"},
},
["DEROGATORY_NAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR place"] = {
link = false,
default = {"Derogatory names for places"},
},
["DEROGATORY_NAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR state"] = {
link = false,
default = {"Derogatory names for states"},
},
["DEROGATORY_NAME_FOR town"] = {
link = false,
default = {"Derogatory names for cities"},
},
---------- Ellipses ----------
["ellipses of places!"] = {
full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s",
bare_category_breadcrumb = "ellipses",
bare_category_parent = "สถานที่",
},
["ELLIPSIS_OF place"] = {
link = false,
default = {"Ellipses of places"},
},
---------- Former long-form names ----------
["former long-form names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former long-form names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}},
},
["former long-form names of places!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form",
bare_category_parent = "former names of places",
},
["FORMER_LONG_FORM_OF country"] = {
link = false,
default = {"Former long-form names of countries"},
},
["FORMER_LONG_FORM_OF place"] = {
link = false,
default = {"Former long-form names of places"},
},
---------- Former names ----------
["former names of capitals!"] = {
full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name",
bare_category_breadcrumb = "เมืองหลวง",
bare_category_parent = "former names of settlements",
},
["former names of countries!"] = {
full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former names of places",
},
["former names of places!"] = {
full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name",
bare_category_breadcrumb = "former names",
bare_category_parent = "สถานที่",
},
["former names of political divisions!"] = {
full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) that generally still exist but under a different name",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former names of places",
},
["former names of polities!"] = {
full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former names of places",
},
["former names of settlements!"] = {
full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former names of political divisions",
},
["FORMER_NAME_OF capital"] = {
link = false,
default = {"Former names of capitals"},
},
["FORMER_NAME_OF country"] = {
link = false,
default = {"Former names of countries"},
},
["FORMER_NAME_OF place"] = {
link = false,
default = {"Former names of places"},
},
["FORMER_NAME_OF polity"] = {
link = false,
default = {"Former names of polities"},
},
["FORMER_NAME_OF region"] = {
link = false,
fallback = "FORMER_NAME_OF subpolity",
},
["FORMER_NAME_OF settlement"] = {
link = false,
default = {"Former names of settlements"},
},
["FORMER_NAME_OF subpolity"] = {
link = false,
default = {"Former names of political divisions"},
},
---------- Former nicknames ----------
["former nicknames for cities!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})",
bare_category_breadcrumb = "นคร",
bare_category_parent = "former nicknames for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["former nicknames for places!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "former",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}},
},
["FORMER_NICKNAME_FOR capital"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR city"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR place"] = {
link = false,
default = {"Former nicknames for places"},
},
["FORMER_NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR town"] = {
link = false,
default = {"Former nicknames for cities"},
},
---------- Former official names ----------
["former official names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former official names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "official"}},
},
["former official names of places!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "former names of places",
},
["FORMER_OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Former official names of countries"},
},
["FORMER_OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Former official names of places"},
},
---------- Long-form names ----------
["long-form names of countries!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "long-form names of places",
},
["long-form names of places!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form names",
bare_category_parent = "สถานที่",
},
["LONG_FORM_OF country"] = {
link = false,
default = {"Long-form names of countries"},
},
["LONG_FORM_OF place"] = {
link = false,
default = {"Long-form names of places"},
},
---------- Nicknames ----------
["nicknames for cities!"] = {
full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"นคร"},
},
["nicknames for continents!"] = {
full_category_link = "[[nickname]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ทวีป"},
},
["nicknames for countries!"] = {
full_category_link = "[[nickname]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ประเทศ"},
},
["nicknames for places!"] = {
full_category_link = "[[nickname]]s for [[place]]s",
bare_category_breadcrumb = "สถานที่",
bare_category_parent = "nicknames",
addl_bare_category_parents = {"สถานที่"},
},
["nicknames for states!"] = {
-- For categorizing nicknames for states of e.g. the United States
full_category_link = "[[nickname]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"รัฐ"},
},
["NICKNAME_FOR capital"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR city"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR continent"] = {
link = false,
default = {"Nicknames for continents"},
},
["NICKNAME_FOR country"] = {
link = false,
default = {"Nicknames for countries"},
},
["NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "นคร"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR place"] = {
link = false,
default = {"Nicknames for places"},
},
["NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR state"] = {
link = false,
default = {"Nicknames for states"},
},
["NICKNAME_FOR town"] = {
link = false,
default = {"Nicknames for cities"},
},
---------- Obsolete forms ----------
["obsolete forms of places!"] = {
full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "obsolete forms",
bare_category_parent = "สถานที่",
},
["OBSOLETE_FORM_OF place"] = {
link = false,
default = {"Obsolete forms of places"},
},
---------- Official names ----------
["official names of countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of places",
},
["official names of former countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of former places",
},
["official names of former places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "official names",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "official names of places", sort = "former"}},
},
["official names of places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official names",
bare_category_parent = "สถานที่",
},
["OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Official names of countries"},
},
["OFFICIAL_NAME_OF FORMER country"] = {
link = false,
default = {"Official names of former countries"},
},
["OFFICIAL_NAME_OF FORMER place"] = {
link = false,
default = {"Official names of former places"},
},
["OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Official names of places"},
},
---------- Official nicknames ----------
["official nicknames for places!"] = {
full_category_link = "[[official]] [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for places",
},
["official nicknames for states!"] = {
-- For categorizing official nicknames for states of e.g. the United States
full_category_link = "[[official]] [[nickname]]s for [[state]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for states",
addl_bare_category_parents = {"รัฐ"},
},
["OFFICIAL_NICKNAME_FOR place"] = {
link = false,
default = {"Official nicknames for places"},
},
["OFFICIAL_NICKNAME_FOR state"] = {
link = false,
default = {"Official nicknames for states"},
},
}
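-- Illustrative sketch only: many entries above carry no `class` or `default` of their own and instead name a
-- `fallback` placetype. One simple way to resolve a property through such a chain is shown below.
-- `example_resolve` is a hypothetical helper, not the module's actual lookup (that is `get_placetype_equivs`,
-- which also handles qualifiers and form-of directives); the `seen` table guards against accidental cycles.
local function example_resolve(data, placetype, prop)
	local seen = {}
	while placetype and not seen[placetype] do
		seen[placetype] = true
		local spec = data[placetype]
		if not spec then
			return nil
		end
		-- Compare against nil so that an explicit `false` value (e.g. `link = false`) counts as found.
		if spec[prop] ~= nil then
			return spec[prop], placetype
		end
		placetype = spec.fallback
	end
	return nil
end
-- Usage: with the entries above, example_resolve(export.placetype_data, "FORMER county", "class") would follow
-- "FORMER county" -> "FORMER subpolity" and return "subpolity", "FORMER subpolity".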
export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
if spec.plural then
export.plural_placetype_to_singular[spec.plural] = sg_placetype
end
end
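-- Illustrative sketch only: the loop above inverts the `plural` fields of `placetype_data` into a
-- plural -> singular lookup. The same inversion on a small hypothetical table might look like this
-- (`example_build_plural_map` is a made-up name used only for this example and is not called by the module):
local function example_build_plural_map(data)
	local plural_to_singular = {}
	for sg_placetype, spec in pairs(data) do
		-- Only entries with an explicit plural are recorded; all others are left to the regular algorithm.
		if spec.plural then
			plural_to_singular[spec.plural] = sg_placetype
		end
	end
	return plural_to_singular
end
-- Usage: example_build_plural_map({volcano = {plural = "volcanoes"}, valley = {}}) yields a table whose only
-- key is "volcanoes", mapped to "volcano".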
return export
0xvw51pbnx3kitsu9mw03hxjjrzw4y6
5720709
5720700
2026-04-21T02:03:14Z
OctraBot
3198
5720709
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true for testing
local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split
--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must
currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising"
operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
containers of the known location being considered. This is done to prevent overcategorizing when either there are two
known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).
Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a
result are candidates for memoization to speed up the operation of {{tl|place}}.
]==]
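-- Illustrative sketch of the memoization idea mentioned in the intro above. Nothing here is used by the module.
-- `example_memoize` is a hypothetical helper; it assumes a pure function of a single string argument that returns
-- a single non-nil value. The real functions take several arguments and would need a composite cache key.
local function example_memoize(fn)
	local cache = {}
	return function(arg)
		local cached = cache[arg]
		if cached == nil then
			-- First call with this argument: compute once and remember the result.
			cached = fn(arg)
			cache[arg] = cached
		end
		return cached
	end
end
-- Usage: local fast = example_memoize(slow_fn); repeated calls to fast("x") invoke slow_fn("x") only once.
-- A nil result is not cached, so such calls are simply recomputed.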
------------------------------------------------------------------------------------------
-- Basic utilities --
------------------------------------------------------------------------------------------
--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
return export.force_cat or m_locations.force_cat
end
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
function export.remove_links_and_html(text)
text = m_links.remove_links(text)
return (text:gsub("<.->", "")) -- parentheses drop gsub's second return value (the substitution count)
end
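-- Illustrative sketch: the Lua pattern "<.->" used above matches the shortest run from "<" to the next ">", i.e. one
-- tag at a time. `example_strip_tags` is a hypothetical stand-in that skips the link-removal step. Note that
-- string.gsub returns two values (the new string and the substitution count); the parentheses keep only the string.
local function example_strip_tags(text)
	return (text:gsub("<.->", ""))
end
-- Usage: example_strip_tags("<b>Fort</b> McMurray") returns "Fort McMurray".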
--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the intended fallback is the regular
singularization algorithm in [[Module:en-utilities]], which reverses the y -> ies change after consonants and the 'es'
addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural
'passes'; FIXME: consider changing this for words ending in '-sses'). NOTE: That call is currently commented out in the
code below, so the fallback leaves the placetype unchanged. If the generated singular is the same as the passed-in
value, nil is returned.
]==]
function export.maybe_singularize_placetype(placetype)
if not placetype then
return nil
end
if export.plural_placetype_to_singular[placetype] then
return export.plural_placetype_to_singular[placetype]
end
local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
if retval == placetype then
return nil
end
return retval
end
-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct.
function export.pluralize_placetype(placetype, do_ucfirst)
local ptdata = export.placetype_data[placetype]
if ptdata and ptdata.plural then
placetype = ptdata.plural
else
placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
end
if do_ucfirst then
return ucfirst(placetype)
else
return placetype
end
end
--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype
under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not
match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match
that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same
as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed
from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version
of the plural passed-in `placetype`.
]==]
function export.get_placetype_data(placetype, from_category)
local ptdata = export.placetype_data[placetype]
if ptdata then
return placetype, ptdata, "direct"
end
if from_category then
ptdata = export.placetype_data[placetype .. "!"]
if ptdata then
return placetype .. "!", ptdata, "direct-category"
end
end
local sg_placetype = export.maybe_singularize_placetype(placetype)
if sg_placetype then
ptdata = export.placetype_data[sg_placetype]
if ptdata then
return sg_placetype, ptdata, "plural"
end
end
return nil
end
--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end
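--[==[
Resolve a placetype alias (e.g. `adr` for `administrative region`) to its canonical placetype by looking it up in
`placetype_aliases`. Placetypes that are not aliases are returned unchanged.
]==]
function export.resolve_placetype_aliases(placetype)
return export.placetype_aliases[placetype] or placetype
end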
--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
-- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in
-- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice.
placetype = export.resolve_placetype_aliases(placetype)
if export.placetype_data[placetype] then
return export.placetype_data[placetype][key]
else
return nil
end
end
--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from
left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization
of qualifiers does not happen if `no_canon_qualifiers` is specified.
For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
{nil, nil, "small beachside unincorporated community"},
{nil, "small", "beachside unincorporated community"},
{"small", "[[beachfront]]", "unincorporated community"},
{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
On the other hand, if given `"small former haunted community"`, the return value will be
{ {
{nil, nil, "small former haunted community"},
{nil, "small", "former haunted community"},
{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.
Finally, if given `"former adr"`, the return value will be
{ {
{nil, nil, "former adr"},
{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
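The splitting loop described above can be sketched in isolation as follows. The tables `qualifiers` and `aliases` are
hypothetical miniature stand-ins for `placetype_qualifiers` and `placetype_aliases`, and the function names are
assumptions made for this example (the `no_canon_qualifiers` option and table-valued entries are left out):
```lua
-- Hypothetical qualifier table: `true` means "link the qualifier as-is", a string is the
-- canonical replacement, and `false` means "recognized, but leave unchanged".
local qualifiers = {
	small = false,
	former = false,
	beachside = "[[beachfront]]",
	unincorporated = true,
}
-- Hypothetical alias table.
local aliases = { adr = "administrative region" }

local function resolve(placetype)
	return aliases[placetype] or placetype
end

-- Return a list of {prev_qualifiers, this_qualifier, reduced_placetype} splits.
local function split_qualifiers(placetype)
	local splits = { { nil, nil, resolve(placetype) } }
	local prev = nil
	while true do
		local qualifier, rest = placetype:match("^(.-) (.*)$")
		local canon = qualifier and qualifiers[qualifier]
		if canon == nil then
			break -- no space left, or the leading word is not a recognized qualifier
		end
		local new_qualifier = qualifier
		if canon == true then
			new_qualifier = "[[" .. qualifier .. "]]"
		elseif canon then
			new_qualifier = canon
		end
		table.insert(splits, { prev, new_qualifier, resolve(rest) })
		prev = prev and (prev .. " " .. new_qualifier) or new_qualifier
		placetype = rest
	end
	return splits
end

for _, split in ipairs(split_qualifiers("small beachside unincorporated community")) do
	print(tostring(split[1]), tostring(split[2]), split[3])
end
```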
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
local prev_qualifier = nil
while true do
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if canon == nil then
break
end
local new_qualifier = qualifier
if type(canon) == "table" then
canon = canon.link
end
if not no_canon_qualifiers and canon ~= false then
if canon == true then
new_qualifier = "[[" .. qualifier .. "]]"
else
new_qualifier = canon
end
end
insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
placetype = reduced_placetype
else
break
end
end
return splits
end
--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the
placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list
of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e. we successively chop off
qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words
not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used
to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is
an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first
entry if it exists in `placetype_data`.
'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c)
"type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of
directives, which act somewhat similarly to `former`, and allows interaction between more than one of these
simultaneously (e.g. official names of former places, which have their own categorization).
If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be
getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of
`iterate_matching_holonym_location()`.
For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="river"}
```
and the return value will be
{ {
{qualifier = "left", placetype="tributary"},
{qualifier = "left", placetype="river"},
}}
The algorithm first checks whether `left tributary` itself is a recognized placetype in `placetype_data`; it is not,
so it is not entered into the returned list (if it were found, the algorithm
would add it as well as any fallbacks directly after it). It then splits off the recognized qualifier `left` to form the
''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because
it has a fallback `river`, which exists in `placetype_data`, the fallback is entered next.
Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or
municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
{qualifier = "small [[rural]]", placetype="frazione"},
{qualifier = "small [[rural]]", placetype="hamlet"},
{qualifier = "small [[rural]]", placetype="village"},
}}
Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers,
they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around
`rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both
fallbacks end up being returned.
`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the
placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are
returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example,
`"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When
`no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"territory"`
with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under
[[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].)
As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`,
because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes
rarely occur with exact match category specs anyway.
`no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an
equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in
[[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't
have qualifiers and so it doesn't make sense to try and look for them.
`from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked.
`form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked
placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the
appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a
placetype, e.g. `FORMER_NAME_OF place`, if nothing else matches.
`no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`.
`register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g.
known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It
causes the non-former version of the specified placetype to be included in the returned equivalents along with the
former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now;
fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.]
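The fallback handling described above (`frazione` -> `hamlet` -> `village`) can be sketched in isolation as follows.
The data table, the function name and the bound `MAX_CHAIN` are assumptions made for this example, not the module's
actual data; qualifiers, form-of directives and `former`-type handling are left out entirely:
```lua
-- Hypothetical placetype data; only the `fallback` field matters here.
local placetype_data = {
	frazione = { fallback = "hamlet" },
	hamlet = { fallback = "village" },
	village = {},
}

-- Illustrative bound on the chain length, used to detect fallback loops.
local MAX_CHAIN = 10

-- Return `placetype` followed by its chain of fallbacks, or an empty list if unrecognized.
local function collect_with_fallbacks(placetype)
	local chain = {}
	if not placetype_data[placetype] then
		return chain -- unrecognized placetypes contribute nothing
	end
	local current = placetype
	while current do
		local data = placetype_data[current]
		if not data then
			error(("fallback %q is not in placetype_data"):format(current))
		end
		table.insert(chain, current)
		if #chain > MAX_CHAIN then
			error("apparent loop in fallback chain: " .. table.concat(chain, " -> "))
		end
		current = data.fallback
	end
	return chain
end

print(table.concat(collect_with_fallbacks("frazione"), " -> ")) -- frazione -> hamlet -> village
```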
]==]
function export.get_placetype_equivs(placetype, props)
local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former
local form_of_directive
if props then
no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former =
props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category,
props.register_former_as_non_former
form_of_directive = props.form_of_directive
end
local equivs = {}
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is
-- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If
-- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by
-- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version
-- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as
-- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into
-- `equivs` the prefixed placetype. This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}},
-- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up
-- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]].
local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix)
local function insert_equiv(pt)
if form_of_prefix then
-- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have
-- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end
-- up processing because `island country` falls back to `country`), and that entry in turn is defined
-- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of
-- handling this is by calling ourselves recursively.
insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt)
else
insert(equivs, {qualifier=qualifier, placetype=pt})
end
end
-- Insert the placetype, along with any fallbacks.
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if ptdata then
insert_equiv(canon_placetype)
if no_fallback then
return
end
local first_placetype = #equivs + 1
local prev_placetype = nil
while true do
local pt_value = export.placetype_data[canon_placetype]
if not pt_value then
internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
canon_placetype, prev_placetype)
end
if pt_value.fallback then
insert_equiv(pt_value.fallback)
local last_placetype = #equivs
if last_placetype - first_placetype >= 10 then
local fallback_loop = {}
for i = first_placetype, last_placetype do
insert(fallback_loop, equivs[i].placetype)
end
internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
end
prev_placetype = canon_placetype
canon_placetype = pt_value.fallback
else
break
end
end
end
end
-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
-- than in each invocation of `process_and_insert_placetype()`.
local function process_and_insert_placetype(qualifier, reduced_placetype)
if form_of_directive then
-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. All of this is handled by
-- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of
-- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for
-- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.)
insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive)
if not no_fallback then
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype)
local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or
export.get_placetype_prop(pt, "class") end
)
if not directive_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " ..
'located but directive-specific type property %s missing, and so is "class"; ' ..
"placetypes searched are %s", reduced_placetype, form_of_directive,
form_of_directive .. "_type", reduced_placetype_equivs)
else
-- This should be allowed, as we allow unrecognized placetypes in general.
end
elseif directive_type ~= "!" then
insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive)
end
end
else
insert_placetype_and_fallbacks(qualifier, reduced_placetype)
end
end
-- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left
-- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers).
local splits
if no_split_qualifiers then
splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
else
splits = export.split_qualifiers_from_placetype(placetype)
end
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3)
-- If a special "former" qualifier like `former` or `historical` isn't present, and
-- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for
-- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing
-- placetypes, and handle accordingly.
local unlinked_this_qualifier
if this_qualifier and this_qualifier:find("%[") then
unlinked_this_qualifier = export.remove_links_and_html(this_qualifier)
else
unlinked_this_qualifier = this_qualifier
end
local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil
if not former_qualifiers and not no_check_for_inherently_former then
former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype,
function(pt) return export.get_placetype_prop(pt, "inherently_former") end,
{no_check_for_inherently_former = true})
end
-- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal
-- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
-- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped
-- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval`
-- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes
-- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. If the placetype following the
-- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a
-- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like
-- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and
-- don't enter anything into `equivs`.
if former_qualifiers then
-- FIXME: Should we respect `no_fallback` here? My instinct says no.
local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, {
no_check_for_inherently_former = true
})
local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.get_placetype_prop(pt, "former_type") or
export.get_placetype_prop(pt, "class") end
)
if not former_type then
local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs,
function(pt) return export.placetype_data[pt] end
)
if pt_data then
internal_error("For placetype %s, placetype data located but `former_type` missing; " ..
"placetypes searched are %s", reduced_placetype, reduced_placetype_equivs)
else
-- Enable error when we've verified there aren't any examples.
track("bad-former-placetype")
track("bad-former-placetype/" .. reduced_placetype)
--process_error("For placetype '%s', unrecognized placetype following 'former'-type " ..
-- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs))
end
elseif former_type ~= "!" then
-- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible
-- for (e.g.) former provinces of the Roman empire to be categorized specially.
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype)
end
for _, former_qualifier in ipairs(former_qualifiers) do
process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. former_type)
end
-- HACK! See explanation above for `register_former_as_non_former`.
if register_former_as_non_former then
process_and_insert_placetype(prev_qualifier, reduced_placetype)
end
-- If we're processing a form-of directive, after doing everything else we do
-- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup.
if form_of_directive and not no_fallback then
for _, former_qualifier in ipairs(former_qualifiers) do
insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier ..
" place")
end
end
-- Don't continue processing equivs. The reason is probably the same as the `break` below for
-- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and
-- non-former equivs will otherwise take precedence.
break
end
end
-- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs
-- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping.
if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then
insert(equivs, {
qualifier=prev_qualifier,
placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier]
})
-- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the
-- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the
-- latter ends up generating the category because the category for 'mythological location' is set as
-- the default value, which is used only when no non-default category can be found.
break
end
-- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined
-- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype.
-- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts
-- the full placetype into `equivs`.
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
process_and_insert_placetype(qualifier, reduced_placetype)
-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
if no_fallback then
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
if canon_placetype then
break
end
end
end
-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
-- [[Category:en:Former names of places]] in an invocation like
-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
end
return equivs
end
function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
for _, equiv in ipairs(equivs) do
local retval = fun(equiv.placetype)
if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
return retval, equiv
end
end
return nil, nil
end
--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a
non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the
equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or
non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil},
the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value.
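The difference between the two stopping rules can be seen in a small standalone sketch. The function name
`first_prop` and the property table `use_the` are hypothetical; the point is only that an explicit `false` is skipped
by default but returned when `continue_on_nil_only` is set:
```lua
-- Call `fun` on each equivalent placetype in turn and return the first acceptable value
-- together with the equiv that produced it.
local function first_prop(equivs, fun, continue_on_nil_only)
	for _, equiv in ipairs(equivs) do
		local retval = fun(equiv.placetype)
		if (continue_on_nil_only and retval ~= nil) or (not continue_on_nil_only and retval) then
			return retval, equiv
		end
	end
	return nil, nil
end

-- Hypothetical property table in which `false` is a meaningful, explicit value.
local use_the = { ["overseas territory"] = false, territory = true }
local equivs = { { placetype = "overseas territory" }, { placetype = "territory" } }
local function lookup(pt)
	return use_the[pt]
end

local v1, e1 = first_prop(equivs, lookup)       -- skips the explicit false
local v2, e2 = first_prop(equivs, lookup, true) -- stops at the explicit false
print(v1, e1.placetype) -- true	territory
print(v2, e2.placetype) -- false	overseas territory
```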
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
if not placetype then
return fun(nil), nil
end
return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
props and props.continue_on_nil_only)
end
--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
`"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
the beginning) has an entry in `placetype_data` that specifies the indefinite article using
`entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, we use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for words beginning with
a vowel and `"a"` otherwise.
If `ucfirst` is true, the first letter of the article is made upper-case.
]==]
function export.get_placetype_article(placetype, ucfirst)
local art
local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
if qualifier then
local canon = export.placetype_qualifiers[qualifier]
if type(canon) == "table" then
art = canon.article
end
end
if art == false then
return art
end
if art == nil then
local placetype_use_the = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end)
if placetype_use_the then
art = "the"
else
art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article")
if not art then
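-- NOTE: The call into [[Module:en-utilities]] is commented out, so no "a"/"an" is generated here and the
-- article defaults to the empty string.
art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] ""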
end
end
end
if ucfirst then
art = m_strutils.ucfirst(art)
end
return art
end
--[==[
Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories
(e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified.
]==]
function export.get_placetype_entry_preposition(placetype)
local pt_prep = export.get_equiv_placetype_prop(placetype,
function(pt) return export.get_placetype_prop(pt, "preposition") end
)
return pt_prep or "ใน"
end
--[==[
Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's
`holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding
to the holonym "c/Italy", a key "country" with the list value {"Italy"} will be added to the place desc's
`holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the
end of the value's list.
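A minimal sketch of the keying step alone, assuming the equivalent placetypes have already been computed and are
passed in as a plain list (the function name `key_holonym` and its signature are hypothetical):
```lua
-- Append `placename` to the list stored under each of `placetypes` in
-- `place_desc.holonyms_by_placetype`, creating the tables on first use.
local function key_holonym(place_desc, placetypes, placename)
	place_desc.holonyms_by_placetype = place_desc.holonyms_by_placetype or {}
	local by_type = place_desc.holonyms_by_placetype
	for _, placetype in ipairs(placetypes) do
		by_type[placetype] = by_type[placetype] or {}
		table.insert(by_type[placetype], placename)
	end
end

local place_desc = {}
key_holonym(place_desc, { "country" }, "Italy")
key_holonym(place_desc, { "country" }, "France")
print(table.concat(place_desc.holonyms_by_placetype.country, ", ")) -- Italy, France
```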
]==]
function export.key_holonym_into_place_desc(place_desc, holonym)
if not holonym.placetype then
return
end
-- Key in equivalent placetypes, so that e.g. `cities/San Francisco` gets keyed under `city`; but don't do
-- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms
-- of different types just because they have the same fallback.
local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true})
local unlinked_placename = holonym.unlinked_placename
for _, equiv in ipairs(equiv_placetypes) do
local placetype = equiv.placetype
if not place_desc.holonyms_by_placetype then
place_desc.holonyms_by_placetype = {}
end
if not place_desc.holonyms_by_placetype[placetype] then
place_desc.holonyms_by_placetype[placetype] = {unlinked_placename}
else
insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename)
end
end
end
--[=[
Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the
placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This
will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to
whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype
data structure for the placetype, and `from_category` indicates that we are generating the description of a category
(otherwise we are generating the display form of an entry placetype).
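One of the link specs handled by the function below is `"separately"`, which links each word on its own. A standalone
sketch of that case, using a hypothetical helper name and plain string matching in place of the module's `split`
utility:
```lua
-- Link each word of `sg_placetype` separately; if `orig_placetype` (the plural form) is
-- given, display its words while still linking to the singular words.
local function link_separately(sg_placetype, orig_placetype)
	if not orig_placetype then
		return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
	end
	local sg_words, orig_words = {}, {}
	for word in sg_placetype:gmatch("[^ ]+") do
		table.insert(sg_words, word)
	end
	for word in orig_placetype:gmatch("[^ ]+") do
		table.insert(orig_words, word)
	end
	if #sg_words ~= #orig_words then
		error("singular and plural placetypes have different numbers of words")
	end
	local parts = {}
	for i = 1, #sg_words do
		local sg, orig = sg_words[i], orig_words[i]
		if sg == orig then
			parts[i] = "[[" .. sg .. "]]"
		else
			parts[i] = "[[" .. sg .. "|" .. orig .. "]]"
		end
	end
	return table.concat(parts, " ")
end

print(link_separately("unincorporated community"))
print(link_separately("unincorporated community", "unincorporated communities"))
```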
]=]
local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror)
if not from_category and ptdata.disallow_in_entries then
if noerror then
return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]"
else
process_error("Placetype %s is not meant to be specified directly: " .. ptdata.disallow_in_entries, sg_placetype)
end
end
if link == nil then
internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype)
elseif link == true then
if orig_placetype then
return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
else
return ("[[%s]]"):format(sg_placetype)
end
elseif link == false then
process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype)
elseif link == "w" then
return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
elseif link == "separately" then
if orig_placetype then
local sg_words = split(sg_placetype, " ")
local orig_words = split(orig_placetype, " ")
if #sg_words ~= #orig_words then
internal_error("Can't construct 'separately' link for plural placetype %s as singular placetype %s " ..
"has a different number of words", orig_placetype, sg_placetype)
else
for i = 1, #sg_words do
if sg_words[i] == orig_words[i] then
sg_words[i] = ("[[%s]]"):format(sg_words[i])
else
sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
end
end
return concat(sg_words, " ")
end
else
return (sg_placetype:gsub("([^ ]+)", "[[%1]]"))
end
elseif link:find("^%+") then
link = link:sub(2) -- discard initial +
return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype)
elseif not orig_placetype then
return link
else
return --[[require(en_utilities_module).pluralize(link)]] link
end
end
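-- Illustrative sketch of the `link == "separately"` branch above, written against plain Lua string functions. The
-- helper `split_on_spaces` is a hypothetical stand-in for the module's `split`, and the placetype strings are sample
-- input only (no claim is made that this placetype actually uses a "separately" link).
--
--   local function split_on_spaces(str)
--       local words = {}
--       for word in str:gmatch("[^ ]+") do
--           words[#words + 1] = word
--       end
--       return words
--   end
--
--   local sg_words = split_on_spaces("census-designated place")
--   local orig_words = split_on_spaces("census-designated places")
--   for i = 1, #sg_words do
--       if sg_words[i] == orig_words[i] then
--           sg_words[i] = ("[[%s]]"):format(sg_words[i])
--       else
--           sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i])
--       end
--   end
--   -- table.concat(sg_words, " ") gives "[[census-designated]] [[place|places]]"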
--[==[
Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the
plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying
as plural but linked to the singular form of the placetype). Otherwise, return nil. If we're generating the description
of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like
[[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or
`"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description
for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with
special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the
"full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is
prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be
disallowed.
]==]
function export.get_placetype_display_form(placetype, category_type, return_full, noerror)
local from_category = not not category_type
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
if canon_placetype then
local raw_link
local function is_linked_string(str)
return type(str) == "string" and str:find("%[%[")
end
if category_type then
local fetched_full
local function fetch_maybe_full(prop)
local retval = ptdata["full_" .. prop]
if retval ~= nil then
if return_full then
return retval, true
else
internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval)
end
end
return ptdata[prop], false
end
local function maybe_prefix(str)
if return_full and not fetched_full then
return "names of " .. str
else
return str
end
end
-- Careful with `false` as possible value.
if category_type == "top-level" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
elseif category_type == "noncity" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
elseif category_type == "city" then -- do not translate
raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
else
internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
category_type)
end
if type(raw_link) == "string" then
return maybe_prefix(raw_link), ptdata
elseif raw_link ~= nil then
return raw_link, ptdata
end
raw_link, fetched_full = fetch_maybe_full("category_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
if ptmatch == "plural" then
raw_link, fetched_full = fetch_maybe_full("plural_link")
if raw_link == false then
return raw_link, ptdata
end
if is_linked_string(raw_link) then
return maybe_prefix(raw_link), ptdata
end
end
if raw_link == nil then
raw_link, fetched_full = fetch_maybe_full("link")
end
if raw_link == false then
return raw_link, ptdata
end
return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
else
if ptmatch == "plural" then
raw_link = ptdata.plural_link
if raw_link == false then
process_error("Placetype %s cannot appear plural", placetype)
end
if is_linked_string(raw_link) then
return raw_link, ptdata
end
end
if raw_link == nil then
raw_link = ptdata.link
end
return make_placetype_link(raw_link, canon_placetype,
placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata
end
end
return nil
end
local function resolve_unlinked_placename_display_aliases(placetype, placename)
local equiv_placetypes = export.get_placetype_equivs(placetype)
for i, equiv in ipairs(equiv_placetypes) do
equiv_placetypes[i] = equiv.placetype
end
local all_display_aliases_found = {}
local all_others_found = {}
for group, key, spec in m_locations.iterate_matching_location {
placetypes = equiv_placetypes,
placename = placename,
alias_resolution = "display",
} do
if spec.alias_of and spec.display then
insert(all_display_aliases_found, {group, key, spec, spec.display_as_full})
else
insert(all_others_found, {group, key, spec})
end
end
if not all_display_aliases_found[1] then
return placename
elseif all_display_aliases_found[2] then
internal_error("Found multiple matching display aliases for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
elseif all_others_found[1] then
internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " ..
"all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found,
all_others_found)
else
local group, key, spec, as_full = unpack(all_display_aliases_found[1])
local full, elliptical = m_locations.key_to_placename(group, key)
return as_full and full or elliptical
end
end
--[==[
If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged.
Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`,
`country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as
`United States`.
'''NOTE''': Display aliases make the displayed text differ from what the editor wrote in the wikitext. As a result, they
should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed.
For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to
`Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political
connotations. Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!)
to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two
terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to
`North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly
display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve
alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and
consistency.
]==]
function export.resolve_placename_display_aliases(placetype, placename)
-- If the placename is a link, apply the alias inside the link.
-- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will
-- be empty.
local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$")
if link then
if linktext ~= "" then
local alias = resolve_unlinked_placename_display_aliases(placetype, linktext)
return "[[" .. link .. "|" .. alias .. "]]"
else
local alias = resolve_unlinked_placename_display_aliases(placetype, link)
return "[[" .. alias .. "]]"
end
else
return resolve_unlinked_placename_display_aliases(placetype, placename)
end
end
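-- Illustrative sketch of how the link pattern above splits its input. It uses the standard `string.match` in place of
-- the module's `rmatch` (an assumption, made so the snippet can run outside MediaWiki); the sample placenames are
-- hypothetical.
--
--   local pattern = "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$"
--   local link, linktext = ("[[w:United States|US]]"):match(pattern)
--   -- link == "w:United States", linktext == "US" (piped link: the alias is applied to the link text)
--   link, linktext = ("[[US]]"):match(pattern)
--   -- link == "US", linktext == "" (unpiped link: the alias is applied to the link target itself)
--   link, linktext = ("US"):match(pattern)
--   -- link == nil (not a link: the alias is applied to the whole placename)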
--[==[
Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key.
]==]
function export.get_prefixed_key(key, spec)
if spec.the then
return "the " .. key
else
return key
end
end
-- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary.
export.iterate_matching_location = m_locations.iterate_matching_location
--[=[
Iterator that iterates over holonyms in `place_desc`. If `first_holonym_index` is given, start iterating at the
specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If
`first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is
specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified
by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the
holonym index and holonym structure, similar to `ipairs()`.
]=]
function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms)
local stop_at_also = not not first_holonym_index
return function(place_desc, index)
while true do
index = index + 1
local this_holonym = place_desc.holonyms[index]
-- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also`
-- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym
-- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with
-- `:also`.
if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then
return nil
end
-- If there is no placetype, we're processing raw text, which we normally want to skip.
if include_raw_text_holonyms or this_holonym.placetype then
return index, this_holonym
end
end
end, place_desc, first_holonym_index and first_holonym_index - 1 or 0
end
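-- Illustrative sketch of the stateless-iterator shape used above (iterator function, invariant state, initial control
-- value), run over a hypothetical list of holonym-like tables. Entries without a `placetype` stand in for raw text.
-- The name `typed_holonyms` is made up for this sketch, which omits the `:also` handling.
--
--   local function typed_holonyms(holonyms, first_index)
--       return function(list, index)
--           while true do
--               index = index + 1
--               local item = list[index]
--               if not item then
--                   return nil
--               end
--               if item.placetype then
--                   return index, item
--               end
--           end
--       end, holonyms, first_index and first_index - 1 or 0
--   end
--
--   local holonyms = {{placetype = "รัฐ"}, {}, {placetype = "ประเทศ"}}
--   for i, h in typed_holonyms(holonyms) do
--       -- visits i == 1 and i == 3; the raw-text entry at index 2 is skipped
--   end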
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all
such known locations, returning for each location the corresponding group, key and spec as well as the trail of
ancestral containers. Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between
the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data`
are:
* `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with
`iterate_matching_location()`.
* `holonym_placename`: The placename of the holonym.
* `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the
holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms
following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none
exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.)
* `place_desc`: Description of the place; used for the holonyms, to check for container mismatches.
Returns four values: the location group, the canonical key by which the location is known, the spec object describing
the location and the trail of ancestral containers for the location. The first three values are the same as for
`iterate_matching_location`.
]==]
function export.iterate_matching_holonym_location(data)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
local matching_location_iterator = m_locations.iterate_matching_location {
placetypes = holonym_placetype,
placename = holonym_placename,
}
return function()
while true do
local group, key, spec = matching_location_iterator()
if not group then
return nil
end
local container_trail = {}
-- For each level of container, check that there are no mismatches (i.e. another location of the same
-- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container
-- at that level. For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city
-- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of
-- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark,
-- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New
-- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough,
-- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If
-- there are no mismatches at any level we assume we're dealing with the right known location.
--
-- If at a given level there are multiple containing locations, we count a match if any holonym matches any
-- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any
-- containing location.
local containers_mismatch = false
for containers in m_locations.iterate_containers(group, key, spec) do
insert(container_trail, containers)
local match_at_level = false
local mismatch_at_level = false
for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc,
holonym_index and holonym_index + 1 or nil) do
local other_source_holonym = other_holonym.augmented_from_holonym
if other_source_holonym and other_source_holonym.placetype == holonym_placetype and
other_source_holonym.unlinked_placename ~= holonym_placename then
-- Ignore holonyms added during the augmentation process for other holonyms of the same
-- placetype as the placetype of the holonym we're considering. See comment in
-- augment_holonyms_with_container() for why we do this.
-- continue; grrr, no 'continue' in Lua
else
local holonym_matches_at_level = false
local holonym_exists_with_same_placetype = false
for _, container in ipairs(containers) do
if not container.spec.no_check_holonym_mismatch then
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
local placetypes = container.spec.placetype
if type(placetypes) ~= "table" then
placetypes = {placetypes}
end
local placetype_equivs = {}
for _, pt in ipairs(placetypes) do
m_table.extend(placetype_equivs, export.get_placetype_equivs(pt))
end
local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype and
(other_holonym.unlinked_placename == full_container_placename or
other_holonym.unlinked_placename == elliptical_container_placename)
end
)
if this_holonym_matches then
holonym_matches_at_level = true
break
end
local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs(
placetype_equivs, function(placetype)
return other_holonym.placetype == placetype
end
)
if this_holonym_exists_with_same_placetype then
-- We seem to have a mismatch at this level. But before we decide conclusively that this
-- is the case, check to see whether the putative mismatch is an alias and matches when
-- we resolve the alias.
for oh_group, oh_key, oh_spec, oh_container_trail in
export.iterate_matching_holonym_location {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = place_desc,
} do
local oh_full_placename, oh_elliptical_placename =
m_locations.key_to_placename(oh_group, oh_key)
if oh_full_placename == full_container_placename or
oh_elliptical_placename == elliptical_container_placename then
-- Alias matched when resolved.
this_holonym_matches = true
break
end
end
if this_holonym_matches then
-- Alias matched above when resolved.
holonym_matches_at_level = true
break
else
-- Not an alias, or doesn't match when resolved. We have a true mismatch.
holonym_exists_with_same_placetype = true
end
end
end
end
if holonym_matches_at_level then
match_at_level = true
break
end
if holonym_exists_with_same_placetype then
mismatch_at_level = true
end
end
end
if not match_at_level and mismatch_at_level then
containers_mismatch = true
break
end
end
if not containers_mismatch then
return group, key, spec, container_trail
end
end
end
end
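-- Illustrative sketch of the per-level rule described in the comments above: a level counts as a mismatch only when
-- some holonym of the container's placetype is present and none of those holonyms names a container at that level.
-- The function name and the flat `placename` field are hypothetical simplifications; alias resolution, equivalent
-- placetypes and augmented holonyms are all left out.
--
--   local function mismatch_at_level(container_names, container_placetype, holonyms)
--       local match, mismatch = false, false
--       for _, holonym in ipairs(holonyms) do
--           if holonym.placetype == container_placetype then
--               local found = false
--               for _, name in ipairs(container_names) do
--                   if holonym.placename == name then
--                       found = true
--                       break
--                   end
--               end
--               if found then
--                   match = true
--               else
--                   mismatch = true
--               end
--           end
--       end
--       return not match and mismatch
--   end
--
--   -- Kansas City (a city in Missouri) described with both Missouri and Kansas: the match wins, result is false.
--   mismatch_at_level({"Missouri"}, "รัฐ",
--       {{placetype = "รัฐ", placename = "Missouri"}, {placetype = "รัฐ", placename = "Kansas"}})
--   -- Newark, New Jersey checked against a description that names Delaware: a true mismatch, result is true.
--   mismatch_at_level({"New Jersey"}, "รัฐ", {{placetype = "รัฐ", placename = "Delaware"}})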
--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the
corresponding group, key and spec as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
local all_found = {}
for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
insert(all_found, {group, key, spec, container_trail})
end
if not all_found[1] then
return nil
elseif all_found[2] then
local holonym_placetype = data.holonym_placetype
if type(holonym_placetype) == "table" then
holonym_placetype = concat(holonym_placetype, ",")
end
local found_keys = {}
for _, found in ipairs(all_found) do
local _, key, _, _ = unpack(found)
insert(found_keys, key)
end
error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys)))
else
return unpack(all_found[1])
end
end
------------------------------------------------------------------------------------------
-- Placename and placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their
canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which
applies to categorization and other processes but not to display.
The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ",
"จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g.
"census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype.
Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be
interpreted as "department", "อำเภอ" or "division").
]==]
export.placetype_aliases = {
["acomm"] = "autonomous community",
["adr"] = "administrative region",
["adterr"] = "administrative territory", -- Pakistan
["aobl"] = "autonomous oblast",
["aokr"] = "autonomous okrug",
["ap"] = "autonomous province",
["apref"] = "autonomous prefecture",
["aprov"] = "autonomous province",
["ar"] = "autonomous region",
["arch"] = "archipelago",
["arep"] = "autonomous republic",
["aterr"] = "autonomous territory",
["atu"] = "autonomous territorial unit",
["bor"] = "borough",
["c"] = "ประเทศ",
["can"] = "canton",
["carea"] = "council area",
["cc"] = "constituent country",
["cdblock"] = "community development block",
["cdep"] = "Crown dependency",
["CDP"] = "census-designated place",
["cdp"] = "census-designated place",
["clcity"] = "county-level city",
["co"] = "เทศมณฑล",
["cobor"] = "county borough",
["colcity"] = "county-level city",
["coll"] = "collectivity",
["comm"] = "community",
["cont"] = "ทวีป",
["contr"] = "continental region",
["contregion"] = "continental region",
["cpar"] = "civil parish",
["damun"] = "direct-administered municipality",
["dep"] = "dependency",
["department capital"] = "departmental capital",
["dept"] = "department",
["depterr"] = "dependent territory",
["dist"] = "อำเภอ",
["distmun"] = "district municipality",
["div"] = "division",
["emp"] = "จักรวรรดิ",
["fpref"] = "French prefecture",
["gov"] = "governorate",
["govnat"] = "governorate",
["home-rule city"] = "home rule city",
["home-rule municipality"] = "home rule municipality",
["inner-city area"] = "inner city area",
["ires"] = "Indian reservation",
["isl"] = "เกาะ",
["lbor"] = "London borough",
["lga"] = "local government area",
["lgarea"] = "local government area",
["lgd"] = "local government district",
["lgdist"] = "local government district",
["metbor"] = "metropolitan borough",
["metcity"] = "มหานคร",
["metmun"] = "metropolitan municipality",
["mtn"] = "ภูเขา",
["mun"] = "เทศบาล",
["mundist"] = "municipal district",
["nonmetropolitan county"] = "non-metropolitan county",
["obl"] = "oblast",
["okr"] = "okrug",
["p"] = "จังหวัด",
["par"] = "parish",
["parmun"] = "parish municipality",
["pen"] = "peninsula",
["plcity"] = "prefecture-level city",
["plcolony"] = "Polish colony",
["pref"] = "prefecture",
["prefcity"] = "prefecture-level city",
["preflcity"] = "prefecture-level city",
["prov"] = "จังหวัด",
["r"] = "ภูมิภาค",
["range"] = "เทือกเขา",
["rcm"] = "regional county municipality",
["rcomun"] = "regional county municipality",
["rdist"] = "regional district",
["rep"] = "republic",
["rhrom"] = "rural hromada",
["riv"] = "แม่น้ำ",
["rmun"] = "regional municipality",
["robor"] = "royal borough",
["romp"] = "Roman province",
["runit"] = "regional unit",
["rurmun"] = "rural municipality",
["s"] = "รัฐ",
["sar"] = "special administrative region",
["shrom"] = "settlement hromada",
["spref"] = "subprefecture",
["sprefcity"] = "sub-prefectural city",
["sprovcity"] = "subprovincial city",
["submet city"] = "sub-metropolitan city",
["submetropolitan city"] = "sub-metropolitan city",
["sub-prefecture-level city"] = "sub-prefectural city",
["sub-provincial city"] = "subprovincial city",
["sub-provincial district"] = "subprovincial district",
["terr"] = "ดินแดน",
["terrauth"] = "territorial authority",
["twp"] = "township",
["twpmun"] = "township municipality",
["uauth"] = "unitary authority",
["ucomm"] = "unincorporated community",
["udist"] = "unitary district",
["uhrom"] = "urban hromada",
["uterr"] = "union territory",
["utwpmun"] = "united township municipality",
["val"] = "valley",
["vdc"] = "village development committee",
["vil"] = "village",
["voi"] = "voivodeship",
["wcomm"] = "Welsh community",
}
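-- Illustrative sketch of how an alias table like the one above is typically consulted: look the placetype up and fall
-- back to the input when no alias exists. The function name `resolve_alias` is hypothetical; the module's actual
-- lookup code lies outside this table.
--
--   local function resolve_alias(placetype)
--       return export.placetype_aliases[placetype] or placetype
--   end
--
--   -- resolve_alias("c") gives "ประเทศ", resolve_alias("vil") gives "village",
--   -- and resolve_alias("village") gives "village" unchanged.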
local no_link_def_article = {link = false, article = "the"}
local no_link_no_article = {link = false, article = false}
--[==[ var:
These qualifiers can be prepended onto any placetype and will be handled correctly. For example, the placetype
`large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the
value in the following table is a string, the qualifier will display according to the string. If the value is `true`,
the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be
linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain
those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating
`inland sea` as equivalent to `sea`.
]==]
export.placetype_qualifiers = {
-- generic qualifiers
["huge"] = false,
["tiny"] = false,
["large"] = false,
["big"] = false,
["mid-size"] = false,
["mid-sized"] = false,
["small"] = false,
["sizable"] = false,
["important"] = false,
["long"] = false,
["short"] = false,
["major"] = false,
["minor"] = false,
["high"] = false,
["tall"] = false,
["low"] = false,
["left"] = false, -- left tributary
["right"] = false, -- right tributary
["modern"] = false, -- for use in opposition to "ancient" in another definition
-- "former" qualifiers
["abandoned"] = true,
["ancient"] = true,
["deserted"] = true,
["extinct"] = true,
["former"] = false,
["historic"] = "historical",
["historical"] = true,
["medieval"] = true,
["mediaeval"] = true,
["ruined"] = true,
["traditional"] = true,
-- sea qualifiers
["coastal"] = true,
["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]]
["maritime"] = true,
["overseas"] = true,
["seaside"] = true,
["beachfront"] = true,
["beachside"] = true,
["riverside"] = true,
-- lake qualifiers
["freshwater"] = true,
["saltwater"] = true,
["endorheic"] = true,
["oxbow"] = true,
["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link
["tidal"] = true,
-- land qualifiers
["hilltop"] = true,
["hilly"] = true,
["insular"] = true,
["peninsular"] = true,
["chalk"] = true,
["karst"] = true,
["limestone"] = true,
["mountainous"] = true,
["mountaintop"] = true,
["alpine"] = true,
["volcanic"] = true, -- for an island
-- political status qualifiers
["autonomous"] = true,
["incorporated"] = true,
["special"] = true,
["unincorporated"] = true,
["coterminous"] = true,
-- monetary status/etc. qualifiers
["fashionable"] = true,
["wealthy"] = true,
["affluent"] = true,
["declining"] = true,
-- city vs. rural qualifiers
["urban"] = true,
["suburban"] = true,
["exurban"] = true,
["outlying"] = true,
["remote"] = true,
["rural"] = true,
["outback"] = true,
["inner"] = false,
["inner-city"] = true,
["central"] = false,
["outer"] = false,
-- land use qualifiers
["residential"] = true,
["agricultural"] = true,
["business"] = true,
["commercial"] = true,
["industrial"] = true,
-- business use qualifiers
["railroad"] = true,
["railway"] = true,
["farming"] = true,
["fishing"] = true,
["mining"] = true,
["logging"] = true,
["cattle"] = true,
-- tourism use qualifiers
["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
-- religious qualifiers
["holy"] = true,
["sacred"] = true,
["religious"] = true,
["secular"] = true,
-- qualifiers for nonexistent places
["claimed"] = false,
["fictional"] = true,
["legendary"] = true,
["mythical"] = true,
["mythological"] = true,
-- directional qualifiers
["northern"] = false,
["southern"] = false,
["eastern"] = false,
["western"] = false,
["north"] = false,
["south"] = false,
["east"] = false,
["west"] = false,
["northeastern"] = false,
["southeastern"] = false,
["northwestern"] = false,
["southwestern"] = false,
["northeast"] = false,
["southeast"] = false,
["northwest"] = false,
["southwest"] = false,
-- seasonal qualifiers
["summer"] = true, -- e.g. for 'summer capital'
["winter"] = true,
-- legal status qualifiers
-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
["official"] = true,
["unofficial"] = true,
["de facto"] = true, -- 'de facto capital'
["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
["de jure"] = true, -- 'de jure capital'
["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
-- misc. qualifiers
["planned"] = true,
["chartered"] = true,
["landlocked"] = true,
["uninhabited"] = true,
-- superlative qualifiers
["first"] = no_link_def_article,
["second"] = no_link_def_article, -- for "second largest" etc.
["third"] = no_link_def_article,
["fourth"] = no_link_def_article,
["last"] = no_link_def_article,
["only"] = no_link_def_article,
["sole"] = no_link_def_article,
["main"] = no_link_def_article,
["largest"] = no_link_def_article,
["biggest"] = no_link_def_article,
["smallest"] = no_link_def_article,
["shortest"] = no_link_def_article,
["longest"] = no_link_def_article,
["tallest"] = no_link_def_article,
["highest"] = no_link_def_article,
["lowest"] = no_link_def_article,
["leftmost"] = no_link_def_article,
["rightmost"] = no_link_def_article,
["innermost"] = no_link_def_article,
["outermost"] = no_link_def_article,
["northernmost"] = no_link_def_article,
["southernmost"] = no_link_def_article,
["westernmost"] = no_link_def_article,
["easternmost"] = no_link_def_article,
["northwesternmost"] = no_link_def_article,
["southwesternmost"] = no_link_def_article,
["northeasternmost"] = no_link_def_article,
["southeasternmost"] = no_link_def_article,
-- several/various
["several"] = no_link_no_article,
["various"] = no_link_no_article,
["numerous"] = no_link_no_article,
["multiple"] = no_link_no_article,
["many"] = no_link_no_article,
["other"] = no_link_no_article,
}
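-- Illustrative sketch of the value kinds documented for the table above (string, `true`, `false`), plus the table form
-- used by the superlative entries, assuming its `link` field is read the same way. The function name
-- `qualifier_display` is hypothetical; the module's real formatting code lies outside this table.
--
--   local function qualifier_display(qualifier)
--       local value = export.placetype_qualifiers[qualifier]
--       if type(value) == "table" then
--           value = value.link
--       end
--       if type(value) == "string" then
--           return value
--       elseif value == true then
--           return "[[" .. qualifier .. "]]"
--       else
--           return qualifier
--       end
--   end
--
--   -- qualifier_display("coastal") gives "[[coastal]]", qualifier_display("large") gives "large",
--   -- qualifier_display("ox-bow") gives "[[oxbow]]" and qualifier_display("largest") gives "largest".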
--[==[ var:
In this table, the key qualifiers should be treated the same as the value qualifiers for categorization purposes. This
is overridden by `placetype_data` and `qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
["abandoned"] = {"FORMER"},
["ancient"] = {"ANCIENT", "FORMER"},
["former"] = {"FORMER"},
["extinct"] = {"FORMER"},
["historic"] = {"FORMER"},
["historical"] = {"FORMER"},
["medieval"] = {"ANCIENT", "FORMER"},
["mediaeval"] = {"ANCIENT", "FORMER"},
["ruined"] = {"ANCIENT", "FORMER"},
["traditional"] = {"FORMER"},
}
--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
["fictional"] = "fictional location",
["legendary"] = "mythological location",
["mythical"] = "mythological location",
["mythological"] = "mythological location",
-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
["claimed"] = "claimed political division",
}
--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
["autonomous community"] = "autonomous community capitals",
["canton"] = "cantonal capitals",
["comarca"] = "comarca capitals",
["ประเทศ"] = "เมืองหลวงของประเทศ",
-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "เมืองหลวงของเทศมณฑล",
["department"] = "departmental capitals",
["อำเภอ"] = "เมืองหลวงของอำเภอ",
["division"] = "division capitals",
["emirate"] = "emirate capitals",
["governorate"] = "governorate capitals",
["hromada"] = "hromada capitals",
["krai"] = "krai capitals",
["มหานคร"] = "เมืองหลวงของมหานคร",
["เทศบาล"] = "เมืองหลวงของเทศบาล",
["oblast"] = "oblast capitals",
["okrug"] = "okrug capitals",
["prefecture"] = "prefectural capitals",
["จังหวัด"] = "เมืองหลวงของจังหวัด",
["raion"] = "raion capitals",
["regency"] = "regency capitals",
["ภูมิภาค"] = "เมืองหลวงของภูมิภาค",
["regional unit"] = "regional unit capitals",
["republic"] = "republic capitals",
["รัฐ"] = "เมืองหลวงของรัฐ",
["ดินแดน"] = "เมืองหลวงของดินแดน",
["voivodeship"] = "voivodeship capitals",
}
--[==[ var:
This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple
ways that placenames can come to be preceded by "the":
# Listed here.
# Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code
just below the map.
# The placetype of the placename has `holonym_use_the = true` in its placetype_data.
# A regex in placename_the_re matches the placename.
Note that "the" is added only before the first holonym in a place description.
]==]
export.placename_article = {
-- This should only contain info that can't be inferred from [[Module:place/locations]].
["archipelago"] = {
["Cyclades"] = "the",
["Dodecanese"] = "the",
},
["ประเทศ"] = {
["Holy Roman Empire"] = "the",
},
["จักรวรรดิ"] = {
["Holy Roman Empire"] = "the",
},
["เกาะ"] = {
["North Island"] = "the",
["South Island"] = "the",
},
["ภูมิภาค"] = {
["Balkans"] = "the",
["Russian Far East"] = "the",
["Caribbean"] = "the",
["Caucasus"] = "the",
["Middle East"] = "the",
["New Territories"] = "the",
["North Caucasus"] = "the",
["South Caucasus"] = "the",
["West Bank"] = "the",
["Gaza Strip"] = "the",
},
["valley"] = {
["San Fernando Valley"] = "the",
},
}
--[==[ var:
Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all
holonyms, otherwise only the regexes for the holonym's placetype apply.
]==]
export.placename_the_re = {
-- We don't need entries for peninsulas, seas, oceans, gulfs or rivers
-- because they have holonym_use_the = true.
["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "},
["bay"] = {"^Bay of "},
["ทะเลสาบ"] = {"^Lake of "},
["ประเทศ"] = {"^Republic of ", " Republic$"},
["republic"] = {"^Republic of ", " Republic$"},
["ภูมิภาค"] = {" [Rr]egion$"},
["แม่น้ำ"] = {" River$"},
["local government area"] = {"^Shire of "},
["เทศมณฑล"] = {"^Shire of "},
["Indian reservation"] = {" Reservation", " Nation"},
["tribal jurisdictional area"] = {" Reservation", " Nation"},
}
--[==[ var:
If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of
holonyms for categorization (but not display) purposes.
]==]
export.cat_implications = {
["ภูมิภาค"] = {
["Eastern Europe"] = {"continent/ยุโรป"},
["Central Europe"] = {"continent/ยุโรป"},
["Western Europe"] = {"continent/ยุโรป"},
["South Europe"] = {"continent/ยุโรป"},
["Southern Europe"] = {"continent/ยุโรป"},
["Northern Europe"] = {"continent/ยุโรป"},
["Northeast Europe"] = {"continent/ยุโรป"},
["Northeastern Europe"] = {"continent/ยุโรป"},
["Southeast Europe"] = {"continent/ยุโรป"},
["Southeastern Europe"] = {"continent/ยุโรป"},
["North Caucasus"] = {"continent/ยุโรป"},
["South Caucasus"] = {"continent/เอเชีย"},
["South Asia"] = {"continent/เอเชีย"},
["Southern Asia"] = {"continent/เอเชีย"},
["East Asia"] = {"continent/เอเชีย"},
["Eastern Asia"] = {"continent/เอเชีย"},
["Central Asia"] = {"continent/เอเชีย"},
["West Asia"] = {"continent/เอเชีย"},
["Western Asia"] = {"continent/เอเชีย"},
["Southeast Asia"] = {"continent/เอเชีย"},
["North Asia"] = {"continent/เอเชีย"},
["Northern Asia"] = {"continent/เอเชีย"},
["Anatolia"] = {"continent/เอเชีย"},
["Asia Minor"] = {"continent/เอเชีย"},
["Mesopotamia"] = {"continent/เอเชีย"},
["North Africa"] = {"continent/แอฟริกา"},
["Central Africa"] = {"continent/แอฟริกา"},
["West Africa"] = {"continent/แอฟริกา"},
["East Africa"] = {"continent/แอฟริกา"},
["Southern Africa"] = {"continent/แอฟริกา"},
["Central America"] = {"continent/อเมริกากลาง"},
["Caribbean"] = {"continent/อเมริกาเหนือ"},
["Polynesia"] = {"continent/โอเชียเนีย"},
["Micronesia"] = {"continent/โอเชียเนีย"},
["Melanesia"] = {"continent/โอเชียเนีย"},
["Siberia"] = {"country/รัสเซีย", "continent/เอเชีย"},
["Russian Far East"] = {"country/รัสเซีย", "continent/เอเชีย"},
["South Wales"] = {"constituent country/เวลส์", "continent/ยุโรป"},
["Balkans"] = {"continent/ยุโรป"},
["West Bank"] = {"country/ปาเลสไตน์", "continent/เอเชีย"},
["Gaza"] = {"country/ปาเลสไตน์", "continent/เอเชีย"},
["Gaza Strip"] = {"country/ปาเลสไตน์", "continent/เอเชีย"},
}
}
------------------------------------------------------------------------------------------
-- Category and display handlers --
------------------------------------------------------------------------------------------
local function city_type_cat_handler(data)
local entry_placetype = data.entry_placetype
local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities")
if not generic_before_non_cities then
internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" ..
" setting", entry_placetype)
end
local plural_entry_placetype = export.pluralize_placetype(entry_placetype)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and not spec.is_city then
-- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both
-- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.)
local cap_plural_entry_placetype = ucfirst(plural_entry_placetype)
local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th
if container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th
end
end
return retcats
end
end
local function capital_city_cat_handler(data, non_city)
local holonym_placetype, holonym_placename, holonym_index, place_desc =
data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc
-- The first time we're called we want to return something; otherwise we will be called for later-mentioned
-- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in
-- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital
-- category/categories we add below.
local retcats
if not non_city and place_desc.holonyms then
for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do
local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename
retcats = city_type_cat_handler {
entry_placetype = "นคร",
holonym_placetype = h_placetype,
holonym_placename = h_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if retcats then
break
end
end
end
if not retcats then
retcats = {}
end
-- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we
-- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State
-- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory'
-- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's
-- an entry for 'autonomous community').
local capital_cat = export.placetype_to_capital_cat[holonym_placetype]
if not capital_cat then
capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")]
end
if capital_cat then
capital_cat = ucfirst(capital_cat)
local inserted_specific_variant_cat = false
if holonym_index then
-- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern
-- where we use :also to specify that a given city is the capital at multiple surrounding levels.
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index
for h_index = holonym_index, #place_desc.holonyms do
if place_desc.holonyms[h_index].placetype then
matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location {
holonym_placetype = place_desc.holonyms[h_index].placetype,
holonym_placename = place_desc.holonyms[h_index].unlinked_placename,
holonym_index = h_index,
place_desc = place_desc,
}
if matching_group then
matching_holonym_index = h_index
break
end
end
end
if matching_holonym_index == holonym_index then
if matching_container_trail[1] and not matching_spec.no_container_cat then
for _, container in ipairs(matching_container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
elseif matching_holonym_index then
-- Check to make sure that the holonym placetype we were called on is listed among the
-- divtypes of the location we found.
local function insert_specific_variant_if_possible(key, spec)
return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
local plural_holonym_placetype = export.pluralize_placetype(pt)
local saw_matching_div
if spec.divs then
local divs = spec.divs
if type(divs) ~= "table" then
divs = {divs}
end
for _, div in ipairs(divs) do
if type(div) ~= "table" then
div = {type = div}
end
if plural_holonym_placetype == div.type then
saw_matching_div = true
break
end
end
end
if saw_matching_div then
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
return true
end
return false
end)
end
if insert_specific_variant_if_possible(matching_key, matching_spec) then
inserted_specific_variant_cat = true
elseif not matching_spec.no_container_cat then
for _, containers in ipairs(matching_container_trail) do
local saw_no_container_cat = false
for _, container in ipairs(containers) do
if insert_specific_variant_if_possible(container.key, container.spec) then
inserted_specific_variant_cat = true
break
end
saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
end
if inserted_specific_variant_cat or saw_no_container_cat then
break
end
end
end
end
else
-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and container_trail[1] and not spec.no_container_cat then
for _, container in ipairs(container_trail[1]) do
insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key,
container.spec)))
inserted_specific_variant_cat = true
end
end
end
if not inserted_specific_variant_cat then
insert(retcats, capital_cat)
end
else
-- We didn't recognize the holonym placetype; just put in 'Capital cities'.
insert(retcats, "เมืองหลวง")
end
return retcats
end
--[=[
This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used
in two ways:
# To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and
[[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym.
# To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym
description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this
case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments,
administrative regions, and for the entire country, and for example we only want to categorize a demonym into
[[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym
invocation only adds the most specific holonym category and not the category of any containing polity (hence if we
add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]).
This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston`
as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and
[[:Category:en:สถานที่ในthe United States]]. The city handler tries to deal with the possibility of multiple cities
having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]],
[[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing
polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`).
Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the
mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions
Ohio and a holonym for a Columbus in a different country is encountered, because of the function
`augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered.
The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding
language code).
]=]
local function generic_place_cat_handler(data)
local from_demonym = data.from_demonym
local retcats = {}
local function insert_retkey(key, spec)
if from_demonym then
insert(retcats, key)
else
insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec)))
end
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
if not spec.no_generic_place_cat then
-- This applies to continents and continental regions.
insert_retkey(key, spec)
end
-- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in
-- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when
-- no_container_cat is set (e.g. for 'United Kingdom').
if not spec.no_container_cat then
for _, container_set in ipairs(container_trail) do
local stop_adding_containers = false
for _, container in ipairs(container_set) do
if not container.spec.no_generic_place_cat then
insert_retkey(container.key, container.spec)
end
if container.spec.no_container_cat then
stop_adding_containers = true
end
end
if stop_adding_containers then
break
end
end
end
return retcats
end
end
--[==[
Special category handler run for all placetypes that checks for specified division placetypes of known locations and
categorizes appropriately.
]==]
function export.political_division_cat_handler(data)
if data.from_demonym then
return
end
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group then
local divlists = {}
if spec.divs then
insert(divlists, spec.divs)
end
if spec.addl_divs then
insert(divlists, spec.addl_divs)
end
for _, divlist in ipairs(divlists) do
if type(divlist) ~= "table" then
divlist = {divlist}
end
for _, div in ipairs(divlist) do
if type(div) == "string" then
div = {type = div}
end
local sgdiv = export.maybe_singularize_placetype(div.type) or div.type
local prep = div.prep or "ของ"
local cat_as = div.cat_as or div.type
if type(cat_as) ~= "table" then
cat_as = {cat_as}
end
if not export.placetype_data[sgdiv] then
internal_error("Placetype %s associated with known location key %s and data %s not found in " ..
"`placetype_data`", sgdiv, key, spec)
end
if sgdiv == data.entry_placetype then
local retcats = {}
for _, pt_cat in ipairs(cat_as) do
if type(pt_cat) == "string" then
pt_cat = {type = pt_cat}
end
local pt_prep = pt_cat.prep or prep
insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th
end
return retcats
end
end
end
end
end
--[==[
This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any
foreign-language terms that are translations of the state of Georgia. We look at the page title (or its overridden value
in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the
modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the
entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the
country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized
into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to
make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is
intended for Newark, New Jersey).
]==]
function export.get_bare_categories(args, overall_place_spec)
local bare_cats = {}
local place_descs = overall_place_spec.descs
local possible_placetypes_by_place_desc = {}
for i, place_desc in ipairs(place_descs) do
possible_placetypes_by_place_desc[i] = {}
for _, placetype in ipairs(place_desc.placetypes) do
if not export.placetype_is_ignorable(placetype) then
local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true})
for _, equiv in ipairs(equivs) do
insert(possible_placetypes_by_place_desc[i], equiv.placetype)
end
end
end
end
local function check_term(term)
-- Treat Wikipedia links like local ones.
term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[")
term = export.remove_links_and_html(term)
term = term:gsub("^the ", "")
for i, place_desc in ipairs(place_descs) do
-- Iterate over all matching locations in case there are multiple, as with Delhi defined as
-- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}.
for group, key, spec, container_trail in export.iterate_matching_holonym_location {
holonym_placetype = possible_placetypes_by_place_desc[i],
holonym_placename = term,
place_desc = place_desc,
} do
insert(bare_cats, key)
end
end
end
-- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)?
-- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There
-- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the
-- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The
-- advantage of checking when the language isn't English is we catch those places that fail to give an English
-- translation but where the translation happens to be the same as the other-language spelling. However, I don't
-- know how often this situation occurs.
check_term(args.pagename or mw.title.getCurrentTitle().subpageText)
for _, t in ipairs(args.t) do
check_term(t)
end
local function check_termobj_list(terms)
for _, term in ipairs(terms) do
if term.eq then
check_term(term.eq)
end
if term.alt or term.term then
check_term(term.alt or term.term)
end
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
local arg = extra_info_terms.arg
if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then
check_termobj_list(extra_info_terms.terms)
end
end
for _, directive in ipairs(overall_place_spec.directives) do
check_termobj_list(directive.terms)
end
return bare_cats
end
--[==[
This is used to augment the holonyms associated with a place description with the containing polities. For example,
given the following:
`# {{tl|place|en|subprefecture|pref/Hokkaido}}.`
We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]].
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
for _, place_desc in ipairs(place_descs) do
if place_desc.holonyms then
-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
-- modifier, which causes category processing to restart at that point after generating categories for a
-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do
local holonym = place_desc.holonyms[i]
if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then
local group, key, spec, container_trail = export.find_matching_holonym_location {
holonym_placetype = holonym.placetype,
holonym_placename = holonym.unlinked_placename,
holonym_index = i,
place_desc = place_desc,
}
if group and container_trail[1] and not spec.no_auto_augment_container then
local augmented_holonyms = {}
for j = 1, i do
insert(augmented_holonyms, place_desc.holonyms[j])
end
for _, containers in ipairs(container_trail) do
local any_no_auto_augment_container = false
for _, container in ipairs(containers) do
any_no_auto_augment_container = any_no_auto_augment_container or
container.spec.no_auto_augment_container
local containing_type = container.spec.placetype
if type(containing_type) == "table" then
-- If the containing type is a list, use the first element as the canonical variant.
containing_type = containing_type[1]
end
local full_container_placename, elliptical_container_placename =
m_locations.key_to_placename(container.group, container.key)
-- Don't side-effect holonyms while processing them.
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to
-- set display_placename.
placetype = containing_type,
-- placename_to_key() for the group should correctly handle both full and elliptical
-- placenames, but the full placename seems less likely to be ambiguous. FIXME: We
-- should just store the key directly and use it when available to avoid having to
-- convert key to placename and back to key.
unlinked_placename = full_container_placename,
-- Indicate that this is an augmented holonym, and was derived from the specified
-- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms
-- derived from holonyms that are different from the holonym we're searching for but
-- of the same placetype. This is to correctly handle a situation like
-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
-- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match
-- in find_matching_holonym_location() because of the mismatch between augmented
-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
-- `Ardèche`) against any known location. To deal with this, we mark augmented
-- holonyms as being augmented due to a source holonym, and when processing a given
-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
-- The restriction to the same placetype is so that `Birmingham` still gets
-- correctly disambiguated to Birmingham, England in the example given above near
-- the top of this function, using the augmented holonym `c/United Kingdom` added by
-- the specified `cc/England` (whose placetype `constituent country` differs from
-- the placetype `city` of Birmingham).
augmented_from_holonym = holonym,
}
insert(augmented_holonyms, new_holonym)
-- But it is safe to modify other parts of the place_desc.
export.key_holonym_into_place_desc(place_desc, new_holonym)
end
if any_no_auto_augment_container then
break
end
end
for j = i + 1, #place_desc.holonyms do
insert(augmented_holonyms, place_desc.holonyms[j])
end
place_desc.holonyms = augmented_holonyms
end
end
end
end
end
end
-- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions. We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
-- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
-- categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
-- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
-- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
-- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
-- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
-- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and
-- note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
-- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
-- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
-- categorization.
local function district_neighborhood_cat_handler(data)
local function get_plural_entry_placetype(location_spec, container_trail)
if data.entry_placetype == "suburb" then
return "Suburbs"
else
-- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling
if uses_british_spelling == nil and container_trail then
for _, container_set in ipairs(container_trail) do
local must_outer_break = false
for _, container in ipairs(container_set) do
if container.spec.british_spelling ~= nil then
uses_british_spelling = container.spec.british_spelling
must_outer_break = true
break
end
end
if must_outer_break then
break
end
end
end
return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
end
end
-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
-- etc.)
local group, key, spec, container_trail = export.find_matching_holonym_location(data)
if group and not spec.is_former_place and spec.is_city then
return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
end
-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
local has_neighborhoods
local entry_placetype = data.entry_placetype
if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
has_neighborhoods = true
else
-- Otherwise, make sure the current holonym is city-like.
has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
return export.get_placetype_prop(pt, "has_neighborhoods")
end, {continue_on_nil_only = true})
end
if has_neighborhoods then
-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". We need
-- to start with the current holonym, which is especially important for neighborhoods and suburbs that
-- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously
-- we skipped the first/current holonym.)
for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc,
data.holonym_index) do
local other_holonym_data = {
holonym_placetype = other_holonym.placetype,
holonym_placename = other_holonym.unlinked_placename,
holonym_index = other_holonym_index,
place_desc = data.place_desc,
}
local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data)
if group and not spec.is_former_place then
return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") ..
export.get_prefixed_key(key, spec)}
end
end
end
end
function export.check_already_seen_string(holonym_placename, already_seen_strings)
local canon_placename = ulower(m_links.remove_links(holonym_placename))
if type(already_seen_strings) ~= "table" then
already_seen_strings = {already_seen_strings}
end
for _, already_seen_string in ipairs(already_seen_strings) do
if canon_placename:find(already_seen_string) then
return true
end
end
return false
end
-- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display
-- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already.
-- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or
-- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym
-- placename, ignoring case and links. If the prefix isn't already present, we create a link that
-- uses the raw form as the link destination but the prefixed form as the display form, unless the
-- holonym already has a link in it, in which case we just add the prefix.
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return prefix .. " " .. holonym_placename
end
return prefix .. " [[" .. holonym_placename .. "]]"
end
-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
return holonym_placename
end
if holonym_placename:find("%[%[") then
return holonym_placename .. " " .. suffix
end
if include_suffix_in_link then
return "[[" .. holonym_placename .. " " .. suffix .. "]]"
else
return "[[" .. holonym_placename .. "]] " .. suffix
end
end
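-- Illustrative calls for the two affix handlers above (inside a comment so that nothing runs at load time). The
-- placenames are made-up inputs, and the results assume that m_links.remove_links() strips link brackets and that
-- ulower() lowercases; treat them as a sketch of the intended behavior rather than as tested output.
--[=[
-- Prefix or suffix absent and no link present: the bare placename is linked and the affix stays outside the link.
assert(prefix_display_handler("County", "Cork") == "County [[Cork]]")
assert(suffix_display_handler("borough", "Camden") == "[[Camden]] borough")
-- Placename already linked: the affix is simply added next to it.
assert(suffix_display_handler("borough", "[[Camden]]") == "[[Camden]] borough")
-- Affix already present (case-insensitively): the placename is returned unchanged.
assert(suffix_display_handler("borough", "Borough of Camden") == "Borough of Camden")
-- Any truthy fourth argument moves the suffix inside the link.
assert(suffix_display_handler("Voivodeship", "Subcarpathian", nil, true) == "[[Subcarpathian Voivodeship]]")
]=]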
-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.new_york_boroughs[unlinked_placename] then
-- Hack: don't display "borough" after the names of NYC boroughs
return holonym_placename
end
return suffix_display_handler("borough", holonym_placename)
end
local function county_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
m_locations.northern_ireland_counties["County " .. unlinked_placename .. ", Northern Ireland"] then
return prefix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County".
if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County".
if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then
return suffix_display_handler("เทศมณฑล", holonym_placename)
end
-- FIXME, we need the same for US counties but need to key off the country, not the specific county.
-- Others are displayed as-is.
return holonym_placename
end
-- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture".
-- Others are displayed as e.g. "[[Fthiotida]] prefecture".
local function prefecture_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture"
return suffix_display_handler(suffix, holonym_placename)
end
-- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized
-- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is.
local function province_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if
m_locations.iran_provinces[unlinked_placename .. ", Iran"] or
m_locations.laos_provinces[unlinked_placename .. ", Laos"] or
m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or
m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or
m_locations.thailand_provinces[unlinked_placename .. ", ไทย"] or
m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or
m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then
return suffix_display_handler("จังหวัด", holonym_placename)
end
return holonym_placename
end
-- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is.
local function state_display_handler(holonym_placetype, holonym_placename)
local unlinked_placename = m_links.remove_links(holonym_placename)
if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then
return suffix_display_handler("รัฐ", holonym_placename)
end
return holonym_placename
end
-- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]].
local function voivodesip_display_handler(holonym_placetype, holonym_placename)
return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link")
end
------------------------------------------------------------------------------------------
-- Placetype data --
------------------------------------------------------------------------------------------
--[==[ var:
Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are
placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value
is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form
`สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are
used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the
specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories
like [[:Category:States and territories of Australia]]).
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the
placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are
wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed
directly in the placetype data; everything else is handled through category handlers, either per-placetype or special
(such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate
categories is described at the top of [[Module:place]].
There are several recognized property keys, of various types:
1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
placetypes (which may occur in entry placetypes, esp. new-format ones such as `two <<islands>>`, and may occur in
categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
`<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign. For example, if `cercle` specifies
`"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or
`<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified.
*# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it
will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as
`<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given.
*# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is
called on the string, which will correctly pluralize most strings, including those with links in them. (If there
are multiple links, the display form of the last link is pluralized.)
*# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as
an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with
the qualifiers `former`, `ancient`, `historical` and such.
* `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of
the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the
value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays
as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if
this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs
especially with multiword placetypes where something other than the last word is pluralized. An example is
`town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. This example uses
`link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian
Bokmål word, and template calls aren't currently permitted in link strings), along with
`plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to
the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only
placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of
`category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it,
spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the
value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which
just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a
separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which
clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories
where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides
`category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides
`category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form
` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. If given, this overrides `category_link` for
this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol>
<li>`link = true`</li>
<li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li>
<li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li>
</ol> This has the effect of making the entry placetype `neighborhood` display as just
`<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like
`Neighborhoods in Illinois, USA` displays as
`<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`.
* `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
* `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value
(a message indicating what to use instead) is displayed in the error message.
2. There is currently one fallback-related property key recognized:
* `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories
get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets
`preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi`
(whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex
example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that
checks for cities or city-like entities (e.g. boroughs) occurring as holonyms and categorizes the toponym under
[[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for
the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if
`c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these
categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be
categorized as e.g. [[:Category:Geographic and cultural areas of England]].
3. There is currently one property to control irregular plurals of placetypes:
* `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in
[[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`,
`-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent;
for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized
as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value
even when the default pluralization algorithm works correctly, if the default singularization algorithm won't
correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`).
4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those
categories:
* `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by
a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated
alike); (b) to determine the parent category of bare placetype categories (e.g. [[:Category:Villages]] for placetype
`village`); (c) to determine whether to add a parent category `political divisions of specific countries` to
qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are:
*# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire.
*# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement.
*# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a
settlement, such as wards and barangays.
*# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an
unincorporated community, farm or neighborhood.
*# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital
any more.
*# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc.
*# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university,
metro station, park or the like.
*# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary
greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`,
etc. qualifier has no effect on the category of these placetypes.
*# `generic place`: a place that isn't further qualified into any specific subtype.
* `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`,
`ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of
`dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc.
qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class`
is used. When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those
in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers
(one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified
qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are
looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map
`medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes
`ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by
`get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default
category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where
`kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data`
for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is
used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data`
but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal
error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have
an entry for `greenhouse`), we just track the occurrence and end up not categorizing.
* `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the
placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for
placetype `named buildings!`). If not specified, the first parent category is determined by the value of `class`,
using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]].
* `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent`
just above).
* `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of
`bare_category_parent` if it is a string.
* `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or
`ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that
always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is
a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the
implementation is the same.
* `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the
placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which
category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`,
`neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like
`Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like
`Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized
city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers
iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or
more categories are returned; see below for more specifics. (Note that countries for which e.g. a `district` is a
political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but
by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the
resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The
return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the
holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype
with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields:
** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an
entry in `placetype_data` but may not be the original placetype given by the user);
** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed;
** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME:
we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms);
** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]];
** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or
{{tl|demonym-noun}}, instead of being triggered by {{tl|place}}.
* `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the
`district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `;
see the section just above on `cat_handler`.
5. The following preposition-related property keys are recognized:
* `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`.
* `generic_before_non_cities`: If specified, the appropriate category description handler in
[[Module:category tree/topic cat/data/Places]] will recognize categories of the form
` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This
is used to generate descriptions for categories added by category handlers and by explicit category specs in the
placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify
a value for `class` so that the category tree code can determine whether it's a political or non-political division.
* `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities.
6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype:
* `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values
are:
*# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly
follows an entry placetype);
*# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple);
*# `"suf"` (the holonym will display as `Holonym placetype`);
*# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized).
* `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym.
Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly
using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype
`administrative region` specifies `suffix = "ภูมิภาค"`.
* `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym.
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the
holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take
precedence.
* `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix
requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies
`affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies
`no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified,
without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word.
* `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym).
Its return value is a string specifying the display form of the holonym.
7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms
of the specified placetype.
* `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype.
* `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry
placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article
`"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins
with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent
placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype
specified.
* `holonym_use_the`: Use `"the"` before holonyms of this placetype.
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which
must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a
fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the
fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either
directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this
placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back
to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in
`placetype_data` or an internal error occurs.
]==]
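-- Sketch of how a property can be resolved through a chain of `fallback` values with loop detection, as described
-- in the notes above. This is a standalone illustration kept inside a comment, not the code the module uses:
-- `sample_data` and `resolve_property` are hypothetical names, and the `link` values in `sample_data` are assumed.
--[=[
local sample_data = {
	["frazione"] = {link = "w", fallback = "hamlet"},
	["hamlet"] = {link = true, fallback = "village"},
	["village"] = {link = true, class = "settlement"},
}

-- Return the value of `prop` for `placetype`, following fallbacks, plus the placetype that supplied it.
local function resolve_property(data, placetype, prop)
	local seen = {}
	while placetype do
		if seen[placetype] then
			error("Internal error: fallback loop involving '" .. placetype .. "'")
		end
		seen[placetype] = true
		local spec = data[placetype]
		if not spec then
			error("Internal error: placetype '" .. placetype .. "' not found")
		end
		if spec[prop] ~= nil then
			return spec[prop], placetype
		end
		placetype = spec.fallback
	end
	return nil
end

local class, source = resolve_property(sample_data, "frazione", "class")
assert(class == "settlement" and source == "village")
-- A property set directly on the placetype wins over any fallback.
assert(resolve_property(sample_data, "frazione", "link") == "w")
]=]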
export.placetype_data = {
--[=[
If you need to sort the following, do this (using Vim):
1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line
entries.
2. Make sure the table uses tabs everywhere for indent, and not spaces.
3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
:'a,.s/\n/\\n/g
:s/\\n\(\t\[\)/\r\1/g
The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while
the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to
a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one
command.)
4. Execute the following to sort:
:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station"
before "hill" and "county borough" before "เทศมณฑล" because the space after e.g. "hill station" sorts before the
quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by
` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern).
5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
:'a,.s/\\n/\r/g
Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but
to insert a newline on the right side of a replacement you must use \r.
]=]
["*"] = {
link = false,
cat_handler = generic_place_cat_handler,
},
["administrative atoll"] = {
-- Maldives
link = "+w:administrative divisions of the Maldives",
preposition = "ของ",
class = "subpolity",
},
["administrative capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["administrative center"] = {
link = "w",
fallback = "เมืองหลวงที่ไม่ใช่นคร",
},
["administrative centre"] = {
link = "w",
fallback = "administrative center",
},
["administrative county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["administrative district"] = {
link = "w",
fallback = "อำเภอ",
},
["administrative headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["administrative region"] = {
link = true,
preposition = "ของ",
suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
fallback = "ภูมิภาค",
class = "subpolity",
},
["administrative seat"] = {
link = "w",
fallback = "administrative centre",
},
["administrative territory"] = {
link = "separately",
preposition = "ของ",
suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
fallback = "ดินแดน",
class = "subpolity",
},
["administrative unit"] = {
-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
link = "w",
class = "subpolity",
},
["administrative village"] = {
link = "w",
preposition = "ของ",
has_neighborhoods = true,
class = "settlement",
},
["aimag"] = {
-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
link = "w",
fallback = "prefecture",
},
["airport"] = {
link = true,
class = "man-made structure",
default = {true},
},
["alliance"] = {
link = true,
fallback = "confederation",
},
["archipelago"] = {
link = true,
fallback = "เกาะ",
},
["area"] = {
link = true,
preposition = "ของ",
fallback = "geographic and cultural area",
-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
class = "subpolity",
former_type = "geographic region",
cat_handler = district_neighborhood_cat_handler,
},
["arm"] = {
link = true,
preposition = "ของ",
class = "natural feature",
default = {"ทะเล"},
},
["arrondissement"] = {
link = true,
preposition = "ของ",
-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity",
has_neighborhoods = true,
},
["associated province"] = {
link = "separately",
fallback = "จังหวัด",
},
["atoll"] = {
-- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to
-- conditionalize `class` on the country. See also `administrative atoll`.
link = true,
class = "natural feature",
bare_category_parent = "เกาะ",
default = {true},
},
["autonomous city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
has_neighborhoods = true,
},
["autonomous community"] = {
-- Spain; refers to regional entities, not village-like entities, as might be expected from "community"
link = true,
preposition = "ของ",
class = "subpolity",
},
["autonomous island"] = {
-- Comoros; seems similar to an administrative atoll of the Maldives.
link = "+w:autonomous islands of Comoros",
preposition = "ของ",
class = "subpolity",
},
["autonomous oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "oblast",
class = "subpolity",
},
["autonomous okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = "okrug",
class = "subpolity",
},
["autonomous prefecture"] = {
link = true,
fallback = "prefecture",
},
["autonomous province"] = {
link = "w",
fallback = "จังหวัด",
},
["autonomous region"] = {
link = "w",
preposition = "ของ",
fallback = "administrative region",
-- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region"
-- if the user writes 'ar:Suf/Tibet'.
affix = "autonomous region",
},
["autonomous republic"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territorial unit"] = {
-- Moldova; only two of them, one for Gagauzia and one for Transnistria.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["autonomous territory"] = {
link = "w",
fallback = "dependent territory",
},
["bailiwick"] = {
-- Jersey, etc.
link = true,
fallback = "องค์การทางการเมือง",
},
["barangay"] = {
-- Philippines
link = true,
class = "settlement",
-- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use
-- some of the properties of a neighborhood.
fallback = "neighborhood",
},
["barrio"] = {
-- Spanish-speaking countries; Philippines
link = true,
-- FIXME: Not completely correct; in some countries barrios are formal administrative divisions of a city.
-- `class` will need to conditionalize on the country to be completely correct.
fallback = "neighborhood",
},
["basin"] = {
link = true,
fallback = "ทะเลสาบ",
},
["bay"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["beach"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"water"},
default = {true},
},
["beach resort"] = {
link = "w",
fallback = "resort town",
},
["bishopric"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["bodies of water!"] = {
-- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to
-- straighten out the type vs. name vs. related-to issue.
category_link = "[[body of water|bodies of water]]",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"},
},
["borough"] = {
link = true,
preposition = "ของ",
display_handler = borough_display_handler,
has_neighborhoods = true,
-- "former borough" could be a former settlement or a former part of a city but seems more likely to
-- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this
-- properly.
class = "subpolity",
-- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger.
},
["borough seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["branch"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["bridge"] = {
link = true,
class = "man-made structure",
default = {"Named bridges"},
},
["building"] = {
link = true,
class = "man-made structure",
default = {"Named buildings"},
},
["built-up area"] = {
link = "w",
fallback = "area",
},
["burgh"] = {
link = true,
fallback = "borough",
},
["business park"] = {
link = true,
fallback = "park",
},
["caliphate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["canton"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["cape"] = {
link = true,
fallback = "headland",
},
["capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["เมืองหลวง"] = {
link = true,
category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
bare_category_parent = "นคร",
cat_handler = capital_city_cat_handler,
default = {true},
-- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}}
-- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't
-- match against the placetype 'city' of Melbourne.
fallback = "นคร",
},
["caplc"] = {
link = "[[capital]] and [[large]]st [[city]]",
plural_link = false,
fallback = "เมืองหลวง",
},
["captaincy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["caravan city"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"ANCIENT", "FORMER"},
},
["castle"] = {
link = true,
fallback = "building",
},
["cathedral city"] = {
link = true,
fallback = "นคร",
},
["cattle station"] = {
-- Australia
link = true,
fallback = "farm",
},
["census area"] = {
link = true,
affix_type = "Suf",
has_neighborhoods = true,
class = "non-admin settlement",
},
["census-designated place"] = {
-- United States
link = true,
class = "non-admin settlement",
},
["census division"] = {
-- Canada
link = "w",
preposition = "ของ",
class = "subpolity",
},
["census town"] = {
link = "w",
fallback = "เมือง",
},
["central business district"] = {
link = true,
fallback = "neighborhood",
},
["cercle"] = {
-- Mali
link = "+w:cercles of Mali",
preposition = "ของ",
class = "subpolity",
},
["ceremonial county"] = {
link = true,
fallback = "เทศมณฑล",
},
["chain of islands"] = {
link = "[[chain]] of [[island]]s",
plural = "chains of islands",
plural_link = "[[chain]]s of [[island]]s",
fallback = "เกาะ",
},
["channel"] = {
link = true,
fallback = "strait",
},
["charter community"] = {
-- Northwest Territories, Canada
link = "w",
fallback = "village",
},
["นคร"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["city-state"] = {
link = true,
category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
has_neighborhoods = true,
class = "settlement",
["continent/*"] = {"City-states", "นครใน+++", "ประเทศใน+++", "เมืองหลวงของ"},
default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"},
},
["civil parish"] = {
-- Mostly England; similar to municipalities
link = true,
preposition = "ของ",
affix_type = "suf",
has_neighborhoods = true,
class = "subpolity",
},
["claimed political division"] = {
link = "[[claim]]ed [[political]] [[division]]",
class = "subpolity",
default = {true},
},
["co-capital"] = {
link = "[[co-]][[capital]]",
fallback = "เมืองหลวง",
},
["coal city"] = {
link = "+w:coal town",
fallback = "นคร",
},
["coal town"] = {
link = "w",
fallback = "เมือง",
},
["collectivity"] = {
link = "w",
preposition = "ของ",
-- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities)
class = "subpolity",
},
["colony"] = {
link = true,
fallback = "dependent territory",
},
["comarca"] = {
-- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of
-- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it
-- sits between municipalities and provinces, something like a county or district.
link = true,
preposition = "ของ",
class = "subpolity",
},
["commandery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["commonwealth"] = {
link = true,
preposition = "ของ",
-- No default; applies specifically to Puerto Rico
class = "subpolity",
},
["commune"] = {
link = true,
fallback = "เทศบาล",
},
["community"] = {
link = true,
category_link = "[[community|communities]] of all sizes",
fallback = "village",
},
["community development block"] = {
-- In India; appears to be similar to a rural municipality and groups several villages. It is unclear whether
-- there will be neighborhoods, so `has_neighborhoods` is not set for now.
link = "w",
affix_type = "suf",
no_affix_strings = "block",
class = "subpolity",
},
["comune"] = {
-- Italy, Switzerland
link = true,
fallback = "เทศบาล",
},
["condominium"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["confederacy"] = {
link = true,
fallback = "confederation",
},
["confederation"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["constituency"] = {
-- currently we have them as political divisions of Namibia but many countries have them
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent country"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["constituent part"] = {
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["constituent republic"] = {
-- Of Russia, Yugoslavia, etc.
link = "separately",
preposition = "ของ",
class = "subpolity",
},
["counties and county-level cities!"] = {
-- This is used when grouping counties and county-level cities under prefecture-level cities in China.
category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
class = "subpolity",
},
["ทวีป"] = {
link = true,
category_link = false, -- can't occur as a bare category
class = "natural feature",
default = {"Continents and continental regions"},
},
["continental region"] = {
link = "separately",
category_link = false, -- can't occur as a bare category
class = "geographic region",
fallback = "ทวีป",
},
["continents and continental regions!"] = {
category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
class = "geographic region",
},
["council area"] = {
link = true,
-- in Scotland; similar to a county
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["ประเทศ"] = {
link = true,
class = "polity", -- do not translate `class`
["continent/*"] = {true, "ประเทศ"},
default = {true},
},
["country-like entities!"] = {
category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
class = "polity", -- do not translate `class`
},
["เทศมณฑล"] = {
link = true,
preposition = "ของ",
display_handler = county_display_handler,
class = "subpolity",
},
["county borough"] = {
link = true,
-- in Wales; similar to a county
preposition = "ของ",
affix_type = "suf",
fallback = "borough",
class = "subpolity",
},
["county seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["county town"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
fallback = "เมือง",
has_neighborhoods = true,
class = "capital",
},
["county-administered city"] = {
-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w",
fallback = "นคร",
has_neighborhoods = true,
class = "settlement",
},
["county-controlled city"] = {
-- Taiwan
link = "w",
fallback = "county-administered city",
},
["county-level city"] = {
-- PR China
link = "w",
fallback = "prefecture-level city",
},
["crater lake"] = {
link = true,
fallback = "ทะเลสาบ",
},
["creek"] = {
link = true,
fallback = "stream",
},
["Crown colony"] = {
link = "+crown colony",
fallback = "crown colony",
},
["crown colony"] = {
link = true,
fallback = "colony",
},
["Crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["crown dependency"] = {
link = true,
fallback = "dependent territory",
},
["cultural area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["cultural region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["delegation"] = {
-- Tunisia
link = "+w:delegations of Tunisia",
preposition = "ของ",
class = "subpolity",
},
["department"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["departmental capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dependency"] = {
link = true,
fallback = "dependent territory",
},
["dependent territory"] = {
link = "w",
preposition = "ของ",
class = "subpolity",
former_type = "dependent territory",
bare_category_parent = "political divisions",
["country/*"] = {true},
default = {true},
},
["desert"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems"},
default = {true},
},
["deserted mediaeval village"] = {
link = "w",
fallback = "deserted medieval village",
},
["deserted medieval village"] = {
link = "w",
fallback = "ANCIENT settlement",
},
["direct-administered municipality"] = {
-- China
link = "+w:direct-administered municipalities of China",
fallback = "เทศบาล",
},
["direct-controlled municipality"] = {
-- several countries
link = "w",
fallback = "เทศบาล",
},
["distributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["อำเภอ"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to
-- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class
-- is "settlement" or "subpolity".
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
-- No default. Countries for which districts are political divisions will get entries.
},
["districts and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Portugal.
category_link = "[[district]]s and [[autonomous region]]s",
class = "subpolity",
},
["districts and autonomous territorial units!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Moldova.
category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s",
class = "subpolity",
},
["district capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["district headquarters"] = {
link = "separately",
fallback = "administrative centre",
},
["district municipality"] = {
-- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in
-- South Africa, district municipalities group local municipalities and hence won't have neighborhoods.
link = "w",
preposition = "ของ",
affix_type = "suf",
no_affix_strings = {"อำเภอ", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["division"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["division capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["dome"] = {
link = true,
fallback = "ภูเขา",
},
["dormant volcano"] = {
link = true,
fallback = "volcano",
},
["duchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["emirate"] = {
link = true,
preposition = "ของ",
-- FIXME: Can be subpolities (of the United Arab Emirates).
fallback = "องค์การทางการเมือง",
},
["จักรวรรดิ"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["enclave"] = {
link = true,
preposition = "ของ",
-- Enclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["entity"] = {
-- Bosnia and Herzegovina
link = "+w:entities of Bosnia and Herzegovina",
preposition = "ของ",
class = "subpolity",
},
["escarpment"] = {
link = true,
fallback = "ภูเขา",
},
["ethnographic region"] = {
-- used in Lithuania
link = "+w:ethnographic regions of Lithuania",
fallback = "geographic and cultural area",
},
["exclave"] = {
link = true,
preposition = "ของ",
-- Exclaves can theoretically be any size but assume a subpolity.
class = "subpolity",
},
["external territory"] = {
link = "separately",
fallback = "dependent territory",
},
["farm"] = {
link = true,
class = "non-admin settlement",
default = {"Farms and ranches"},
},
["farms and ranches!"] = {
category_link = "[[farm]]s and [[ranch]]es",
class = "non-admin settlement",
},
["federal city"] = {
link = "w",
preposition = "ของ",
fallback = "นคร",
},
["federal district"] = {
link = true,
preposition = "ของ",
-- Might have neighborhoods as federal districts are often cities (e.g. Mexico City)
has_neighborhoods = true,
class = "settlement",
},
["federal subject"] = {
-- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais,
-- autonomous okrugs and autonomous oblasts).
link = "w",
preposition = "ของ",
class = "subpolity",
},
["federal territory"] = {
link = "w",
fallback = "ดินแดน",
},
["fictional location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["First Nations reserve"] = {
-- Canada
link = "[[First Nations]] [[w:Indian reserve|reserve]]",
-- Wikipedia uses "Indian reserve"; presumably that is the legal term
fallback = "Indian reserve",
class = "subpolity",
},
["fjord"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["footpath"] = {
link = true,
fallback = "road",
},
["forest"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ecosystems", "forestry"},
default = {true},
},
["fort"] = {
link = true,
fallback = "building",
},
["fortress"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- fortresses -> fortresse, so put an entry here to ensure we singularize correctly.
plural = "fortresses",
fallback = "building",
},
["frazione"] = {
link = "w",
fallback = "hamlet",
},
["freeway"] = {
link = true,
fallback = "road",
},
["French prefecture"] = {
link = "[[w:prefectures in France|prefecture]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
},
["geographic and cultural area"] = {
link = "+w:cultural area",
-- `generic_before_non_cities` is used when generating the category description of categories of the format
-- `Geographic and cultural areas of PLACE`. `preposition` is used when generating the {{place}} description and
-- categories for any placetype that falls back to `geographic and cultural area`.
generic_before_non_cities = "ของ",
preposition = "ของ",
class = "geographic region",
bare_category_parent = "สถานที่",
["country/*"] = {true},
["constituent country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["geographic area"] = {
link = "+w:geographic region",
fallback = "geographic and cultural area",
},
["geographic region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical area"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geographical region"] = {
link = "w",
fallback = "geographic and cultural area",
},
["geopolitical zone"] = {
-- Nigeria
link = true,
preposition = "ของ",
class = "subpolity",
},
["gewog"] = {
-- Bhutan
link = true,
preposition = "ของ",
class = "subpolity",
},
["ghost town"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
bare_category_parent = "former settlements",
cat_handler = city_type_cat_handler,
default = {true},
},
["glen"] = {
link = true,
fallback = "valley",
},
["governorate"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["greater administrative region"] = {
-- China (former division)
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["gromada"] = {
-- Poland (former division)
link = "w",
preposition = "ของ",
affix_type = "Pref",
class = "subpolity",
inherently_former = {"FORMER"},
},
["group of islands"] = {
link = "[[group]] of [[island]]s",
plural = "groups of islands",
plural_link = "[[group]]s of [[island]]s",
fallback = "island group",
},
["gulf"] = {
link = true,
preposition = "ของ",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["hamlet"] = {
link = true,
fallback = "village",
},
["harbor city"] = {
link = "separately",
fallback = "นคร",
},
["harbor town"] = {
link = "separately",
fallback = "เมือง",
},
["harbour city"] = {
link = "separately",
fallback = "นคร",
},
["harbour town"] = {
link = "separately",
fallback = "เมือง",
},
["headland"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["headquarters"] = {
link = "w",
fallback = "administrative centre",
},
["heath"] = {
link = true,
fallback = "moor",
},
["hemisphere"] = {
link = true,
entry_placetype_use_the = true,
fallback = "continental region",
},
["highway"] = {
link = true,
fallback = "road",
},
["hill"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["hill station"] = {
link = "w",
fallback = "เมือง",
},
["hill town"] = {
link = "w",
fallback = "เมือง",
},
["historic region"] = {
-- provided only for the link
link = "+w:historical region",
fallback = "FORMER geographic region",
},
["historical county"] = {
-- needed for historical counties of England/etc.
link = "+w:historic county",
fallback = "FORMER subpolity",
},
["historical region"] = {
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["home rule city"] = {
link = "w",
fallback = "นคร",
},
["home rule municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["hot spring"] = {
link = true,
fallback = "spring",
},
["house"] = {
link = true,
fallback = "building",
},
["housing estate"] = {
-- not the same as a housing project (i.e. public housing)
link = true,
-- not exactly the case but approximately
fallback = "neighborhood",
},
["hromada"] = {
-- Ukraine
link = "w",
disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'",
disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["inactive volcano"] = {
link = "w",
fallback = "dormant volcano",
},
["independent city"] = {
link = true,
fallback = "นคร",
},
["independent town"] = {
link = "+independent city",
fallback = "เมือง",
},
["Indian reservation"] = {
link = "w",
-- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations
-- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts
-- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is,
-- so this must still be the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["Indian reserve"] = {
link = "w",
-- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that
-- is still the legal term.
preposition = "ของ",
class = "subpolity",
default = {true},
},
["inland sea"] = {
-- note, we also have 'inland' as a qualifier
link = true,
fallback = "ทะเล",
},
["inner city area"] = {
link = "[[inner city]] [[area]]",
fallback = "neighborhood",
},
["เกาะ"] = {
link = true,
preposition = "ของ",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["island country"] = {
-- FIXME: The following should map to both 'island' and 'country'.
link = "w",
fallback = "ประเทศ",
},
["island group"] = {
link = "separately",
fallback = "เกาะ",
},
["island municipality"] = {
link = "w",
fallback = "เทศบาล",
},
["islet"] = {
link = "w",
fallback = "เกาะ",
},
["Israeli settlement"] = {
link = "w",
class = "settlement",
default = {true},
},
["judicial capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["khanate"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["kibbutz"] = {
link = true,
plural = "kibbutzim",
class = "non-admin settlement",
default = {true},
},
["kingdom"] = {
link = true,
fallback = "monarchy",
},
["krai"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ทะเลสาบ"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["ธรณีสัณฐาน!"] = {
category_link = "[[ธรณีสัณฐาน]]",
bare_category_parent = "สถานที่",
addl_bare_category_parents = {"โลก"},
},
["largest city"] = {
link = "[[large]]st [[city]]",
entry_placetype_use_the = true,
fallback = "นคร",
has_neighborhoods = true,
},
["league"] = {
link = true,
fallback = "confederation",
},
["legislative capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["library"] = {
link = true,
fallback = "building",
},
["lieutenancy area"] = {
-- used in the United Kingdom; per Wikipedia:
-- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does
-- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of
-- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate
-- entities. In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on
-- those used for lieutenancy and local government between 1974 and 1996. The lieutenancy areas of Northern
-- Ireland correspond to the six counties and two former county boroughs.
link = "w",
fallback = "ceremonial county",
},
["local authority district"] = {
link = "w",
fallback = "local government district",
},
["local government area"] = {
-- Australia
link = "w",
preposition = "ของ",
class = "subpolity",
},
["local council"] = {
-- Malta; similar to municipalities
link = "+w:local councils of Malta",
preposition = "ของ",
fallback = "เทศบาล",
},
["local government district"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local government district with borough status"] = {
link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]",
plural = "local government districts with borough status",
plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]",
preposition = "ของ",
affix_type = "suf",
affix = "อำเภอ",
class = "subpolity",
},
["local urban district"] = {
link = "w",
fallback = "unincorporated community",
},
["locality"] = {
link = "+w:locality (settlement)",
-- not necessarily true, but usually is the case
fallback = "village",
},
["London borough"] = {
link = "w",
preposition = "ของ",
affix_type = "pref",
affix = "borough",
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["macroregion"] = {
link = true,
fallback = "ภูมิภาค",
},
["man-made structures!"] = {
category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s",
bare_category_parent = "สถานที่",
},
["manor"] = {
-- FIXME: or is this more like a farm?
link = true,
fallback = "building",
},
["marginal sea"] = {
link = true,
preposition = "ของ",
fallback = "ทะเล",
},
["market city"] = {
link = "+market town",
fallback = "นคร",
},
["market town"] = {
link = true,
fallback = "เมือง",
},
["massif"] = {
link = true,
fallback = "ภูเขา",
},
["megacity"] = {
link = true,
fallback = "นคร",
},
["metro station"] = {
link = true,
class = "man-made structure",
},
["metropolitan borough"] = {
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"borough", "นคร"},
fallback = "local government district",
has_neighborhoods = true,
},
["มหานคร"] = {
-- These exist e.g. in Italy and are more like municipalities or even provinces than cities.
link = true,
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"มหานคร", "นคร"},
class = "subpolity",
},
["metropolitan county"] = {
link = true,
fallback = "เทศมณฑล",
},
["metropolitan municipality"] = {
-- In South Africa, metropolitan municipalities group local municipalities and are like districts, between
-- provinces and municipalities.
-- In Turkey, metropolitan municipalities are province-level.
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"metropolitan", "เทศบาล"},
fallback = "เทศบาล",
class = "subpolity",
},
["microdistrict"] = {
-- residential complex in post-Soviet states
link = true,
fallback = "neighborhood",
},
["micronations!"] = {
-- FIXME, merge with microstate
category_link = "[[micronation]]s",
bare_category_parent = "ประเทศ",
},
["microstate"] = {
link = true,
fallback = "ประเทศ",
},
["military base"] = {
link = "w",
class = "settlement", -- or "man-made structure"?
default = {true},
},
["minster town"] = {
-- England
link = "separately",
fallback = "เมือง",
},
["monarchy"] = {
link = true,
fallback = "องค์การทางการเมือง",
},
["moor"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"},
default = {true},
},
["moorland"] = {
link = true,
fallback = "moor",
},
["motorway"] = {
link = true,
fallback = "road",
},
["ภูเขา"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["mountain indigenous district"] = {
-- Taiwan
link = "+w:district (Taiwan)",
fallback = "อำเภอ",
},
["mountain indigenous township"] = {
-- Taiwan
link = "+w:township (Taiwan)",
fallback = "township",
},
["mountain pass"] = {
link = true,
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "mountain passes",
class = "natural feature",
addl_bare_category_parents = {"ภูเขา"},
default = {true},
},
["เทือกเขา"] = {
link = true,
fallback = "ภูเขา",
},
["mountainous region"] = {
link = "separately",
fallback = "ภูมิภาค",
},
["mukim"] = {
-- Malaysia, Brunei, Indonesia, Singapore
link = true,
preposition = "ของ",
class = "subpolity",
},
["municipal district"] = {
link = "w",
-- meaning varies depending on the country; for now, assume no neighborhoods.
-- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms.
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "เทศบาล",
},
["เทศบาล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true,
class = "subpolity",
},
["municipality with city status"] = {
link = "[[municipality]] with [[w:city status|city status]]",
plural = "municipalities with city status",
plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]",
fallback = "เทศบาล",
},
["museum"] = {
link = true,
fallback = "building",
},
["mythological location"] = {
link = "separately",
former_type = "!",
class = "hypothetical location",
bare_category_parent = "สถานที่",
default = {true},
},
["named bridges!"] = {
category_link = "notable [[bridge]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"bridges"},
},
["named buildings!"] = {
category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"buildings"},
},
["named roads!"] = {
category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures",
bare_category_parent = "man-made structures",
addl_bare_category_parents = {"roads"},
},
["national capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["national park"] = {
link = true,
fallback = "park",
},
["natural features!"] = {
category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s",
bare_category_parent = "สถานที่",
},
["neighborhood"] = {
-- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which
-- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the
-- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the
-- category name. It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods".
-- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also
-- categorize as neighbo(u)rhoods.)
link = true,
-- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]].
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
-- The following text is suitable for the top-level description of a neighborhood as well as categories of the
-- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form
-- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]".
category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions",
-- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`,
-- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings
-- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in
-- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY`
-- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but
-- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.)
preposition = "ของ",
class = "non-admin settlement",
cat_handler = district_neighborhood_cat_handler,
},
["neighbourhood"] = {
link = true,
category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]",
category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions",
fallback = "neighborhood",
},
["new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
preposition = "ใน",
class = "subpolity", --?
},
["new town"] = {
link = true,
fallback = "เมือง",
},
["เมืองหลวงที่ไม่ใช่นคร"] = {
link = "[[เมืองหลวง]]",
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
cat_handler = function(data)
return capital_city_cat_handler(data, "non-city")
end,
-- FIXME, do we need the following?
default = {true},
},
["non-metropolitan county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["non-metropolitan district"] = {
link = "w",
fallback = "local government district",
},
["non-sovereign kingdom"] = {
-- especially in Africa and Asia
link = "+w:non-sovereign monarchy",
generic_before_non_cities = "ใน",
class = "subpolity",
["country/*"] = {true},
["continent/*"] = {true},
default = {true},
},
["non-sovereign monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["oblast"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["oblasts and autonomous republics!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Ukraine.
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s",
class = "subpolity",
},
["มหาสมุทร"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"ทะเล", "bodies of water"},
default = {true},
},
["okrug"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["overseas collectivity"] = {
link = "w",
fallback = "collectivity",
},
["overseas department"] = {
link = "w",
fallback = "department",
},
["overseas territory"] = {
link = "w",
fallback = "dependent territory",
},
["parish"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["parish municipality"] = {
-- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them.
link = "+w:parish municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true,
},
["parish seat"] = {
link = true,
entry_placetype_use_the = true,
preposition = "ของ",
class = "capital",
has_neighborhoods = true,
},
["park"] = {
link = true,
class = "man-made structure",
default = {true},
},
["pass"] = {
link = "+mountain pass",
-- The default plural algorithm gets this right but the singularization algorithm incorrectly converts
-- passes -> passe, so put an entry here to ensure we singularize correctly.
plural = "passes",
fallback = "mountain pass",
},
["path"] = {
link = true,
fallback = "road",
},
["peak"] = {
link = true,
fallback = "ภูเขา",
},
["peninsula"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
},
["periphery"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["สถานที่!"] = {
generic_before_non_cities = "ใน",
generic_before_cities = "ใน",
class = "generic place",
category_link = "[[place]]s of all sorts",
-- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and
-- language-specific variants such as [[Category:en:Places]]. The actual text for a language-specific variant is
-- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of"
-- portion is automatically generated by the appropriate handler in
-- [[Module:category tree/topic cat/data/Places]].
category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s",
bare_category_parent = "ชื่อ (หัวข้อ)",
},
["planned community"] = {
-- Include this so we don't categorize 'planned community' into villages, as 'community' does.
link = true,
class = "settlement",
has_neighborhoods = true,
},
["plateau"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true},
-- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category
},
["Polish colony"] = {
link = "[[w:colony (Poland)|colony]]",
affix_type = "suf",
affix = "colony",
fallback = "village",
has_neighborhoods = true,
},
["political divisions!"] = {
category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s",
bare_category_parent = "สถานที่",
},
["องค์การทางการเมือง"] = {
link = true,
category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]",
class = "polity", -- do not translate `class`
bare_category_parent = "สถานที่",
default = {true},
},
["populated place"] = {
link = "+w:populated place",
-- not necessarily true, but usually is the case
fallback = "village",
},
["port"] = {
link = true,
class = "man-made structure",
default = {true},
},
["port city"] = {
-- FIXME: should categorize into "Ports" as well as "นคร"
link = true,
fallback = "นคร",
},
["port town"] = {
-- FIXME: should categorize into "Ports" as well as "เมือง"
link = "w",
fallback = "เมือง",
},
["prefecture"] = {
-- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France.
-- May need `has_neighborhoods` to be a function.
link = true,
preposition = "ของ",
display_handler = prefecture_display_handler,
class = "subpolity",
},
["prefecture-level city"] = {
-- China; they are huge entities with a central city; not cities themselves.
link = "w",
preposition = "ของ",
class = "subpolity",
},
["preserved county"] = {
-- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more
-- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22.
link = "w",
preposition = "ของ",
class = "subpolity",
inherently_former = {"FORMER"},
},
["primary area"] = {
-- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden
link = "+w:sv:primärområde",
fallback = "neighborhood",
},
["principality"] = {
link = true,
fallback = "monarchy",
},
["promontory"] = {
link = true,
fallback = "headland",
},
["protectorate"] = {
link = true,
fallback = "dependent territory",
},
["จังหวัด"] = {
link = true,
preposition = "ของ",
display_handler = province_display_handler,
class = "subpolity",
},
["provinces and autonomous regions!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case China.
category_link = "[[province]]s and [[autonomous region]]s",
class = "subpolity",
},
["provinces and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Canada and Pakistan.
category_link = "[[province]]s and [[territory|territories]]",
class = "subpolity",
},
["provincial capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["raion"] = {
link = true,
preposition = "ของ",
affix_type = "Suf",
class = "subpolity",
},
["ranch"] = {
link = true,
fallback = "farm",
},
["range"] = {
-- FIXME: Where is this used? Is it a mountain range?
link = true,
holonym_use_the = true,
class = "natural feature",
},
["regency"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["ภูมิภาค"] = {
link = true,
preposition = "ของ",
-- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area'
fallback = "geographic and cultural area",
-- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region
class = "geographic region",
},
["regional capital"] = {
link = "separately",
fallback = "เมืองหลวง",
},
["regional county municipality"] = {
-- Quebec
link = "w",
preposition = "ของ",
affix_type = "Suf",
no_affix_strings = {"เทศบาล", "เทศมณฑล"},
fallback = "เทศบาล",
},
["regional district"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "อำเภอ",
fallback = "อำเภอ",
},
["regional municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
},
["regional unit"] = {
link = "w",
preposition = "ของ",
affix_type = "suf",
class = "subpolity",
},
["registration county"] = {
-- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical
-- purposes (registration of births, deaths and marriages, and for the output of census information).
link = "w",
fallback = "เทศมณฑล",
},
["republic"] = {
-- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case.
link = true,
fallback = "constituent republic",
},
["research base"] = {
link = "+w:research station",
fallback = "research station",
},
["research station"] = {
link = "w",
class = "non-admin settlement", -- or "man-made structure"?
default = {true},
},
["reservoir"] = {
link = true,
fallback = "ทะเลสาบ",
},
["residential area"] = {
link = "separately",
fallback = "neighborhood",
},
["resort city"] = {
link = "w",
fallback = "นคร",
},
["resort town"] = {
link = "w",
fallback = "เมือง",
},
["แม่น้ำ"] = {
link = true,
generic_before_non_cities = "ใน",
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
cat_handler = city_type_cat_handler,
["continent/*"] = {true},
default = {true},
},
["river island"] = {
link = "w",
fallback = "เกาะ",
},
["road"] = {
link = true,
class = "man-made structure",
default = {"Named roads"},
},
["Roman province"] = {
-- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire'
link = "w",
default = {"Provinces of the Roman Empire"},
class = "subpolity",
},
["royal borough"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = {"royal", "borough"},
fallback = "local government district with borough status",
has_neighborhoods = true,
},
["royal burgh"] = {
link = true,
fallback = "borough",
},
["royal capital"] = {
link = "w",
fallback = "เมืองหลวง",
},
["rural committee"] = {
-- Hong Kong; a group of villages
link = "w",
affix_type = "Suf",
has_neighborhoods = true,
class = "settlement",
},
["rural community"] = {
-- New Brunswick
link = "+w:list of municipalities in New Brunswick#Rural communities",
fallback = "เทศบาล",
},
["rural hromada"] = {
link = "[[rural]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["rural municipality"] = {
link = "w",
preposition = "ของ",
affix_type = "Pref",
no_affix_strings = "เทศบาล",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["rural township"] = {
-- Taiwan
link = "+w:rural township (Taiwan)",
fallback = "township",
},
["sanctuary"] = {
link = true,
fallback = "temple",
},
["satrapy"] = {
link = true,
preposition = "ของ",
class = "subpolity",
inherently_former = {"ANCIENT", "FORMER"},
},
["ทะเล"] = {
link = true,
holonym_use_the = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["seaport"] = {
link = true,
fallback = "port",
},
["seat"] = {
link = true,
fallback = "administrative centre",
},
["self-administered area"] = {
-- Myanmar (groups self-administered divisions and zones)
link = "+w:self-administered zone",
preposition = "ของ",
class = "subpolity",
},
["self-administered division"] = {
-- Myanmar (only one of them: Wa Self-Administered Division)
link = "w",
fallback = "self-administered area",
},
["self-administered zone"] = {
-- Myanmar (five of them)
link = "w",
fallback = "self-administered area",
},
["separatist state"] = {
link = "separately",
fallback = "unrecognized country",
},
["การตั้งถิ่นฐาน"] = {
link = true,
category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s",
bare_category_parent = "สถานที่",
-- not necessarily true, but usually is the case
fallback = "village",
},
["settlement hromada"] = {
link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["sheading"] = {
-- Isle of Man
link = true,
fallback = "อำเภอ",
},
["sheep station"] = {
-- Australia
link = true,
fallback = "farm",
},
["shire"] = {
link = true,
fallback = "เทศมณฑล",
},
["shire county"] = {
link = "w",
fallback = "เทศมณฑล",
},
["shire town"] = {
link = true,
fallback = "county seat",
},
["ski resort city"] = {
link = "[[ski resort]] [[city]]",
fallback = "นคร",
},
["ski resort town"] = {
link = "[[ski resort]] [[town]]",
fallback = "เมือง",
},
["spa city"] = {
link = "+w:spa town",
fallback = "นคร",
},
["spa town"] = {
link = "w",
fallback = "เมือง",
},
["space station"] = {
link = true,
fallback = "research station",
},
["special administrative region"] = {
-- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a
-- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia
-- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special
-- administrative areas"
link = "+w:special administrative regions of China",
preposition = "ของ",
class = "subpolity",
has_neighborhoods = true, --?
-- No suffix, since places in Hong Kong or Macau are listed without China (except Hong Kong and Macau themselves);
-- they also contain regions (or areas), e.g. [[Kowloon]], so a suffix would be confusing.
suffix = "",
},
["special collectivity"] = {
link = "w",
fallback = "collectivity",
},
["special municipality"] = {
-- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands
link = "w",
fallback = "เทศบาล",
},
["special ward"] = {
-- Tokyo
link = true,
fallback = "เทศบาล",
},
["spit"] = {
link = true,
fallback = "peninsula",
},
["spring"] = {
link = true,
class = "natural feature",
default = {true},
},
["star"] = {
link = true,
class = "natural feature",
default = {true},
},
["รัฐ"] = {
link = true,
preposition = "ของ",
class = "subpolity",
-- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign
-- entity. The latter appears more common (e.g. in various "ancient states" of East Asia).
former_type = "องค์การทางการเมือง",
},
["states and territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case Australia.
category_link = "[[state]]s and [[territory|territories]]",
class = "subpolity",
},
["states and union territories!"] = {
-- This and other similar "combined placetypes" are for use in the plural when grouping first-level
-- administrative regions of certain countries, in this case India.
category_link = "[[state]]s and [[union territory|union territories]]",
class = "subpolity",
},
["state capital"] = {
link = true,
fallback = "เมืองหลวง",
},
["state park"] = {
link = true,
fallback = "park",
},
["state-level new area"] = {
-- China (type of economic development zone, varying greatly in size)
link = "w",
fallback = "new area",
},
["statistical region"] = {
-- Slovenia
link = true,
fallback = "administrative region",
},
["statutory city"] = {
link = "w",
fallback = "นคร",
},
["statutory town"] = {
link = "w",
fallback = "เมือง",
},
["strait"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"bodies of water"},
default = {true},
},
["stream"] = {
link = true,
fallback = "แม่น้ำ",
},
["street"] = {
link = true,
fallback = "road",
},
["strip"] = {
link = true,
fallback = "geographic region",
},
["strip of land"] = {
link = "[[strip]] of [[land]]",
plural = "strips of land",
plural_link = "[[strip]]s of [[land]]",
fallback = "geographic region",
},
["sub-metropolitan city"] = {
link = "+w:List of cities in Nepal#Sub-metropolitan cities",
fallback = "นคร",
},
["sub-prefectural city"] = {
link = "w",
fallback = "subprovincial city",
},
["ตำบล"] = {
link = true,
preposition = "ของ",
has_neighborhoods = true, --?
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler
class = "subpolity",
default = {true},
},
["subdivision"] = {
link = true,
preposition = "ของ",
affix_type = "suf",
-- FIXME: subdivisions can be neighborhood-like or larger; need a handler
class = "subpolity",
cat_handler = district_neighborhood_cat_handler,
},
["submerged ghost town"] = {
-- FIXME: Consider just having "submerged" as a qualifier.
link = "[[submerged]] [[ghost town]]",
fallback = "ghost town",
},
["subnational kingdom"] = {
link = "+w:subnational monarchy",
fallback = "non-sovereign kingdom",
},
["subnational monarchy"] = {
link = "w",
fallback = "non-sovereign kingdom",
},
["subprefecture"] = {
link = true,
affix_type = "suf",
preposition = "ของ",
class = "subpolity",
},
["subprovince"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["subprovincial city"] = {
link = "w",
-- China; special status given to certain prefecture-level cities
fallback = "prefecture-level city",
},
["subprovincial district"] = {
link = "w",
-- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts
preposition = "ของ",
class = "subpolity",
},
["subregion"] = {
link = true,
fallback = "geographic region",
},
["suburb"] = {
link = true,
-- The following text is suitable for the top-level description of a suburb as well as categories of the form
-- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago',
-- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]".
category_link = "[[suburb]]s of [[city|cities]]",
category_link_before_city = "[[suburb]]s",
-- See comments under "neighborhood" for the following three settings. They are used by
-- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories
-- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a
-- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.)
generic_before_non_cities = "ใน",
generic_before_cities = "ของ",
preposition = "ของ",
has_neighborhoods = true, --?
class = "non-admin settlement", --?
cat_handler = district_neighborhood_cat_handler,
},
["suburban area"] = {
link = "w",
fallback = "suburb",
},
["subway station"] = {
link = "w",
fallback = "metro station",
},
["sum"] = {
-- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia),
-- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion).
link = "+w:sum (administrative division)",
-- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler
-- which we don't want to be active (FIXME: If the display handler would be active, that's a bug).
fallback = "division",
},
["supercontinent"] = {
link = true,
fallback = "continent",
},
["tehsil"] = {
link = true,
affix_type = "suf",
no_affix_strings = {"tehsil", "tahsil"},
class = "subpolity",
},
["temple"] = {
link = true,
fallback = "building",
},
["territorial authority"] = {
link = "w",
fallback = "อำเภอ",
},
["ดินแดน"] = {
link = true,
preposition = "ของ",
class = "subpolity",
},
["theme"] = {
link = "+w:theme (Byzantine district)",
preposition = "ของ",
class = "subpolity",
},
["เมือง"] = {
link = true,
generic_before_non_cities = "ใน",
has_neighborhoods = true,
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["town with bystatus"] = {
-- can't use templates in links currently
link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]",
plural = "towns with bystatus",
plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]",
fallback = "เมือง",
},
["township"] = {
link = true,
has_neighborhoods = true,
class = "settlement", --?
default = {true},
},
["township municipality"] = {
-- Quebec
link = "+w:township municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["traditional county"] = {
link = true,
fallback = "เทศมณฑล",
},
["traditional region"] = {
-- FIXME: Verify this works. Same for 'historic(al) region'.
-- provided only for the link
link = "w",
fallback = "FORMER geographic region",
},
["trail"] = {
link = true,
fallback = "road",
},
["treaty port"] = {
link = "w",
fallback = "นคร",
class = "settlement",
inherently_former = {"FORMER"},
},
["tributary"] = {
link = true,
preposition = "ของ",
fallback = "แม่น้ำ",
},
["underground station"] = {
link = "w",
fallback = "metro station",
},
["unincorporated area"] = {
link = "w",
-- I don't know if this fallback makes sense everywhere.
fallback = "unincorporated community",
},
["unincorporated community"] = {
link = true,
generic_before_non_cities = "ใน",
class = "non-admin settlement",
},
["unincorporated territory"] = {
link = "w",
fallback = "ดินแดน",
},
["union territory"] = {
-- India
link = true,
preposition = "ของ",
entry_placetype_indefinite_article = "a",
class = "subpolity",
},
["unitary authority"] = {
-- UK, New Zealand
link = true,
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["unitary district"] = {
link = "w",
entry_placetype_indefinite_article = "a",
fallback = "local government district",
},
["united township municipality"] = {
-- Quebec
link = "+w:united township municipality (Quebec)",
entry_placetype_indefinite_article = "a",
fallback = "township municipality",
has_neighborhoods = true, --?
},
["university"] = {
link = true,
entry_placetype_indefinite_article = "a",
class = "man-made structure",
default = {true},
},
["unrecognised country"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized and nearly unrecognized countries!"] = {
category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}",
bare_category_parent = "country-like entities",
},
["unrecognized country"] = {
link = "w",
class = "polity", -- do not translate `class`
default = {"Unrecognized and nearly unrecognized countries"},
},
["unrecognised state"] = {
link = "w",
fallback = "unrecognized country",
},
["unrecognized state"] = {
link = "w",
fallback = "unrecognized country",
},
["urban area"] = {
link = "separately",
fallback = "neighborhood",
},
["urban hromada"] = {
link = "[[urban]] [[w:hromada|hromada]]",
affix_type = "suf",
fallback = "hromada",
},
["urban service area"] = {
-- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger
-- cities and treated as equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]].)
link = "w",
fallback = "นคร",
},
["urban township"] = {
link = "w",
fallback = "township",
},
["urban-type settlement"] = {
-- Appears to be a particular type of small urban settlement in post-Soviet states
-- that had an administrative function.
link = "w",
fallback = "เมือง",
},
["valley"] = {
link = true,
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน", "water"},
default = {true},
},
["viceroyalty"] = {
-- in essence, a type of colony
link = true,
fallback = "dependent territory",
},
["village"] = {
link = true,
generic_before_non_cities = "ใน",
category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s",
class = "settlement",
cat_handler = city_type_cat_handler,
default = {true},
},
["village development committee"] = {
-- former administrative structure in Nepal; also exists in India but not as a formal unit
link = "+w:village development committee (Nepal)",
inherently_former = {"FORMER"},
fallback = "village",
},
["village municipality"] = {
-- Quebec
link = "+w:village municipality (Quebec)",
preposition = "ของ",
fallback = "เทศบาล",
has_neighborhoods = true, --?
},
["voivodeship"] = {
-- Poland
link = true,
display_handler = voivodeship_display_handler,
preposition = "ของ",
class = "subpolity",
},
["volcano"] = {
link = true,
plural = "volcanoes",
class = "natural feature",
addl_bare_category_parents = {"ธรณีสัณฐาน"},
default = {true, "ภูเขา"},
},
["ward"] = {
link = true,
class = "settlement",
-- Wards are formal administrative divisions of a city but have some properties of neighborhoods.
fallback = "neighborhood",
},
["watercourse"] = {
link = true,
fallback = "channel",
},
["Welsh community"] = {
-- Wales
link = "[[w:community (Wales)|community]]",
preposition = "ของ",
affix_type = "suf",
affix = "community",
has_neighborhoods = true,
class = "settlement",
},
["zone"] = {
-- administrative division of Ethiopia, Qatar, Nepal, India
link = "+w:zone#Place names",
preposition = "ของ",
class = "subpolity",
},
----------------------------------------------------------------------------------------------
-- Categories for former places --
----------------------------------------------------------------------------------------------
["ANCIENT capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
-- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still
-- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category.
default = {"Ancient settlements", "Former capitals"},
},
["ANCIENT non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "ANCIENT settlement",
},
["ANCIENT settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Ancient settlements"},
},
["ancient settlements!"] = {
category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]",
bare_category_parent = "former settlements",
},
["FORMER capital"] = {
link = false,
entry_placetype_use_the = true,
preposition = "ของ",
has_neighborhoods = true,
class = "capital",
default = {"Former capitals"},
},
["former capitals!"] = {
category_link = "former [[capital]] [[city|cities]] and [[town]]s",
bare_category_parent = "การตั้งถิ่นฐาน",
},
["former counties and county-level cities!"] = {
-- For categorizing former counties and county-level cities of China
category_link = "no-longer-existing [[county|counties]] and [[county-level city|county-level cities]]",
bare_category_breadcrumb = "counties and county-level cities",
bare_category_parent = "former political divisions",
},
["FORMER county"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER county-level city"] = {
-- For categorizing former counties and county-level cities of China
link = false,
fallback = "FORMER subpolity",
},
["former countries and country-like entities!"] = {
category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist",
bare_category_breadcrumb = "countries and country-like entities",
bare_category_parent = "former polities",
},
["FORMER country"] = {
link = false,
class = "polity", -- do not translate `class`
default = {"Former countries and country-like entities"},
},
["former dependent territories!"] = {
category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist",
bare_category_breadcrumb = "dependent territories",
bare_category_parent = "former political divisions",
},
["FORMER dependent territory"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former dependent territories"},
},
["former districts!"] = {
-- For categorizing former districts of China
category_link = "no-longer-existing [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "former political divisions",
},
["FORMER district"] = {
-- For categorizing former districts of China
link = false,
fallback = "FORMER subpolity",
},
["FORMER geographic region"] = {
link = false,
fallback = "geographic and cultural area",
},
["FORMER man-made structure"] = {
link = false,
class = "man-made structure",
default = {"Former man-made structures"},
},
["former man-made structures!"] = {
category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist",
bare_category_breadcrumb = "man-made structures",
bare_category_parent = "former places",
},
["former municipalities!"] = {
-- For categorizing former municipalities of the Netherlands
category_link = "no-longer-existing [[municipality|municipalities]]",
bare_category_breadcrumb = "เทศบาล",
bare_category_parent = "former political divisions",
},
["FORMER municipality"] = {
-- For categorizing former municipalities of the Netherlands
link = false,
fallback = "FORMER subpolity",
},
["FORMER natural feature"] = {
link = false,
class = "natural feature",
default = {"Former natural features"},
},
["former natural features!"] = {
category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist",
bare_category_breadcrumb = "natural features",
bare_category_parent = "former places",
},
["FORMER non-admin settlement"] = {
link = false,
class = "non-admin settlement",
fallback = "FORMER settlement",
},
["former places!"] = {
category_link = "[[place]]s of all sorts that no longer exist",
bare_category_breadcrumb = "former",
bare_category_parent = "สถานที่",
},
["former political divisions!"] = {
category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former places",
},
["former polities!"] = {
category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former places",
},
["FORMER polity"] = {
link = false,
class = "polity", --ห้ามแปล class
default = {"Former polities"},
},
["former prefectures!"] = {
-- For categorizing former prefectures of China
category_link = "no-longer-existing [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "former political divisions",
},
["FORMER prefecture"] = {
-- For categorizing former prefectures of China
link = false,
fallback = "FORMER subpolity",
},
["former provinces!"] = {
-- For categorizing former provinces of China, etc.
category_link = "no-longer-existing [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "former political divisions",
},
["FORMER province"] = {
-- For categorizing ancient/historical/former provinces of the Roman Empire
link = false,
fallback = "FORMER subpolity",
},
["former region"] = {
-- A former region is considered a former political division, but not a 'historical/traditional/etc.' region.
link = "separately",
preposition = "ของ",
inherently_former = {"FORMER"},
class = "subpolity",
},
["FORMER settlement"] = {
link = false,
has_neighborhoods = true,
class = "settlement",
default = {"Former settlements"},
},
["former settlements!"] = {
category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former political divisions",
},
["FORMER subpolity"] = {
link = false,
preposition = "ของ",
class = "subpolity",
default = {"Former political divisions"},
},
----------------------------------------------------------------------------------------------
-- form-of categories --
----------------------------------------------------------------------------------------------
---------- Abbreviations ----------
["abbreviations of counties!"] = {
-- For categorizing abbreviations of counties of e.g. England
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]",
bare_category_breadcrumb = "เทศมณฑล",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of places",
},
["abbreviations of departments!"] = {
-- For categorizing abbreviations of departments of e.g. France
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s",
bare_category_breadcrumb = "departments",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of districts!"] = {
-- For categorizing abbreviations of districts of e.g. ???
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s",
bare_category_breadcrumb = "อำเภอ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of divisions!"] = {
-- For categorizing abbreviations of divisions of e.g. Bangladesh
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s",
bare_category_breadcrumb = "divisions",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of former countries!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "abbreviations of former places",
},
["abbreviations of former places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}},
},
["abbreviations of places!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "abbreviations",
bare_category_parent = "สถานที่",
},
["abbreviations of political divisions!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "abbreviations of places",
},
["abbreviations of prefectures!"] = {
-- For categorizing abbreviations of prefectures of e.g. Japan
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s",
bare_category_breadcrumb = "prefectures",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces!"] = {
-- For categorizing abbreviations of provinces of e.g. Canada
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s",
bare_category_breadcrumb = "จังหวัด",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of provinces and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]",
bare_category_breadcrumb = "provinces and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of regions!"] = {
-- For categorizing abbreviations of regions of e.g. Italy
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s",
bare_category_breadcrumb = "ภูมิภาค",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states!"] = {
-- For categorizing abbreviations of states of e.g. the United States
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]",
bare_category_breadcrumb = "states and territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of states and union territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]",
bare_category_breadcrumb = "states and union territories",
bare_category_parent = "abbreviations of political divisions",
},
["abbreviations of territories!"] = {
full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]",
bare_category_breadcrumb = "ดินแดน",
bare_category_parent = "abbreviations of political divisions",
},
["ABBREVIATION_OF country"] = {
link = false,
default = {"Abbreviations of countries"},
},
["ABBREVIATION_OF county"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF department"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF district"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF division"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF FORMER country"] = {
link = false,
default = {"Abbreviations of former countries"},
},
["ABBREVIATION_OF FORMER place"] = {
link = false,
default = {"Abbreviations of former places"},
},
["ABBREVIATION_OF place"] = {
link = false,
default = {"Abbreviations of places"},
},
["ABBREVIATION_OF prefecture"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF province"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF region"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF state"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF subpolity"] = {
link = false,
default = {"Abbreviations of political divisions"},
},
["ABBREVIATION_OF territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
["ABBREVIATION_OF union territory"] = {
link = false,
fallback = "ABBREVIATION_OF subpolity",
},
---------- Archaic forms ----------
["archaic forms of places!"] = {
full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "archaic forms",
bare_category_parent = "สถานที่",
},
["ARCHAIC_FORM_OF place"] = {
link = false,
default = {"Archaic forms of places"},
},
---------- Clippings ----------
["clippings of places!"] = {
full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "clippings",
bare_category_parent = "สถานที่",
},
["CLIPPING_OF place"] = {
link = false,
default = {"Clippings of places"},
},
---------- Dated forms ----------
["dated forms of places!"] = {
full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "dated forms",
bare_category_parent = "สถานที่",
},
["DATED_FORM_OF place"] = {
link = false,
default = {"Dated forms of places"},
},
---------- Derogatory names ----------
["derogatory names for cities!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["derogatory names for continents!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for continents"},
},
["derogatory names for countries!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for countries"},
},
["derogatory names for places!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s",
bare_category_breadcrumb = "derogatory names",
bare_category_parent = "nicknames for places",
},
["derogatory names for states!"] = {
full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "derogatory names for places",
addl_bare_category_parents = {"nicknames for states"},
},
["DEROGATORY_NAME_FOR capital"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR city"] = {
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR continent"] = {
link = false,
default = {"Derogatory names for continents"},
},
["DEROGATORY_NAME_FOR country"] = {
link = false,
default = {"Derogatory names for countries"},
},
["DEROGATORY_NAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR place"] = {
link = false,
default = {"Derogatory names for places"},
},
["DEROGATORY_NAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "city" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Derogatory names for cities"},
},
["DEROGATORY_NAME_FOR state"] = {
link = false,
default = {"Derogatory names for states"},
},
["DEROGATORY_NAME_FOR town"] = {
link = false,
default = {"Derogatory names for cities"},
},
---------- Ellipses ----------
["ellipses of places!"] = {
full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s",
bare_category_breadcrumb = "ellipses",
bare_category_parent = "สถานที่",
},
["ELLIPSIS_OF place"] = {
link = false,
default = {"Ellipses of places"},
},
---------- Former long-form names ----------
["former long-form names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former long-form names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}},
},
["former long-form names of places!"] = {
full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form",
bare_category_parent = "former names of places",
},
["FORMER_LONG_FORM_OF country"] = {
link = false,
default = {"Former long-form names of countries"},
},
["FORMER_LONG_FORM_OF place"] = {
link = false,
default = {"Former long-form names of places"},
},
---------- Former names ----------
["former names of capitals!"] = {
full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name",
bare_category_breadcrumb = "เมืองหลวง",
bare_category_parent = "former names of settlements",
},
["former names of countries!"] = {
full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former names of places",
},
["former names of places!"] = {
full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name",
bare_category_breadcrumb = "former names",
bare_category_parent = "สถานที่",
},
["former names of political divisions!"] = {
full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) that generally still exist but under a different name",
bare_category_breadcrumb = "political divisions",
bare_category_parent = "former names of places",
},
["former names of polities!"] = {
full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name",
bare_category_breadcrumb = "องค์การทางการเมือง",
bare_category_parent = "former names of places",
},
["former names of settlements!"] = {
full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name",
bare_category_breadcrumb = "การตั้งถิ่นฐาน",
bare_category_parent = "former names of political divisions",
},
["FORMER_NAME_OF capital"] = {
link = false,
default = {"Former names of capitals"},
},
["FORMER_NAME_OF country"] = {
link = false,
default = {"Former names of countries"},
},
["FORMER_NAME_OF place"] = {
link = false,
default = {"Former names of places"},
},
["FORMER_NAME_OF polity"] = {
link = false,
default = {"Former names of polities"},
},
["FORMER_NAME_OF region"] = {
link = false,
fallback = "FORMER_NAME_OF subpolity",
},
["FORMER_NAME_OF settlement"] = {
link = false,
default = {"Former names of settlements"},
},
["FORMER_NAME_OF subpolity"] = {
link = false,
default = {"Former names of political divisions"},
},
---------- Former nicknames ----------
["former nicknames for cities!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})",
bare_category_breadcrumb = "นคร",
bare_category_parent = "former nicknames for places",
addl_bare_category_parents = {"nicknames for cities"},
},
["former nicknames for places!"] = {
full_category_link = "no-longer-used [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "former",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}},
},
["FORMER_NICKNAME_FOR capital"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR city"] = {
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR place"] = {
link = false,
default = {"Former nicknames for places"},
},
["FORMER_NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "city" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Former nicknames for cities"},
},
["FORMER_NICKNAME_FOR town"] = {
link = false,
default = {"Former nicknames for cities"},
},
---------- Former official names ----------
["former official names of countries!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "former official names of places",
addl_bare_category_parents = {{name = "former names of countries", sort = "official"}},
},
["former official names of places!"] = {
full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "former names of places",
},
["FORMER_OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Former official names of countries"},
},
["FORMER_OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Former official names of places"},
},
---------- Long-form names ----------
["long-form names of countries!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "long-form names of places",
},
["long-form names of places!"] = {
full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s",
bare_category_breadcrumb = "long-form names",
bare_category_parent = "สถานที่",
},
["LONG_FORM_OF country"] = {
link = false,
default = {"Long-form names of countries"},
},
["LONG_FORM_OF place"] = {
link = false,
default = {"Long-form names of places"},
},
---------- Nicknames ----------
["nicknames for cities!"] = {
full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]",
bare_category_breadcrumb = "นคร",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"นคร"},
},
["nicknames for continents!"] = {
full_category_link = "[[nickname]]s for [[continent]]s",
bare_category_breadcrumb = "ทวีป",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ทวีป"},
},
["nicknames for countries!"] = {
full_category_link = "[[nickname]]s for [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"ประเทศ"},
},
["nicknames for places!"] = {
full_category_link = "[[nickname]]s for [[place]]s",
bare_category_breadcrumb = "สถานที่",
bare_category_parent = "nicknames",
addl_bare_category_parents = {"สถานที่"},
},
["nicknames for states!"] = {
-- For categorizing nicknames for states of e.g. the United States
full_category_link = "[[nickname]]s for [[state]]s",
bare_category_breadcrumb = "รัฐ",
bare_category_parent = "nicknames for places",
addl_bare_category_parents = {"รัฐ"},
},
["NICKNAME_FOR capital"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR city"] = {
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR continent"] = {
link = false,
default = {"Nicknames for continents"},
},
["NICKNAME_FOR country"] = {
link = false,
default = {"Nicknames for countries"},
},
["NICKNAME_FOR metropolitan city"] = {
-- "metropolitan city" doesn't fall back to "city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR place"] = {
link = false,
default = {"Nicknames for places"},
},
["NICKNAME_FOR prefecture-level city"] = {
-- "prefecture-level city" doesn't fall back to "city" but things like "county-level city" and
-- "subprovincial city" fall back to "prefecture-level city"
link = false,
default = {"Nicknames for cities"},
},
["NICKNAME_FOR state"] = {
link = false,
default = {"Nicknames for states"},
},
["NICKNAME_FOR town"] = {
link = false,
default = {"Nicknames for cities"},
},
---------- Obsolete forms ----------
["obsolete forms of places!"] = {
full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s",
bare_category_breadcrumb = "obsolete forms",
bare_category_parent = "สถานที่",
},
["OBSOLETE_FORM_OF place"] = {
link = false,
default = {"Obsolete forms of places"},
},
---------- Official names ----------
["official names of countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of places",
},
["official names of former countries!"] = {
full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]",
bare_category_breadcrumb = "ประเทศ",
bare_category_parent = "official names of former places",
},
["official names of former places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]",
bare_category_breadcrumb = "official names",
bare_category_parent = "former places",
addl_bare_category_parents = {{name = "official names of places", sort = "former"}},
},
["official names of places!"] = {
full_category_link = "[[official]] [[name]]s of [[place]]s",
bare_category_breadcrumb = "official names",
bare_category_parent = "สถานที่",
},
["OFFICIAL_NAME_OF country"] = {
link = false,
default = {"Official names of countries"},
},
["OFFICIAL_NAME_OF FORMER country"] = {
link = false,
default = {"Official names of former countries"},
},
["OFFICIAL_NAME_OF FORMER place"] = {
link = false,
default = {"Official names of former places"},
},
["OFFICIAL_NAME_OF place"] = {
link = false,
default = {"Official names of places"},
},
---------- Official nicknames ----------
["official nicknames for places!"] = {
full_category_link = "[[official]] [[nickname]]s for [[place]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for places",
},
["official nicknames for states!"] = {
-- For categorizing official nicknames for states of e.g. the United States
full_category_link = "[[official]] [[nickname]]s for [[state]]s",
bare_category_breadcrumb = "official",
bare_category_parent = "nicknames for states",
addl_bare_category_parents = {"รัฐ"},
},
["OFFICIAL_NICKNAME_FOR place"] = {
link = false,
default = {"Official nicknames for places"},
},
["OFFICIAL_NICKNAME_FOR state"] = {
link = false,
default = {"Official nicknames for states"},
},
}
export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
if spec.plural then
export.plural_placetype_to_singular[spec.plural] = sg_placetype
end
end
return export
kzziwt1anaajgn8bqqnebvdl0xawlmt
ᥟᥧᥲ
0
2300501
5720732
5713805
2026-04-21T05:30:31Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720732
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|shn|ဢူႈ}}
=== การออกเสียง ===
* {{IPA|tdd|/ʔuː˧˩/}}
=== คำนาม ===
{{head|tdd|คำนาม}}
# [[พ่อ]]
#: {{syn|tdd|ᥙᥨᥝ}}
mdabjkfmi6ti26qu1sgzxy6knh25v4c
5720735
5720732
2026-04-21T05:58:30Z
Ai Ku Karng
17824
5720735
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
ร่วมเชื้อสายกับ{{cog|shn|ဢူႈ}}
=== การออกเสียง ===
* {{IPA|tdd|/ʔu˧˩/}}
=== คำนาม ===
{{head|tdd|คำนาม}}
# [[พ่อ]]
#: {{syn|tdd|ᥙᥨᥝ}}
1x83efkxg1dx293z7xu4g0eruuc2lpn
ᥙᥨᥝ
0
2300515
5720733
5652871
2026-04-21T05:39:40Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720733
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*boːᴮ}}; ร่วมเชื้อสายกับ{{cog|th|พ่อ}}, {{cog|sou|ผอ}}, {{cog|tts|พ่อ}}, {{cog|lo|ພໍ່}}, {{cog|nod|ᨻᩬᩴ᩵}}, {{cog|kkh|ᨻᩳ᩵}}, {{cog|khb|ᦗᦸᧈ}}, {{cog|blt|ꪝꪷ꪿}}, {{cog|shn|ပေႃႈ}}, {{cog|aho|𑜆𑜦𑜡}} หรือ {{m|aho|𑜆𑜨𑜦𑜡}}
=== การออกเสียง ===
* {{IPA|tdd|/po˧˧/}}
=== คำนาม ===
{{head|tdd|คำนาม}}
# [[พ่อ]]
#: {{syn|tdd|ᥟᥧᥲ}}
appmftusqpetu3m0heqn3i7l9hzl7f8
ᥞᥝᥳ
0
2300563
5720728
5653055
2026-04-21T05:16:09Z
Ai Ku Karng
17824
/* ภาษาไทใต้คง */
5720728
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== การออกเสียง ===
* {{IPA|tdd|/haw˦˧/}}
=== คำอนุภาค ===
{{tdd-part}}
# [[แล้ว]]
#: {{syn|tdd|ᥕᥝᥳ}}
irl4pgwjt7czdbqaaapepbg03kyvyed
ᥖᥭᥰ
0
2300567
5720730
5653063
2026-04-21T05:21:40Z
Ai Ku Karng
17824
5720730
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-swe-pro|*dajᴬ²}}, จาก{{inh|tdd|tai-pro|*ɗwɤːjᴬ}}; ร่วมเชื้อสายกับ{{cog|tts|ไท}}, {{cog|nod|ᨴᩱ}}, {{cog|lo|ໄທ}}, {{cog|nyw|ไท}}, {{cog|khb|ᦺᦑ}}, {{cog|blt|ꪼꪕ}}, {{cog|shn|တႆး}}, {{cog|aio|တႝ}}, {{cog|phk|တႝ}}, {{cog|aho|𑜄𑜩}}
=== คำนาม ===
{{head|tdd|คำนาม}}
# [[ไท]]
l2o88cul6cvmyjsx9fidifogyiceh5p
มอดูล:languages/chars
828
2323855
5720750
5683829
2026-04-21T07:00:44Z
OctraBot
3198
บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars)
5720750
Scribunto
text/plain
local export = {}
local table = table
local insert = table.insert
local u = require("Module:string/char")
-- UTF-8 encoded strings for some commonly-used diacritics.
local c = {
prime = u(0x02B9),
grave = u(0x0300),
acute = u(0x0301),
circ = u(0x0302), -- circumflex
tilde = u(0x0303),
macron = u(0x0304),
overline = u(0x0305),
breve = u(0x0306),
dotabove = u(0x0307),
diaer = u(0x0308), -- diaeresis
ringabove = u(0x030A),
dacute = u(0x030B), -- double acute
caron = u(0x030C),
lineabove = u(0x030D),
dgrave = u(0x030F), -- double grave
invbreve = u(0x0311), -- inverted breve
turnedcommaabove = u(0x0312),
commaabove = u(0x0313),
revcommaabove = u(0x0314), -- reversed comma above
dotbelow = u(0x0323),
diaerbelow = u(0x0324), -- diaeresis below
ringbelow = u(0x0325),
cedilla = u(0x0327),
ogonek = u(0x0328),
caronbelow = u(0x032C),
brevebelow = u(0x032E),
macronbelow = u(0x0331),
perispomeni = u(0x0342),
ypogegrammeni = u(0x0345),
CGJ = u(0x034F), -- combining grapheme joiner
zigzag = u(0x035B),
dbrevebelow = u(0x035C), -- double breve below
dmacron = u(0x035E), -- double macron
dtilde = u(0x0360), -- double tilde
dinvbreve = u(0x0361), -- double inverted breve
small_a = u(0x0363),
small_e = u(0x0364),
small_i = u(0x0365),
small_o = u(0x0366),
small_u = u(0x0367),
keraia = u(0x0374),
lowerkeraia = u(0x0375),
tonos = u(0x0384),
palatalization = u(0x0484),
dasiapneumata = u(0x0485),
psilipneumata = u(0x0486),
kashida = u(0x0640),
fathatan = u(0x064B),
dammatan = u(0x064C),
kasratan = u(0x064D),
fatha = u(0x064E),
damma = u(0x064F),
kasra = u(0x0650),
shadda = u(0x0651),
sukun = u(0x0652),
hamzaabove = u(0x0654),
nunghunna = u(0x0658),
zwarakay = u(0x0659),
smallv = u(0x065A),
superalef = u(0x0670),
udatta = u(0x0951),
anudatta = u(0x0952),
tacute = u(0x1ACB), -- triple acute
dottedgrave = u(0x1DC0),
dottedacute = u(0x1DC1),
coronis = u(0x1FBD),
psili = u(0x1FBF),
dasia = u(0x1FEF),
ZWNJ = u(0x200C), -- zero width non-joiner
ZWJ = u(0x200D), -- zero width joiner
RSQuo = u(0x2019), -- right single quote
kavyka = u(0xA67C),
VS01 = u(0xFE00), -- variation selector 1
-- Punctuation for the standard_chars field.
-- Note: characters are literal (i.e. no magic characters).
punc = " ',-‐‑‒–—…∅",
-- Range covering all diacritics.
diacritics = u(0x300) .. "-" .. u(0x34E) ..
u(0x350) .. "-" .. u(0x36F) ..
u(0x1AB0) .. "-" .. u(0x1ACE) ..
u(0x1DC0) .. "-" .. u(0x1DFF) ..
u(0x20D0) .. "-" .. u(0x20F0) ..
u(0xFE20) .. "-" .. u(0xFE2F),
}
-- Braille characters for the standard_chars field.
local braille = {}
for i = 0x2800, 0x28FF do
insert(braille, u(i))
end
c.braille = table.concat(braille)
export.chars = c
-- PUA characters, generally used in sortkeys.
-- Note: if the limit needs to be increased, do so in powers of 2 (due to the way memory is allocated for tables).
local p = {}
for i = 1, 32 do
p[i] = u(0xF000+i-1)
end
export.puaChars = p
local cs = {}
-- Used for the default display_text and strip_diacritics for Grek, but parts also used directly by Albanian (sq).
cs["Grek-displaytext"] = {
from = {"Þ", "þ", c.turnedcommaabove, "['ʼ" .. c.RSQuo .. c.prime .. c.keraia .. c.coronis .. c.psili .. "]"}, -- Not tonos: used as the numeral sign in entries.
to = {"Ϸ", "ϸ", c.revcommaabove, c.RSQuo}
}
cs["Grek-stripdiacritics"] = {
remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow,
from = cs["Grek-displaytext"].from,
to = {"Ϸ", "ϸ", c.revcommaabove, "'"}
}
-- Used in the default strip_diacritics and sort_key for Cyrs, but also used directly by Old Ruthenian (zle-ort).
cs["Cyrs_remove_diacritics"] =
c.grave .. c.acute .. c.dotabove .. c.diaer .. c.invbreve .. c.palatalization .. c.dasiapneumata .. c.psilipneumata .. c.dottedgrave .. c.dottedacute .. c.kavyka
export.chars_substitutions = cs
return export
8n2w5fgofa7b3yf0owwxcxi9c45tihv
คุยกับผู้ใช้:Bikkola
3
2330421
5720674
2026-04-20T12:57:58Z
New user message
2698
เพิ่ม[[Template:Welcome|สารต้อนรับ]]ในหน้าคุยของผู้ใช้ใหม่
5720674
wikitext
text/x-wiki
{{Template:Welcome|realName=|name=Bikkola}}
-- [[ผู้ใช้:New user message|New user message]] ([[คุยกับผู้ใช้:New user message|คุย]]) 19:57, 20 เมษายน 2569 (+07)
ni1cz40uu55lcesh51gcphcsjzdlvhi
คุยกับผู้ใช้:Octahedron80/อักษรไทธรรม
3
2330422
5720675
2026-04-20T13:47:20Z
Ai Ku Karng
17824
/* รูปแบบการเขียน */ ส่วนใหม่
5720675
wikitext
text/x-wiki
== รูปแบบการเขียน ==
รูปแบบการเขียนอักษรธรรมล้านนาภาษาไทลื้อ เช่นᨠᩬᩳ อิงจากอะไรครับ [[ผู้ใช้:Ai Ku Karng|Ai Ku Karng]] ([[คุยกับผู้ใช้:Ai Ku Karng|คุย]]) 20:47, 20 เมษายน 2569 (+07)
2pdjquj3px57b6ktrq2q85ko4oqp0sk
5720676
5720675
2026-04-20T15:24:03Z
OctraBot
3198
/* รูปแบบการเขียน */
5720676
wikitext
text/x-wiki
== รูปแบบการเขียน ==
รูปแบบการเขียนอักษรธรรมล้านนาภาษาไทลื้อ เช่นᨠᩬᩳ อิงจากอะไรครับ [[ผู้ใช้:Ai Ku Karng|Ai Ku Karng]] ([[คุยกับผู้ใช้:Ai Ku Karng|คุย]]) 20:47, 20 เมษายน 2569 (+07)
[https://drive.google.com/open?id=10c5lzGMittfoU-BvIHv0XJLkgaEPhuyR&usp=drive_fs] [https://drive.google.com/open?id=1x29dkAuhsbeM0Anp_Ok9x4yu_CMBvKJD&usp=drive_fs] --[[ผู้ใช้:OctraBot|OctraBot]] ([[คุยกับผู้ใช้:OctraBot|คุย]]) 22:24, 20 เมษายน 2569 (+07)
alxn1fewhgpsh2hy6km2wgf2ihzct0w
5720677
5720676
2026-04-20T15:39:36Z
OctraBot
3198
/* รูปแบบการเขียน */
5720677
wikitext
text/x-wiki
== รูปแบบการเขียน ==
รูปแบบการเขียนอักษรธรรมล้านนาภาษาไทลื้อ เช่นᨠᩬᩳ อิงจากอะไรครับ [[ผู้ใช้:Ai Ku Karng|Ai Ku Karng]] ([[คุยกับผู้ใช้:Ai Ku Karng|คุย]]) 20:47, 20 เมษายน 2569 (+07)
[https://drive.google.com/open?id=10c5lzGMittfoU-BvIHv0XJLkgaEPhuyR&usp=drive_fs] [https://drive.google.com/open?id=1x29dkAuhsbeM0Anp_Ok9x4yu_CMBvKJD&usp=drive_fs] [https://wrdingham.co.uk/lanna/renderer_test.htm] ทั้งหมดนี้เป็นการประมวลผลรวมมาแล้ว --[[ผู้ใช้:OctraBot|OctraBot]] ([[คุยกับผู้ใช้:OctraBot|คุย]]) 22:24, 20 เมษายน 2569 (+07)
71hzou5b1vdifddm9zxh5gvssx60jxo
ciudades
0
2330423
5720707
2026-04-21T01:58:34Z
OctraBot
3198
นำเข้าจาก enwikt เก็บกวาด
5720707
wikitext
text/x-wiki
== ภาษาสเปน ==
=== การออกเสียง ===
{{es-pr}}
=== คำนาม ===
{{head|es|รูปนาม|g=f-p}}
# {{noun form of|es|ciudad||p}}
0mcxg8xtweb4gjim192ke67p9b1cpuk
rhinos
0
2330424
5720710
2026-04-21T02:08:10Z
OctraBot
3198
นำเข้าจาก enwikt เก็บกวาด เรียงลำดับหัวเรื่องภาษา
5720710
wikitext
text/x-wiki
== ภาษาฝรั่งเศส ==
=== คำนาม ===
{{head|fr|รูปนาม|g=m-p}}
# {{plural of|fr|rhino}}
== ภาษาอังกฤษ ==
=== คำนาม ===
{{head|en|รูปนาม}}
# {{plural of|en|rhino}}
=== คำสลับอักษร ===
* {{anagrams|en|a=hinors|Rishon|Hirson|orhnis|rishon}}
1fj28dhzbm8h8impx8ese120vigtpu9
capital loss
0
2330425
5720716
2026-04-21T02:17:03Z
OctraBot
3198
นำเข้าจาก enwikt เก็บกวาด
5720716
wikitext
text/x-wiki
== ภาษาอังกฤษ ==
=== คำนาม ===
{{en-noun|~}}
# {{lb|en|economics|business|finance}} [[ขาดทุนประเภททุน]]; การลดลงของมูลค่าสินทรัพย์ประเภททุน; จำนวนที่มูลค่าหรือรายได้จากการขายสินทรัพย์ประเภททุนโดยเจ้าของ น้อยกว่าต้นทุนของเจ้าของ
#: {{ant|en|capital gain}}
4mh2njrh8av5ds99094tgihcaqksyo2
kühl
0
2330426
5720731
2026-04-21T05:29:49Z
Ponpan
693
สร้างหน้าด้วย "== ภาษาเยอรมัน == === รากศัพท์ === {{root|de|ine-pro|*gel-}} จาก{{inh|de|gmh|küele}}, {{inh|de|goh|kuoli}}, จาก{{inh|de|gmw-pro|*kōl(ī)}}, จาก{{inh|de|gem-pro|*kōluz}}, {{m|gem-pro|*kōlaz}}, จาก{{der|de|ine-pro|*gel-}}; ร่วมเชื้อสายกับ{{cog|nl|koel}}, {{cog|en|cool}}; {{doublet|de|cool}} === การออกเสียง === * {{IPA|de|/kyːl/}} *..."
5720731
wikitext
text/x-wiki
== ภาษาเยอรมัน ==
=== รากศัพท์ ===
{{root|de|ine-pro|*gel-}}
จาก{{inh|de|gmh|küele}}, {{inh|de|goh|kuoli}}, จาก{{inh|de|gmw-pro|*kōl(ī)}}, จาก{{inh|de|gem-pro|*kōluz}}, {{m|gem-pro|*kōlaz}}, จาก{{der|de|ine-pro|*gel-}}; ร่วมเชื้อสายกับ{{cog|nl|koel}}, {{cog|en|cool}}; {{doublet|de|cool}}
=== การออกเสียง ===
* {{IPA|de|/kyːl/}}
* {{audio|de|De-kühl.ogg}}
* {{audio|de|De-kühl2.ogg|a=<<Germany>> (<<Berlin>>)}}
=== คำคุณศัพท์ ===
{{de-adj|comp}}
# [[เย็น]]
#: {{ant|de|heiß|kalt|lau|warm}}
#: {{coi|de|etwas '''kühl''' lagern|เก็บบางสิ่งในที่เย็น}}
#: {{uxi|de|Es ist '''kühler''' geworden.|อากาศเย็นลง}}
#: {{uxi|de|Das Wasser ist angenehm '''kühl'''.|น้ำเย็นสบาย}}
# [[สงบ]], [[เยือกเย็น]]
# [[เย็นชา]], [[ไร้]][[อารมณ์]], [[ไม่]][[สนใจ]]; ไม่[[ตอบสนอง]]ทาง[[เพศ]]
#: {{uxi|de|Warum bist du so '''kühl''' mir gegenüber?|ทำไมเธอถึงเย็นชากับฉันอย่างนี้}}
==== การผันรูป ====
{{de-adecl|comp}}
==== คำเกี่ยวข้อง ====
{{col|de|kühlen|Kühle|gekühlt|kühlgemäßigt|die kühle Schulter zeigen}}
=== อ่านเพิ่ม ===
* {{R:de:Duden}}
* {{R:de:DWDS}}
k1yus2vfje245rhtriviczuzgdhnfma
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษากรีกโบราณ
14
2330427
5720739
2026-04-21T06:28:44Z
OctraBot
3198
สร้างหน้าด้วย "{{auto cat}}"
5720739
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษากรีกโบราณ
14
2330428
5720740
2026-04-21T06:28:51Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720740
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
มอดูล:Copt-sortkey
828
2330429
5720773
2026-04-21T07:09:19Z
OctraBot
3198
สร้างหน้าด้วย "export = {} local match = mw.ustring.match local str_gsub = string.gsub local function ugsub(text, regex, replacement) local out = mw.ustring.gsub(text, regex, replacement) return out end local alphabet = "ⲁⲃⲅⲇⲉⲍⲏⲑⲓⲕⲗⲙⲛⲝⲟⲡⲣⲥⲧⲩⲫⲭⲯⲱϣϥⳉϧϩϫϭw" local vowels = "ⲁⲉⲏⲓⲟⲩⲱ" local vowel = "[" .. vowels .. "]" local consonants = ugsub(alphabet, vowel, "") local consonant = "[" .. consona..."
5720773
Scribunto
text/plain
local export = {}
local match = mw.ustring.match
local str_gsub = string.gsub
local function ugsub(text, regex, replacement)
local out = mw.ustring.gsub(text, regex, replacement)
return out
end
local alphabet = "ⲁⲃⲅⲇⲉⲍⲏⲑⲓⲕⲗⲙⲛⲝⲟⲡⲣⲥⲧⲩⲫⲭⲯⲱϣϥⳉϧϩϫϭw"
local vowels = "ⲁⲉⲏⲓⲟⲩⲱ"
local vowel = "[" .. vowels .. "]"
local consonants = ugsub(alphabet, vowel, "")
local consonant = "[" .. consonants .. "]"
local replacements = {
["ⲟⲩ"] = "ⲩ",
["ⳤ"] = "ⲕⲉ",
["ⲉⲓ"] = "ⲓ",
["ϯ"] = "ⲧⲓ",
["-"] = "",
["⸗"] = "",
["ˋ"] = "",
}
local CopticToGreek = {
["ⲁ"] = "α",
["ⲃ"] = "β",
["ⲅ"] = "γ",
["ⲇ"] = "δ",
["ⲉ"] = "ε",
["ⲍ"] = "ζ",
["ⲏ"] = "η",
["ⲑ"] = "θ",
["ⲓ"] = "ι",
["ⲕ"] = "κ",
["ⲗ"] = "λ",
["ⲙ"] = "μ",
["ⲛ"] = "ν",
["ⲝ"] = "ξ",
["ⲟ"] = "ο",
["ⲡ"] = "π",
["ⲣ"] = "ρ",
["ⲥ"] = "σ",
["ⲧ"] = "τ",
["ⲩ"] = "υ",
["ⲫ"] = "φ",
["ⲭ"] = "χ",
["ⲯ"] = "ψ",
["ⲱ"] = "ω",
}
function export.makeSortKey(text, lang, sc)
text = mw.ustring.lower(text)
for letter, replacement in pairs(replacements) do
text = str_gsub(text, letter, replacement)
end
local origText = text
text = ugsub(text, "ⲩ(" .. vowel .. ")", "w%1")
text = ugsub(text, "(" .. vowel .. ")ⲩ", "%1w")
-- mw.log(origText, text)
local sort = {}
for word in mw.ustring.gmatch(text, "%S+") do
-- Add initial vowel (if any).
table.insert(sort, match(word, "^" .. vowel) )
-- Add consonants (in order).
table.insert(sort, ugsub(word, vowel .. "+", ""))
--[[
Add the number "1" if the word ends in a consonant, or "2" if it ends in a vowel.
"1" and "2" sort before the Greek–Coptic and Coptic Unicode blocks.
]]
if mw.ustring.match(word, consonant .. "$") then
table.insert(sort, "1")
elseif mw.ustring.match(word, vowel .. "$") then
table.insert(sort, "2")
end
-- Get non-initial vowels (in order) by removing initial vowel and all consonants.
table.insert(sort, ugsub(ugsub(word, "^" .. vowel, ""), consonant, ""))
table.insert(sort, " ")
end
sort = table.concat(sort)
sort = str_gsub(sort, "w", "ⲩ")
--[[
Convert Greek-derived Coptic characters to Greek ones.
Otherwise, the uniquely Coptic letters would sort first, because
they were added to Unicode earlier.
ϣϥⳉϧϩϫϭ ⲁⲃⲅⲇⲉⲍⲏⲑⲓⲕⲗⲙⲛⲝⲟⲡⲣⲥⲧⲩⲫⲭⲯⲱ
⇓
αβγδεζηθικλμνξοπρστυφχψω ϣϥⳉϧϩϫϭ
]]
sort = str_gsub(sort, "[\194-\244][\128-\191]+", CopticToGreek)
return mw.ustring.upper(sort)
end
local lang = require("Module:languages").getByCode("cop")
local sc = require("Module:scripts").getByCode("Copt")
local function tag(text)
return require("Module:script utilities").tag_text(text, lang, sc)
end
function export.showSorting(frame)
local terms = {}
for i, term in ipairs(frame.args) do
table.insert(terms, term)
end
local function comp(term1, term2)
return export.makeSortKey(term1) < export.makeSortKey(term2)
end
table.sort(terms, comp)
for i, term in pairs(terms) do
terms[i] = "\n* " .. tag(term) .. " (<code>" .. export.makeSortKey(term) .. "</code>)"
end
return table.concat(terms)
end
return export
8o899v7yx7qyh5uiyv6wmfsrsqo8b0t
มอดูล:wlm-sortkey
828
2330430
5720775
2026-04-21T07:13:56Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a = u(0xF000) local remove_diacritics = u(0x0300) .. "-" .. u(0x0302) .. u(0x0308) .. "'" -- grave, acute, circumflex, diaeresis, apostrophe local oneChar = { ["k"] = "c" } local twoChars = { ["ch"] = "c" .. a, ["dd"] = "d" .. a, ["ff"] = "f" .. a, ["ll"] = "l" .. a, ["ph"] = "p" .. a, ["rh"] = "r" .. a, ["th"] = "t" .. a } local threeChars = { ["ngh"] = "g" .. a } function export.makeSortKey..."
5720775
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local a = u(0xF000)
local remove_diacritics = u(0x0300) .. "-" .. u(0x0302) .. u(0x0308) .. "'" -- grave, acute, circumflex, diaeresis, apostrophe
local oneChar = {
["k"] = "c"
}
local twoChars = {
["ch"] = "c" .. a, ["dd"] = "d" .. a, ["ff"] = "f" .. a, ["ll"] = "l" .. a, ["ph"] = "p" .. a, ["rh"] = "r" .. a, ["th"] = "t" .. a
}
local threeChars = {
["ngh"] = "g" .. a
}
function export.makeSortKey(text, lang, sc)
text = mw.ustring.lower(text)
for from, to in pairs(threeChars) do
text = mw.ustring.gsub(text, from, to)
end
for from, to in pairs(twoChars) do
text = mw.ustring.gsub(text, from, to)
end
return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(text, ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again
end
return export
ofbrvuy006obg7gppomrydjtvu7udcs
มอดูล:mdf-sortkey
828
2330431
5720777
2026-04-21T07:15:02Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a = u(0xF000) local oneChar = { ["ё"] = "е" .. a } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)) end return export"
5720777
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local a = u(0xF000)
-- "ё" sorts immediately after "е": map it to "е" plus a private-use character (U+F000).
local oneChar = {
["ё"] = "е" .. a
}
function export.makeSortKey(text, lang, sc)
return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar))
end
return export
fpynrgnyrcig23pqmf0ixa0z2wahmle
มอดูล:gmw-pro-sortkey
828
2330432
5720778
2026-04-21T07:15:59Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local remove_diacritics = u(0x0304) .. u(0x0328) -- macron, ogonek local oneChar = { ["ʀ"] = "r" } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export"
5720778
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local remove_diacritics = u(0x0304) .. u(0x0328) -- macron, ogonek
local oneChar = {
["ʀ"] = "r"
}
function export.makeSortKey(text, lang, sc)
return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again
end
return export
7lc3wpilcm5n7sf864u24efw71z1off
มอดูล:bnt-pro-sortkey
828
2330433
5720779
2026-04-21T07:16:14Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a, b = u(0xF000), u(0xF001) local remove_diacritics = u(0x0300) .. u(0x0301) -- grave, acute local oneChar = { ["ɪ"] = "i" .. a, ["ì"] = "i" .. b, ["í"] = "i" .. b, ["ʊ"] = "u" .. a, ["ù"] = "u" .. b, ["ú"] = "u" .. b } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar))..."
5720779
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local a, b = u(0xF000), u(0xF001)
local remove_diacritics = u(0x0300) .. u(0x0301) -- grave, acute
local oneChar = {
["ɪ"] = "i" .. a, ["ì"] = "i" .. b, ["í"] = "i" .. b, ["ʊ"] = "u" .. a, ["ù"] = "u" .. b, ["ú"] = "u" .. b
}
function export.makeSortKey(text, lang, sc)
return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again
end
return export
0xagsm5z1meiajymzdjlpul8rs3bk3m
มอดูล:cel-pro-sortkey
828
2330434
5720780
2026-04-21T07:16:28Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a = u(0xF000) local remove_diacritics = u(0x0304) -- macron local oneChar = { ["ɸ"] = "f", ["φ"] = "f", ["ʷ"] = "w" } function export.makeSortKey(text, lang, sc) text = mw.ustring.gsub(mw.ustring.lower(text), "w", "w" .. a) -- ensure "w" comes after "ʷ" return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(text, ".", oneChar)), "[" .. remove_diacritics..."
5720780
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local a = u(0xF000)
local remove_diacritics = u(0x0304) -- macron
local oneChar = {
["ɸ"] = "f", ["φ"] = "f", ["ʷ"] = "w"
}
function export.makeSortKey(text, lang, sc)
text = mw.ustring.gsub(mw.ustring.lower(text), "w", "w" .. a) -- ensure "w" comes after "ʷ"
return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(text, ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again
end
return export
s4tpyk1s0a6cln7elxc6pbgn54mljlx
มอดูล:gem-pro-sortkey
828
2330435
5720781
2026-04-21T07:16:41Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local remove_diacritics = u(0x0302) .. u(0x0304) -- circumflex, macron local oneChar = { ["ą"] = "an", ["į"] = "in", ["ǫ"] = "on", ["ų"] = "un" } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacriti..."
5720781
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local remove_diacritics = u(0x0302) .. u(0x0304) -- circumflex, macron
local oneChar = {
["ą"] = "an", ["į"] = "in", ["ǫ"] = "on", ["ų"] = "un"
}
function export.makeSortKey(text, lang, sc)
return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again
end
return export
4tap68mjd5qt3mdukwabrpdbjban8wb
มอดูล:sma-sortkey
828
2330436
5720782
2026-04-21T07:17:09Z
OctraBot
3198
สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a, b, c = u(0xF000), u(0xF001), u(0xF002) local oneChar = { ["ï"] = "i" .. a, ["æ"] = "z" .. a, ["ä"] = "z" .. a, ["ø"] = "z" .. b, ["ö"] = "z" .. b, ["å"] = "z" .. c } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)) end return export"
5720782
Scribunto
text/plain
local export = {}
local u = mw.ustring.char
local a, b, c = u(0xF000), u(0xF001), u(0xF002)
-- "ï" sorts after "i"; "æ"/"ä", "ø"/"ö" and "å" sort after "z", in that order.
local oneChar = {
["ï"] = "i" .. a, ["æ"] = "z" .. a, ["ä"] = "z" .. a, ["ø"] = "z" .. b, ["ö"] = "z" .. b, ["å"] = "z" .. c
}
function export.makeSortKey(text, lang, sc)
return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar))
end
return export
9uw1xjoz1haiv6ju835pesrtd8tkhmd
หมวดหมู่:ภาษาซามีใต้
14
2330437
5720784
2026-04-21T07:17:56Z
OctraBot
3198
สร้างหน้าด้วย "{{auto cat|นอร์เวย์|สวีเดน|setwiki=Southern Sami}}"
5720784
wikitext
text/x-wiki
{{auto cat|นอร์เวย์|สวีเดน|setwiki=Southern Sami}}
bl7hv18m0mrs85oml6ckkhx0zhj4yym
5720785
5720784
2026-04-21T07:18:35Z
OctraBot
3198
5720785
wikitext
text/x-wiki
{{auto cat|นอร์เวย์|สวีเดน}}
ez99jpf91za3w9tokozpw89n58i5gqo
หมวดหมู่:คำนามนับได้ภาษาจอร์เจีย
14
2330438
5720788
2026-04-21T07:31:43Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720788
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
ᥞᥧᥳ
0
2330439
5720789
2026-04-21T07:31:56Z
Ai Ku Karng
17824
สร้างหน้าด้วย "=== รากศัพท์ === {{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} === การออกเสียง === * {{..."
5720789
wikitext
text/x-wiki
=== รากศัพท์ ===
{{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}}
=== การออกเสียง ===
* {{IPA|tdd|/hu˦˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[รู้]]
4ji16hf2h971ihfjdezxy64dyvfsqgc
5720791
5720789
2026-04-21T07:32:19Z
Ai Ku Karng
17824
/* การออกเสียง */
5720791
wikitext
text/x-wiki
=== รากศัพท์ ===
{{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}}
== การออกเสียง ==
* {{IPA|tdd|/hu˦˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[รู้]]
3ub1pyud7la5fr1alhayz36grmg6b97
5720792
5720791
2026-04-21T07:32:30Z
Ai Ku Karng
17824
5720792
wikitext
text/x-wiki
== รากศัพท์ ==
{{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}}
== การออกเสียง ==
* {{IPA|tdd|/hu˦˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[รู้]]
okdm7i6fjjz9lp76axlhnn9gbuktr02
5720793
5720792
2026-04-21T07:32:43Z
Ai Ku Karng
17824
5720793
wikitext
text/x-wiki
== รากศัพท์ ==
{{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}}
=== การออกเสียง ===
* {{IPA|tdd|/hu˦˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[รู้]]
5iigbz935uuhh0x5f2irzpsjcsspk9z
5720794
5720793
2026-04-21T07:33:40Z
Apisite
10648
5720794
wikitext
text/x-wiki
== ภาษาไทใต้คง ==
=== รากศัพท์ ===
{{inh+|tdd|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}}
=== การออกเสียง ===
* {{IPA|tdd|/hu˦˧/}}
=== คำกริยา ===
{{tdd-verb}}
# [[รู้]]
imhds95aowa85w6kqheznjlgirxgcu7
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาซีรีแอกคลาสสิก
14
2330440
5720790
2026-04-21T07:32:05Z
OctraBot
3198
สร้างหน้าด้วย "{{auto cat}}"
5720790
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษาซีรีแอกคลาสสิก
14
2330441
5720795
2026-04-21T07:37:42Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720795
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบภาษาซีรีแอกคลาสสิก
14
2330442
5720796
2026-04-21T07:37:47Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720796
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:mul:ประเทศ
14
2330443
5720797
2026-04-21T07:38:01Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720797
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:mul:รายชื่อหมวดหมู่ชื่อ
14
2330444
5720798
2026-04-21T07:38:09Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720798
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:mul:องค์การทางการเมือง
14
2330445
5720799
2026-04-21T07:38:10Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720799
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:mul:สถานที่
14
2330446
5720800
2026-04-21T07:38:15Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720800
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:mul:ชื่อ (หัวข้อ)
14
2330447
5720801
2026-04-21T07:38:20Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720801
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:คำนามผันรูปไม่ได้ภาษาฝรั่งเศส
14
2330448
5720802
2026-04-21T07:38:42Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720802
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาอิงกุช
14
2330449
5720803
2026-04-21T07:40:00Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720803
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษาอิงกุช
14
2330450
5720804
2026-04-21T07:40:06Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720804
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบภาษาอิงกุช
14
2330451
5720805
2026-04-21T07:40:12Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720805
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาแมนจู
14
2330452
5720806
2026-04-21T07:40:34Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720806
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษาแมนจู
14
2330453
5720807
2026-04-21T07:40:39Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720807
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบภาษาแมนจู
14
2330454
5720808
2026-04-21T07:40:44Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720808
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาบัตส์
14
2330455
5720809
2026-04-21T07:41:02Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720809
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษาบัตส์
14
2330456
5720810
2026-04-21T07:41:08Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720810
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบภาษาบัตส์
14
2330457
5720811
2026-04-21T07:41:13Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720811
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:ศัพท์ภาษาโปรตุเกสที่สะกดด้วย Ù
14
2330458
5720812
2026-04-21T07:41:41Z
OctraBot
3198
สร้างหน้าด้วย "{{auto cat}}"
5720812
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาเบลารุส
14
2330459
5720813
2026-04-21T07:42:02Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720813
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาเอสเปรันโต
14
2330460
5720814
2026-04-21T07:42:03Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720814
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษาเบลารุส
14
2330461
5720815
2026-04-21T07:42:08Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720815
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบลิงก์ภาษาเอสเปรันโต
14
2330462
5720816
2026-04-21T07:42:10Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720816
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบภาษาเบลารุส
14
2330463
5720817
2026-04-21T07:42:15Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720817
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
หมวดหมู่:แม่แบบภาษาเอสเปรันโต
14
2330464
5720818
2026-04-21T07:42:17Z
OctraBot
3198
สร้างหมวดหมู่อัตโนมัติ
5720818
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx